From arnodel at googlemail.com  Sun Mar  1 11:34:59 2009
From: arnodel at googlemail.com (Arnaud Delobelle)
Date: Sun, 1 Mar 2009 10:34:59 +0000
Subject: [Python-ideas] Revised**5 PEP on yield-from
In-Reply-To: <49A9A488.4070308@improva.dk>
References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk>
	<499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk>
	<49A9A488.4070308@improva.dk>
Message-ID: 

On 28 Feb 2009, at 20:54, Jacob Holm wrote:

> Replying to myself here...
>
> Jacob Holm wrote:
>> I think I can actually see a way to do it that should be fast
>> enough, but I'd like to work out the details first. If it works it
>> will be O(1) with low constants as long as you don't build trees,
>> and similar to traversing a delegation chain in the worst case.
>>
>> All this depends on getting it working using delegation chains
>> first though, as most of the StopIteration and Exception handling
>> would be the same.
>
> I have now worked out the details, and it is indeed possible to get
> O(1) for simple cases and amortized O(logN) in general, all with
> fairly low constants.

I'm sorry if I'm missing something obvious, but there are two things I
can't work out:

* What you are measuring the time complexity of.
* What N stands for.

I suspect that N is the 'delegation depth': the number of yield-from
that have to be gone through.  I imagine that you are measuring the
time it takes to get the next element in the generator.  These are
guesses - can you correct me?

> I have implemented the tree structure as a python module and added a
> trampoline-based pure-python implementation of "yield-from" to try
> it out.
>
> It seems that this version beats a normal "for v in it: yield v"
> when the delegation chains get around 90 generators deep.

Can you give an example?
-- 
Arnaud


From jh at improva.dk  Sun Mar  1 13:30:36 2009
From: jh at improva.dk (Jacob Holm)
Date: Sun, 01 Mar 2009 13:30:36 +0100
Subject: [Python-ideas] Revised**5 PEP on yield-from
In-Reply-To: 
References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk>
	<499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk>
	<49A9A488.4070308@improva.dk>
Message-ID: <49AA7FEC.4090609@improva.dk>

Hi Arnaud

Arnaud Delobelle wrote:
>> I have now worked out the details, and it is indeed possible to get
>> O(1) for simple cases and amortized O(logN) in general, all with
>> fairly low constants.
>
> I'm sorry if I'm missing something obvious, but there are two things I
> can't work out:

I am glad you asked. Reading it again, I can see that this is
definitely not obvious.

> * What you are measuring the time complexity of.

The time for a single 'next', 'send', 'throw' or 'close' call to a
generator or a single "yield from" expression, excluding the time
spent running the user-defined code in it. (Does that make sense?)
In other words, the total overhead of finding the actual user code to
run and handling the return values/exceptions according to the PEP.

> * What N stands for.
>
> I suspect that N is the 'delegation depth': the number of yield-from
> that have to be gone through.  I imagine that you are measuring the
> time it takes to get the next element in the generator.  These are
> guesses - can you correct me?

N is the total number of suspended generators in the tree(s) of
generators involved in the operation. Remember that it is possible to
have multiple generators yield from the same 'parent' generator. The
'delegation depth' would be the height of that tree.

One interesting thing to note is that all non-contrived examples I
have seen only build simple chains of iterators, and only do that by
adding one at a time in the deepest nested one. This is the best
possible case for my algorithm. If you stick to that, the time per
operation is O(1).
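The "simple chain built one at a time" shape being described can be sketched with plain for-loop delegation, which is the pattern "yield from" is meant to replace (names here are invented for the sketch):

```python
def child(it):
    # plain delegation: every value is forwarded through this generator
    for v in it:
        yield v

# build a chain one generator at a time, always wrapping the whole
# chain in a new delegating generator -- the best case described above
it = iter(range(5))
for _ in range(3):
    it = child(it)

result = list(it)
print(result)  # -> [0, 1, 2, 3, 4]
```

Every next() call has to be forwarded through all three child generators, which is why naive delegation costs time linear in the chain length and why smarter bookkeeping for "yield from" is worth pursuing.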
If we decided to only allow that, the algorithm can be simplified
significantly.

>
>> I have implemented the tree structure as a python module and added a
>> trampoline-based pure-python implementation of "yield-from" to try it
>> out.
>>
>> It seems that this version beats a normal "for v in it: yield v" when
>> the delegation chains get around 90 generators deep.
>
> Can you give an example?
>

Sure, here is the simple code I used for timing the 'next' call:

import itertools, time
from yieldfrom import uses_from, from_   # my module...

@uses_from
def child(it):
    yield from_(it)

def child2(it):
    for i in it:
        yield i

def longchain(N):
    it = itertools.count()
    for i in xrange(N):
        it = child(it)   # replace this with child2 to test the current
                         # "for v in it: yield v" pattern.
    it.next()   # we are timing the 'next' calls (not the setup of
                # the chain) so skip the setup by calling next once.
    return it

it = longchain(90)

times = []
for i in xrange(10):
    t1 = time.time()
    for i in xrange(100000):
        it.next()
    t2 = time.time()
    times.append(t2 - t1)

print min(times)

This version takes about the same time whether you use child or child2
to build the chain. However, the version using yield-from takes the
same time no matter how long the chain is, while the time for the
"for v in it: yield v" version is linear in the length of the chain,
and so would lose big-time for longer chains.

I hope this answers your questions.

Regards
Jacob


From lists at cheimes.de  Sun Mar  1 14:49:03 2009
From: lists at cheimes.de (Christian Heimes)
Date: Sun, 01 Mar 2009 14:49:03 +0100
Subject: [Python-ideas] with statement: multiple context manager
Message-ID: 

Hello fellow Pythonistas!

On a regular basis I'm bothered and annoyed by the fact that the with
statement takes only one context manager. Often I need to open two
files to read from one and write to the other. I propose to modify the
with statement to accept multiple context managers.
Example
=======

The nested block::

    with lock:
        with open(infile) as fin:
            with open(outfile, 'w') as fout:
                fout.write(fin.read())

could be written as::

    with lock, open(infile) as fin, open(outfile, 'w') as fout:
        fout.write(fin.read())

The context managers' __enter__() methods are called from left to
right, and their __exit__() methods in reverse order (last in, first
out). When an exception is raised by an __enter__() method, the
context managers to its right are skipped.

Grammar
=======

I'm not sure if I got the grammar right but I *think* the new grammar
should look like::

    with_stmt: 'with' with_vars ':' suite
    with_var: test ['as' expr]
    with_vars: with_var (',' with_var)* [',']

Christian


From grosser.meister.morti at gmx.net  Sun Mar  1 15:30:38 2009
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Sun, 01 Mar 2009 15:30:38 +0100
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: 
References: 
Message-ID: <49AA9C0E.4060401@gmx.net>

Why not use this?

    from contextlib import nested

    with nested(lock, open(infile), open(outfile, 'w')) as (_, fin, fout):
        fout.write(fin.read())

Ok, the _ is ugly, but is it ugly enough so we need this extension to
the with statement?

-panzi


From guido at python.org  Sun Mar  1 20:46:17 2009
From: guido at python.org (Guido van Rossum)
Date: Sun, 1 Mar 2009 11:46:17 -0800
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: 
References: 
Message-ID: 

On Sun, Mar 1, 2009 at 5:49 AM, Christian Heimes wrote:
> On a regular basis I'm bothered and annoyed by the fact that the with
> statement takes only one context manager. Often I need to open two files
> to read from one and write to the other. I propose to modify the with
> statement to accept multiple context managers.
>
> Example
> =======
>
> The nested block::
>
>     with lock:
>         with open(infile) as fin:
>             with open(outfile, 'w') as fout:
>                 fout.write(fin.read())
>
> could be written as::
>
>     with lock, open(infile) as fin, open(outfile, 'w') as fout:
>         fout.write(fin.read())
>
> The context managers' __enter__() methods are called from left to
> right, and their __exit__() methods in reverse order (last in, first
> out). When an exception is raised by an __enter__() method, the
> context managers to its right are skipped.
>
> Grammar
> =======
>
> I'm not sure if I got the grammar right but I *think* the new grammar
> should look like::
>
> with_stmt: 'with' with_vars ':' suite
> with_var: test ['as' expr]
> with_vars: with_var (',' with_var)* [',']

I am sympathetic to this desire -- I think we almost added this to the
original PEP but decided to hold off until a clear need was found.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From eli at courtwright.org  Sun Mar  1 20:53:28 2009
From: eli at courtwright.org (Eli Courtwright)
Date: Sun, 1 Mar 2009 14:53:28 -0500
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: 
References: 
Message-ID: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com>

On Sun, Mar 1, 2009 at 2:46 PM, Guido van Rossum wrote:
> I am sympathetic to this desire -- I think we almost added this to the
> original PEP but decided to hold off until a clear need was found.

I second the motion to have this syntax added to the language. I've
often had to write nested with blocks to open one file for reading and
another for writing.
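Concretely, the pattern in question as it must be written with today's syntax (helper name invented for the sketch):

```python
def copy_file(src, dst):
    # one 'with' per file; the proposal would collapse the two
    # statements into a single "with ... as fin, ... as fout:" line
    with open(src) as fin:
        with open(dst, 'w') as fout:
            fout.write(fin.read())
```

Either way, both files are closed when the block exits, even if the read or write raises.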
- Eli


From pyideas at rebertia.com  Sun Mar  1 21:22:53 2009
From: pyideas at rebertia.com (Chris Rebert)
Date: Sun, 1 Mar 2009 12:22:53 -0800
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com>
References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com>
Message-ID: <50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com>

On Sun, Mar 1, 2009 at 11:53 AM, Eli Courtwright wrote:
> On Sun, Mar 1, 2009 at 2:46 PM, Guido van Rossum wrote:
>> I am sympathetic to this desire -- I think we almost added this to the
>> original PEP but decided to hold off until a clear need was found.
>
> I second the motion to have this syntax added to the language. I've
> often had to write nested with blocks to open one file for reading and
> another for writing.

It does seem slightly incongruous though, given that the for-statement,
which is quite similar to the with-statement in that they both bind new
variables in a subsidiary block of code, does not directly support
multiple simultaneous bindings.

To put it more concretely, currently one must write:

    for a, b, c in zip(seq1, seq2, seq3):
        #body

Rather than:

    for a in seq1, b in seq2, c in seq3:
        #body

But for some reason we're proposing to, in a way, make nested() built
into `with` but not make zip() likewise built into `for`.

While I still mostly like the idea, it does seem to undermine Python's
uniformity a bit.

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com


From daniel at stutzbachenterprises.com  Sun Mar  1 21:49:06 2009
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Sun, 1 Mar 2009 14:49:06 -0600
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To: 
References: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com>
Message-ID: 

On Fri, Feb 27, 2009 at 1:20 PM, Guido van Rossum wrote:
> On Fri, Feb 27, 2009 at 4:49 AM, Curt Hagenlocher wrote:
> > That way lies madness.
> > What distinguishes "with" from other compound
> > statements is that it's already about resource management in the face
> > of possible exceptions.
>
> Still, a firm -1 from me. Once we have "try with" I'm sure people are
> going to clamor for "try if", "try while", "try for", even (oh horror
> :-) "try try". I don't think we should complicate the syntax just to
> save one level of indentation occasionally.

In addition to reasons outlined by Curt, "with" is unique because it's
short-hand for a "try" block with a "finally" clause. Unfortunately,
"with" doesn't allow for other clauses and so I often end up using both
"try" and "with".

Also, "try if", "try while", and "try for" wouldn't work because they
already have a meaning for the "else" clause. "with" does not.

-- 
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 


From lists at cheimes.de  Sun Mar  1 21:57:50 2009
From: lists at cheimes.de (Christian Heimes)
Date: Sun, 01 Mar 2009 21:57:50 +0100
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: <50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com>
References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com>
	<50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com>
Message-ID: 

Chris Rebert wrote:
> While I still mostly like the idea, it does seem to undermine Python's
> uniformity a bit.

I played with both possible versions before I wrote the proposal. Both
ways have their pros and cons. I'm preferring the proposed way::

    with a, b as x, d as y:
        ...

over the other possibility::

    with a, b, c as _, x, y:
        ...

for two reasons. For one I dislike the temporary variable that is
required for some cases, e.g. the case I used in my initial proposal.
It doesn't feel quite right to use a useless placeholder.
The proposed way follows the example of the import statement, too::

    from module import a, b as x, d as y

Christian


From greg at krypto.org  Sun Mar  1 22:01:47 2009
From: greg at krypto.org (Gregory P. Smith)
Date: Sun, 1 Mar 2009 13:01:47 -0800
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: 
References: 
Message-ID: <52dc1c820903011301p631cefc2l2300912b88b2e890@mail.gmail.com>

On Sun, Mar 1, 2009 at 11:46 AM, Guido van Rossum wrote:
> On Sun, Mar 1, 2009 at 5:49 AM, Christian Heimes wrote:
> > On a regular basis I'm bothered and annoyed by the fact that the with
> > statement takes only one context manager. Often I need to open two files
> > to read from one and write to the other. I propose to modify the with
> > statement to accept multiple context managers.
> >
> > Example
> > =======
> >
> > The nested block::
> >
> >     with lock:
> >         with open(infile) as fin:
> >             with open(outfile, 'w') as fout:
> >                 fout.write(fin.read())
> >
> > could be written as::
> >
> >     with lock, open(infile) as fin, open(outfile, 'w') as fout:
> >         fout.write(fin.read())

Alternatively if closer conformity with for loop syntax is desirable
consider this:

    with lock, open(infile), open(outfile) as lock, fin, fout:
        fout.write(fin.read())

> > The context managers' __enter__() methods are called from left to
> > right, and their __exit__() methods in reverse order (last in, first
> > out). When an exception is raised by an __enter__() method, the
> > context managers to its right are skipped.
> >
> > Grammar
> > =======
> >
> > I'm not sure if I got the grammar right but I *think* the new grammar
> > should look like::
> >
> > with_stmt: 'with' with_vars ':' suite
> > with_var: test ['as' expr]
> > with_vars: with_var (',' with_var)* [',']

> I am sympathetic to this desire -- I think we almost added this to the
> original PEP but decided to hold off until a clear need was found.
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 


From ironfroggy at gmail.com  Sun Mar  1 22:15:45 2009
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Sun, 1 Mar 2009 16:15:45 -0500
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: <52dc1c820903011301p631cefc2l2300912b88b2e890@mail.gmail.com>
References: <52dc1c820903011301p631cefc2l2300912b88b2e890@mail.gmail.com>
Message-ID: <76fd5acf0903011315l4309ee06xb65c303eea44a828@mail.gmail.com>

On Sun, Mar 1, 2009 at 4:01 PM, Gregory P. Smith wrote:
> Alternatively if closer conformity with for loop syntax is desirable
> consider this:
>
>     with lock, open(infile), open(outfile) as lock, fin, fout:
>         fout.write(fin.read())

+1

We don't have multi-assignment statements in favor of the unpacking
concept, and I think it carries over here. Also, as mentioned, this
goes along with the lack of any multi-for statement. The `x as y` part
of the with statement is basically an assignment with extras, and the
original suggestion then combines multiple assignments on one line.
This option, I think, is more concise and readable.

-- 
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy


From pyideas at rebertia.com  Sun Mar  1 22:35:07 2009
From: pyideas at rebertia.com (Chris Rebert)
Date: Sun, 1 Mar 2009 13:35:07 -0800
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: 
References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com>
	<50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com>
Message-ID: <50697b2c0903011335s38083491t80726e803df171bf@mail.gmail.com>

On Sun, Mar 1, 2009 at 12:57 PM, Christian Heimes wrote:
> Chris Rebert wrote:
>> While I still mostly like the idea, it does seem to undermine Python's
>> uniformity a bit.
>
> I played with both possible versions before I wrote the proposal. Both
> ways have their pros and cons. I'm preferring the proposed way::
>
>     with a, b as x, d as y:
>         ...
>
> over the other possibility::
>
>     with a, b, c as _, x, y:
>         ...

You misunderstand me. My quibble isn't over the exact syntax (in fact,
I completely agree about the superiority of the proposed ordering),
but rather that we're introducing syntax to do something that can
already be done with a function (nested()) and is /extremely/ similar
to another case (parallel for-loop) where we are opting to still
require the use of a function (zip()). This proposed asymmetry
concerns me.

> The proposed way
> follows the example of the import statement, too::
>
>     from module import a, b as x, d as y

This parallel does quell my concern somewhat.

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com


From tjreedy at udel.edu  Sun Mar  1 22:45:07 2009
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 01 Mar 2009 16:45:07 -0500
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: <76fd5acf0903011315l4309ee06xb65c303eea44a828@mail.gmail.com>
References: <52dc1c820903011301p631cefc2l2300912b88b2e890@mail.gmail.com>
	<76fd5acf0903011315l4309ee06xb65c303eea44a828@mail.gmail.com>
Message-ID: 

Calvin Spealman wrote:
> On Sun, Mar 1, 2009 at 4:01 PM, Gregory P. Smith wrote:
>> Alternatively if closer conformity with for loop syntax is desirable
>> consider this:
>>
>>     with lock, open(infile), open(outfile) as lock, fin, fout:
>>         fout.write(fin.read())
>
> +1
>
> We don't have multi-assignment statements in favor of the unpacking
> concept, and I think it carries over here. Also, as mentioned, this
> goes along with the lack of any multi-for statement. The `x as y` part
> of the with statement is basically an assignment with extras, and the
> original suggestion then combines multiple assignments on one line.
> This option, I think, is more concise and readable.

I prefer this also, for the same reasons.

tjr


From guido at python.org  Sun Mar  1 23:01:47 2009
From: guido at python.org (Guido van Rossum)
Date: Sun, 1 Mar 2009 14:01:47 -0800
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: <76fd5acf0903011315l4309ee06xb65c303eea44a828@mail.gmail.com>
References: <52dc1c820903011301p631cefc2l2300912b88b2e890@mail.gmail.com>
	<76fd5acf0903011315l4309ee06xb65c303eea44a828@mail.gmail.com>
Message-ID: 

On Sun, Mar 1, 2009 at 1:15 PM, Calvin Spealman wrote:
> On Sun, Mar 1, 2009 at 4:01 PM, Gregory P. Smith wrote:
>> Alternatively if closer conformity with for loop syntax is desirable
>> consider this:
>>
>>     with lock, open(infile), open(outfile) as lock, fin, fout:
>>         fout.write(fin.read())
>
> +1
>
> We don't have multi-assignment statements in favor of the unpacking
> concept, and I think it carries over here.
> Also, as mentioned, this
> goes along with the lack of any multi-for statement. The `x as y` part
> of the with statement is basically an assignment with extras, and the
> original suggestion then combines multiple assignments on one line.
> This option, I think, is more concise and readable.

-1 for this variant. The syntactic model is import: import foo as bar,
bletch, quuz as frobl. If we're doing this it should be like this.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From guido at python.org  Sun Mar  1 23:14:30 2009
From: guido at python.org (Guido van Rossum)
Date: Sun, 1 Mar 2009 14:14:30 -0800
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: <50697b2c0903011335s38083491t80726e803df171bf@mail.gmail.com>
References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com>
	<50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com>
	<50697b2c0903011335s38083491t80726e803df171bf@mail.gmail.com>
Message-ID: 

On Sun, Mar 1, 2009 at 1:35 PM, Chris Rebert wrote:
> On Sun, Mar 1, 2009 at 12:57 PM, Christian Heimes wrote:
>> Chris Rebert wrote:
>>> While I still mostly like the idea, it does seem to undermine Python's
>>> uniformity a bit.
>>
>> I played with both possible versions before I wrote the proposal. Both
>> ways have their pros and cons. I'm preferring the proposed way::
>>
>>     with a, b as x, d as y:
>>         ...
>>
>> over the other possibility::
>>
>>     with a, b, c as _, x, y:
>>         ...
>
> You misunderstand me. My quibble isn't over the exact syntax (in fact,
> I completely agree about the superiority of the proposed ordering),
> but rather that we're introducing syntax to do something that can
> already be done with a function (nested()) and is /extremely/ similar
> to another case (parallel for-loop) where we are opting to still
> require the use of a function (zip()). This proposed asymmetry
> concerns me.

Hm.
While we can indeed write the equivalent of the proposed "with a, b:"
today as "with nested(a, b):", I don't think that the situation is
quite comparable to a for-loop over a zip() call. The nested() context
manager isn't particularly intuitive to me (and Nick just found a
problem in a corner case of its semantics). Compared to nested(), I
find "with a, b:" very obvious as a shorthand for nested
with-statements:

    with a:
        with b:
            ...

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From pyideas at rebertia.com  Sun Mar  1 23:22:56 2009
From: pyideas at rebertia.com (Chris Rebert)
Date: Sun, 1 Mar 2009 14:22:56 -0800
Subject: [Python-ideas] with statement: multiple context manager
In-Reply-To: 
References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com>
	<50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com>
	<50697b2c0903011335s38083491t80726e803df171bf@mail.gmail.com>
Message-ID: <50697b2c0903011422l1f2dad1dv2ed566093dd1addf@mail.gmail.com>

On Sun, Mar 1, 2009 at 2:14 PM, Guido van Rossum wrote:
> On Sun, Mar 1, 2009 at 1:35 PM, Chris Rebert wrote:
>> On Sun, Mar 1, 2009 at 12:57 PM, Christian Heimes wrote:
>>> Chris Rebert wrote:
>>>> While I still mostly like the idea, it does seem to undermine Python's
>>>> uniformity a bit.
>>>
>>> I played with both possible versions before I wrote the proposal. Both
>>> ways have their pros and cons. I'm preferring the proposed way::
>>>
>>>     with a, b as x, d as y:
>>>         ...
>>>
>>> over the other possibility::
>>>
>>>     with a, b, c as _, x, y:
>>>         ...
>>
>> You misunderstand me. My quibble isn't over the exact syntax (in fact,
>> I completely agree about the superiority of the proposed ordering),
>> but rather that we're introducing syntax to do something that can
>> already be done with a function (nested()) and is /extremely/ similar
>> to another case (parallel for-loop) where we are opting to still
>> require the use of a function (zip()). This proposed asymmetry
>> concerns me.
>
> Hm. While we can indeed write the equivalent of the proposed "with a,
> b:" today as "with nested(a, b):", I don't think that the situation is
> quite comparable to a for-loop over a zip() call. The nested() context
> manager isn't particularly intuitive to me (and Nick just found a
> problem in a corner case of its semantics). Compared to nested(), I
> find "with a, b:" very obvious as a shorthand for nested
> with-statements:
>
>     with a:
>         with b:
>             ...
>
> -- 
> --Guido van Rossum (home page: http://www.python.org/~guido/)

Ok, fine by me. Just wanted to ensure the point was brought up and
adequately responded to.

On another note, someone ought to draft a revision to PEP 343 to
document this proposal.

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com


From greg.ewing at canterbury.ac.nz  Sun Mar  1 23:32:21 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 02 Mar 2009 11:32:21 +1300
Subject: [Python-ideas] Revised**5 PEP on yield-from
In-Reply-To: <49AA7FEC.4090609@improva.dk>
References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk>
	<499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk>
	<49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk>
Message-ID: <49AB0CF5.6020900@canterbury.ac.nz>

Jacob Holm wrote:
>
> Arnaud Delobelle wrote:
>
>> * What you are measuring the time complexity of.
>
> The time for a single 'next', 'send', 'throw' or 'close' call to a
> generator or a single "yield from" expression,

How are you measuring 'time', though? Usually when discussing
time complexities we're talking about the number of some
fundamental operation being performed, and assuming that
all such operations take the same time. What operations
are you counting here?

> One interesting thing to note is that all non-contrived examples
> I have seen only build simple chains of iterators, and only do
> that by adding one at a time in the deepest nested one.
> This is the best possible case for my algorithm. If you stick to
> that, the time per operation is O(1).

I don't understand that limitation. If you keep a stack of
active generators, you always have constant-time access to the
one to be resumed next. There's some overhead for pushing and
popping the stack, but it follows exactly the same pattern
as the recursive calls you'd be making if you weren't using
some kind of yield-from, so it's irrelevant when comparing the
two.

-- 
Greg


From guido at python.org  Sun Mar  1 23:31:58 2009
From: guido at python.org (Guido van Rossum)
Date: Sun, 1 Mar 2009 14:31:58 -0800
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To: 
References: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com>
Message-ID: 

On Sun, Mar 1, 2009 at 12:49 PM, Daniel Stutzbach wrote:
> On Fri, Feb 27, 2009 at 1:20 PM, Guido van Rossum wrote:
>> On Fri, Feb 27, 2009 at 4:49 AM, Curt Hagenlocher wrote:
>> > That way lies madness.  What distinguishes "with" from other compound
>> > statements is that it's already about resource management in the face
>> > of possible exceptions.
>>
>> Still, a firm -1 from me. Once we have "try with" I'm sure people are
>> going to clamor for "try if", "try while", "try for", even (oh horror
>> :-) "try try". I don't think we should complicate the syntax just to
>> save one level of indentation occasionally.
>
> In addition to reasons outlined by Curt, "with" is unique because it's
> short-hand for a "try" block with a "finally" clause. Unfortunately,
> "with" doesn't allow for other clauses and so I often end up using both
> "try" and "with".
>
> Also, "try if", "try while", and "try for" wouldn't work because they
> already have a meaning for the "else" clause. "with" does not.

Sorry, but my gut keeps telling me that "try with" is not taking the
language into a direction I am comfortable with. Programming language
design is not a rational science.
Most reasoning about it is at best rationalization of gut feelings, and
at worst plain wrong. So, sorry, but I'm going with my gut feelings, so
it's still -1. (Or if you wish, -1000.)

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


From daniel at stutzbachenterprises.com  Sun Mar  1 23:53:09 2009
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Sun, 1 Mar 2009 16:53:09 -0600
Subject: [Python-ideas] "try with" syntactic sugar
In-Reply-To: 
References: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com>
Message-ID: 

On Sun, Mar 1, 2009 at 4:31 PM, Guido van Rossum wrote:
> Sorry, but my gut keeps telling me that "try with" is not taking the
> language into a direction I am comfortable with. Programming language
> design is not a rational science. Most reasoning about it is at best
> rationalization of gut feelings, and at worst plain wrong. So, sorry,
> but I'm going with my gut feelings, so it's still -1. (Or if you wish,
> -1000.)

I thought that might be the case based on your first response, but
figured I'd give it one more shot. ;-) I respect your opinion.
Consider it dropped.

-- 
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 


From greg.ewing at canterbury.ac.nz  Mon Mar  2 00:51:44 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 02 Mar 2009 12:51:44 +1300
Subject: [Python-ideas] Revised**7 PEP on Yield-From
Message-ID: <49AB1F90.7070201@canterbury.ac.nz>

I've made another couple of tweaks to the formal semantics (so as not
to over-specify when the iterator methods are looked up).
Latest version of the PEP, together with the prototype implementation
and other related material, is available here:

http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/

-- 
Greg


From jh at improva.dk  Mon Mar  2 01:09:14 2009
From: jh at improva.dk (Jacob Holm)
Date: Mon, 02 Mar 2009 01:09:14 +0100
Subject: [Python-ideas] Revised**5 PEP on yield-from
In-Reply-To: <49AB0CF5.6020900@canterbury.ac.nz>
References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk>
	<499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk>
	<49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk>
	<49AB0CF5.6020900@canterbury.ac.nz>
Message-ID: <49AB23AA.4060106@improva.dk>

Greg Ewing wrote:
> Jacob Holm wrote:
>>
>> Arnaud Delobelle wrote:
>>
>>> * What you are measuring the time complexity of.
>>
>> The time for a single 'next', 'send', 'throw' or 'close' call to a
>> generator or a single "yield from" expression,
>
> How are you measuring 'time', though? Usually when discussing
> time complexities we're talking about the number of some
> fundamental operation being performed, and assuming that
> all such operations take the same time. What operations
> are you counting here?

All operations are simple attribute lookups, assignments, addition and
the like. No hashing, no dynamic arrays, no magic, just plain ol'
fundamental operations as they would be in a C program :)

>> One interesting thing to note is that all non-contrived examples
>> I have seen only build simple chains of iterators, and only do
>> that by adding one at a time in the deepest nested one.
>
> I don't understand that limitation. If you keep a stack of
> active generators, you always have constant-time access to the
> one to be resumed next.

But that is exactly the point. This is not a simple stack.
After doing "yield from R" in generator A, you can go ahead and do
"yield from R" in generator B as well. If R is also using "yield from"
you will have the following situation:

    A
     \
      R --- (whatever R is waiting for)
     /
    B

Yes, A sees it as a stack [A,R,...], and B sees it as a stack
[B,R,...], but the part of the stack following R is the same in the
two. If R does a "yield from", the new iterator must appear at the top
of both "stacks". Efficiently getting to the top of this "shared
stack", and doing the equivalent of push, pop, ... is what I am trying
to achieve. And I think I have succeeded.

> There's some overhead for pushing and
> popping the stack, but it follows exactly the same pattern
> as the recursive calls you'd be making if you weren't using
> some kind of yield-from, so it's irrelevant when comparing the
> two.

This would be correct if it was somehow forbidden to create the
scenario I sketched above. As long as that scenario is possible I can
construct an example where treating it as a simple stack will either
do the wrong thing, or do the right thing but slower than a standard
"for v in it: yield v". Of course, you could argue that these examples
are contrived and we don't have to care about them. I just think that
we should do better if we can.

FWIW, here is the example from above in more detail...

def R():
    yield 'A'
    yield 'B'
    yield from xrange(3)  # we could do anything here really...
r = R()

def A():
    yield from r
a = A()

def B():
    yield from r
b = B()

a.next()  # returns 'A'
b.next()  # returns 'B'
r.next()  # returns 0
a.next()  # returns 1
b.next()  # returns 2

This is clearly legal based on the PEP, and it generalizes to a way of
building an arbitrary tree with suspended generators as nodes. What my
algorithm does is break this tree down into what is essentially a set
of stacks.
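Transcribed into the Python 3 spelling that "yield from" eventually became (range for xrange, next(g) for g.next()), the example runs exactly as annotated:

```python
def R():
    yield 'A'
    yield 'B'
    yield from range(3)   # we could do anything here really...

r = R()

def A():
    yield from r

def B():
    yield from r

a, b = A(), B()

# both a and b delegate to the same suspended generator r;
# whichever of a, b, or r is resumed, the next value comes from r
results = [next(a), next(b), next(r), next(a), next(b)]
print(results)  # -> ['A', 'B', 0, 1, 2]
```

Note how the values interleave across the three entry points, which is exactly the "shared stack" situation a simple per-generator stack cannot model.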
These stacks are managed such that the common case has just a single stack, and no matter how twisted your program is, the path from any generator to the root is split across at most O(logN) stacks (where N is the size of the tree). I hope this helps to explain what I think I have done... Best regards Jacob From guido at python.org Mon Mar 2 01:28:48 2009 From: guido at python.org (Guido van Rossum) Date: Sun, 1 Mar 2009 16:28:48 -0800 Subject: [Python-ideas] "try with" syntactic sugar In-Reply-To: References: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com> Message-ID: On Sun, Mar 1, 2009 at 2:53 PM, Daniel Stutzbach wrote: > On Sun, Mar 1, 2009 at 4:31 PM, Guido van Rossum wrote: >> >> Sorry, but my gut keeps telling me that "try with" is not taking the >> language into a direction I am comfortable with. Programming language >> design is not a rational science. Most reasoning about it is at best >> rationalization of gut feelings, and at worst plain wrong. So, sorry, >> but I'm going with my gut feelings, so it's still -1. (Or if you wish, >> -1000.) > > I thought that might be the case based on your first response, but figured > I'd give it one more shot. ;-) I respect your opinion. Consider it > dropped. Thanks. I've learned the hard way to trust my gut instincts in this area. If anyone can figure out how to explain my gut's responses to people wanting rational answers I'd love to hear about it.
:-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jh at improva.dk Mon Mar 2 01:35:10 2009 From: jh at improva.dk (Jacob Holm) Date: Mon, 02 Mar 2009 01:35:10 +0100 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <9A1B0516-AF3E-4BB2-99DA-16EA9F6E7F56@googlemail.com> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0522.1010106@improva.dk> <9A1B0516-AF3E-4BB2-99DA-16EA9F6E7F56@googlemail.com> Message-ID: <49AB29BE.1070508@improva.dk> Arnaud Delobelle wrote: >> * I don't think you can handle 'close' in accordance with the PEP, >> and fixing this does not look easy. > > I haven't thought about this - unfortunately probably won't have time > till next weekend :( :( >> * It seems like in your version, you only gain when only the >> outermost generator is decorated. (You conveniently modified the >> testing code to only decorate with co *after* building the long >> chain). You need to have *all* generators using FROM decorated, or >> disaster will strike as soon as someone calls next on one of the >> intermediate generators. >> > > I don't think this is really a problem. If yield-from was in the > language, this 'decoration' could be added automatically at > compile-time. I haven't really thought about it in detail however :) Agreed, but the purpose of the test is to see how it *would* behave if the 'decoration' happened at compile time. Stripping it off will only hide problems, such as the fact that there is no speed benefit at all... > >>> So the turning point is a depth of 10, after which my implementation >>> wins. At a depth of 90, cochild is about 10 times faster than child. >> As could be expected, since you also have O(1) in this case. However, >> if you decorate all the generators in the chain you lose no matter >> what N is... 
Most of the speed difference between our versions is >> probably due to the fact that I am using classes instead of >> generators to allow me to have everything decorated, and to handle >> some more convoluted cases as well. >> > > But decorating each generator would defeat the purpose! I guess one > way to find out would be to reimplement it with a class - see below! And what purpose is that, if I might ask? My two purposes were to a) test my algorithm for speeding up arbitrarily complex cases, and b) have a way to play with the "yield from" concept as specified in the PEP. It seems that your original purpose was similar to b). My claim is that using your 'FROM' outside a generator decorated with your 'co' has no equivalent in the PEP and might be a bad idea, and I gave an example where this gives a bad/unexpected result. >>> >>> I don't know if my implementation behaves well with other examples >>> you have in mind, I would like to know! >>> >> I'll write a few more tests tonight so you can see the kind of >> problems I am looking at. > > That would be good. If I think my method can handle them, then I might > reimplement it with a class - it wouldn't take very long but I have > very little time at the moment! > > Thanks, > It looks like it might not be tonight after all. I'll try to get to it some time during this week. Best regards Jacob From greg.ewing at canterbury.ac.nz Mon Mar 2 01:39:00 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 02 Mar 2009 13:39:00 +1300 Subject: [Python-ideas] Yield-from patches on Rietveld In-Reply-To: References: <49A13D45.7080608@canterbury.ac.nz> Message-ID: <49AB2AA4.903@canterbury.ac.nz> I have uploaded the patches for my yield-from implementation to Rietveld if anyone wants to take a look at them: http://codereview.appspot.com/20101/show This is the first time I've attempted to use Rietveld, so I hope I've got things right. 
-- Greg From greg.ewing at canterbury.ac.nz Mon Mar 2 02:09:20 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 02 Mar 2009 14:09:20 +1300 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49AB23AA.4060106@improva.dk> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> Message-ID: <49AB31C0.5000005@canterbury.ac.nz> Jacob Holm wrote: > After doing > "yield from R" in generator A, you can go ahead and do "yield from R" in > generator B as well. If R is also using "yield from" you will have the > following situation: > > A > \ > R --- (whatever R is waiting for) > / > B Unless you're doing something extremely unusual, this situation wouldn't arise. The yield-from statement is intended to run the iterator to exhaustion, and normally you'd create a fresh iterator for each yield-from that you want to do. So A and B would really be yielding from different iterators, even if they were both iterating over the same underlying object. If you did try to share iterators between yield-froms like that, you would have to arrange things so that at least all but one of them broke out of the loop early, otherwise something is going to get an exception due to trying to resume an exhausted iterator. But in any case, I think you can still model this as two separate stacks, with R appearing in both stacks: [A, R] and [B, R]. Whichever one of them finishes yielding from R first pops it from its stack, and when the other one tries to resume R it gets an exception. Either that or it breaks out of its yield-from early and discards its version of R. 
> As long as that scenario is possible I can construct > an example where treating it as a simple stack will either do the wrong > thing, or do the right thing but slower than a standard "for v in it: > yield v". That depends on what you think the "right thing" is. If you think that somehow A needs to notice that B has finished yielding from R and gracefully stop doing so itself, then that's not something I intended and it's not the way the current prototype implementation would behave. So IMO you're worrying about a problem that doesn't exist. -- Greg From jh at improva.dk Mon Mar 2 03:11:04 2009 From: jh at improva.dk (Jacob Holm) Date: Mon, 02 Mar 2009 03:11:04 +0100 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49AB31C0.5000005@canterbury.ac.nz> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> Message-ID: <49AB4038.6070205@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: >> After doing >> "yield from R" in generator A, you can go ahead and do "yield from R" >> in generator B as well. If R is also using "yield from" you will have >> the following situation: >> >> A >> \ >> R --- (whatever R is waiting for) >> / >> B > > Unless you're doing something extremely unusual, this situation > wouldn't arise. The yield-from statement is intended to run the > iterator to exhaustion, and normally you'd create a fresh iterator > for each yield-from that you want to do. So A and B would really > be yielding from different iterators, even if they were both > iterating over the same underlying object. Unusual, yes. Extremely? I'm not sure. If you/we allow this, someone will find a use for it. 
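For the record, the spelled-out example from the earlier message runs unchanged on Python 3.3+, where PEP 380's "yield from" exists, once `xrange` becomes `range` and `g.next()` becomes `next(g)`; it produces exactly the sequence claimed:

```python
def R():
    yield 'A'
    yield 'B'
    yield from range(3)  # we could do anything here really...

r = R()

def A():
    yield from r

def B():
    yield from r

a, b = A(), B()

# Both A and B delegate to the *same* suspended generator r.
results = [next(a), next(b), next(r), next(a), next(b)]
print(results)  # ['A', 'B', 0, 1, 2]
```

So the shared-subgenerator behavior being debated here is exactly what CPython 3.3+ ended up doing.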
> > If you did try to share iterators between yield-froms like that, > you would have to arrange things so that at least all but one > of them broke out of the loop early, otherwise something is > going to get an exception due to trying to resume an exhausted > iterator. Did you read the spelled-out version at the bottom? No need to "break out" of anything. That happens automatically because of the "yield from". Just a few well-placed calls to next... > > But in any case, I think you can still model this as two separate > stacks, with R appearing in both stacks: [A, R] and [B, R]. Whichever > one of them finishes yielding from R first pops it from its stack, > and when the other one tries to resume R it gets an exception. Either > that or it breaks out of its yield-from early and discards its > version of R. I am not worried about R running out, each of A and B would find out about that next time they tried to get a value. I *am* worried about R doing a yield-from to X (the xrange in this example) which then needs to appear in both stacks to get the expected behavior from the PEP. > > > As long as that scenario is possible I can construct >> an example where treating it as a simple stack will either do the >> wrong thing, or do the right thing but slower than a standard "for v >> in it: yield v". > > That depends on what you think the "right thing" is. If you > think that somehow A needs to notice that B has finished yielding > from R and gracefully stop doing so itself, then that's not something > I intended and it's not the way the current prototype implementation > would behave. The right thing is whatever the PEP specifies :) You are the author, so you get to decide... I am saying that what the PEP currently specifies is not quite so simple to speed up as you and Arnaud seem to think. (Even with a simple stack, handling 'close' and 'StopIteration' correctly is not exactly trivial) > > So IMO you're worrying about a problem that doesn't exist. 
No, I am worrying about a problem that so far has only appeared in contrived examples designed to expose it. Any real-life examples I have seen of the "yield from" feature would work perfectly well with a simple stack-based approach. However, I have seen several ideas for speeding up long chains of "yield from"s beyond the current C implementation, and most of them fail either by giving wrong results (bad) or by slowing things down in admittedly unusual cases (not so bad, but not good). Anyway... it is 3 in the morning. As I told Arnaud, I will try to find some time this week to write some more of these crazy examples. Regards Jacob From jimjjewett at gmail.com Mon Mar 2 04:05:16 2009 From: jimjjewett at gmail.com (Jim Jewett) Date: Sun, 1 Mar 2009 22:05:16 -0500 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49AB31C0.5000005@canterbury.ac.nz> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> Message-ID: On 3/1/09, Greg Ewing wrote: > Jacob Holm wrote: >> A >> \ >> R --- (whatever R is waiting for) >> / >> B >... normally you'd create a fresh iterator > for each yield-from that you want to do. So A and B would really > be yielding from different iterators, even if they were both > iterating over the same underlying object. I think the problem might come up with objects that are using the iterator protocol destructively. For example, imagine A and B as worker threads, and R as a work queue. > If you did try to share iterators between yield-froms like that, >... something is > going to get an exception due to trying to resume an exhausted > iterator. I would expect that to be interpreted as a StopIteration and handled gracefully.
If that doesn't seem reasonable, then I wonder if the whole protocol is still too fragile. -jJ From jimjjewett at gmail.com Mon Mar 2 04:34:48 2009 From: jimjjewett at gmail.com (Jim Jewett) Date: Sun, 1 Mar 2009 22:34:48 -0500 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49AB4038.6070205@improva.dk> References: <499DDA4C.8090906@canterbury.ac.nz> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> <49AB4038.6070205@improva.dk> Message-ID: On 3/1/09, Jacob Holm wrote: > Greg Ewing wrote: >> Jacob Holm wrote: [naming that for which R waits] >>> A >>> \ >>> R --- X (=whatever R is waiting for) >>> / >>> B >> But in any case, I think you can still model this as two separate >> stacks, with R appearing in both stacks: [A, R] and [B, R]. [A, R, X] and [B, R, X] >> Whichever >> one of them finishes yielding from R first pops it from its stack, >> and when the other one tries to resume R it gets an exception. Either >> that or it breaks out of its yield-from early and discards its >> version of R. > I am not worried about R running out, each of A and B would find out > about that next time they tried to get a value. I *am* worried about R > doing a yield-from to X (the xrange in this example) which then needs to > appear in both stacks to get the expected behavior from the PEP. I would assume that the second one tries to resume, gets the StopIteration from X, retreats to R, gets the StopIteration there as well, and then continues after its yield-from. If that didn't happen, I would wonder whether (theoretical speed) optimization was leading to suboptimal semantics. 
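This expectation matches the documented iterator contract: an exhausted generator keeps raising StopIteration, and a delegating generator treats that as the normal end of its "yield from". A minimal sketch on Python 3.3+ (the names are made up for illustration):

```python
def X():
    yield 0

x = X()
assert list(x) == [0]  # exhaust the shared iterator
assert list(x) == []   # resuming it again just raises StopIteration

def delegator(it):
    yield from it  # an already-exhausted iterator ends this immediately...
    yield 'after'  # ...and execution continues past the yield from

out = list(delegator(x))
print(out)  # ['after']
```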
>> > As long as that scenario is possible I can construct >>> an example where treating it as a simple stack will either do the >>> wrong thing, or do the right thing but slower than a standard "for v >>> in it: yield v". I have to wonder whether any optimization will be a mistake. At the moment, I can't think of any way to do it without adding at least an extra pointer and an extra if-test. That isn't much, but ... how often will there be long chains, vs how often are generators used without getting any benefit from this sort of delegation? -jJ From greg.ewing at canterbury.ac.nz Mon Mar 2 05:09:24 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 02 Mar 2009 17:09:24 +1300 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49AB4038.6070205@improva.dk> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> <49AB4038.6070205@improva.dk> Message-ID: <49AB5BF4.6020203@canterbury.ac.nz> Jacob Holm wrote: > Did you read the spelled-out version at the bottom? No need to "break > out" of anything. That happens automatically because of the "yield > from". Just a few well-placed calls to next... Yes, and I tried running it on my prototype implementation. It gives exactly the results you suggested, and any further next() calls on a or b raise StopIteration. You're right that there's no need to break out early in the case of generators, since it seems they just continue to raise StopIteration if you call next() on them after they've finished. Other kinds of iterators might not be so forgiving. > I am not worried about R running out, each of A and B would find out > about that next time they tried to get a value. 
I *am* worried about R > doing a yield-from to X (the xrange in this example) which then needs to > appear in both stacks to get the expected behavior from the PEP. What's wrong with it appearing in both stacks, though, as long as it gives the right result? > I am saying that what the PEP currently specifies is not quite so simple > to speed up as you and Arnaud seem to think. > (Even with a simple stack, handling 'close' and 'StopIteration' > correctly is not exactly trivial) It's a bit tricky to handle all the cases correctly, but that has nothing to do with speed. I should perhaps point out that neither of my suggested implementations actually use separate stacks like this. The data structure is really more like the shared stack you suggest, except that it's accessed from the "bottoms" rather than the "top". This is only a speed issue if the time taken to find the "top" starting from one of the "bottoms" is a significant component of the total running time. My conjecture is that it won't be, especially if you do it iteratively in a tight C loop. Some timing experiments I did suggest that the current implementation (which finds the "top" using recursion in C rather than iteration) is at least 20 times faster at delegating a next() call than using a for-loop, which is already a useful improvement, and the iterative method can only make it better. So until someone demonstrates that the simple algorithm I'm using is too slow in practice, I don't see much point in trying to find a smarter one. 
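The kind of measurement described here can be approximated on a modern Python (3.3+, where "yield from" is built in) with a timeit sketch like the one below. `make_chain` and the sizes are illustrative assumptions, not the original test harness, and absolute numbers are machine-dependent; the point is only how delegation cost grows with chain depth:

```python
import timeit

def make_chain(depth, use_yield_from):
    """Wrap iter(range(200)) in `depth` delegating generators."""
    it = iter(range(200))
    for _ in range(depth):
        if use_yield_from:
            def wrap(inner):
                yield from inner
        else:
            def wrap(inner):
                for v in inner:
                    yield v
        it = wrap(it)
    return it

# Compare draining a 20-deep chain built each way.
for label, flag in (('for-loop', False), ('yield from', True)):
    t = timeit.timeit(lambda: sum(make_chain(20, flag)), number=100)
    print('%-10s %.4f' % (label, t))
```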
-- Greg From greg.ewing at canterbury.ac.nz Mon Mar 2 05:21:51 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 02 Mar 2009 17:21:51 +1300 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> Message-ID: <49AB5EDF.6070903@canterbury.ac.nz> Jim Jewett wrote: > I would expect that to be interpreted as a StopIteration and handled > gracefully. If that doesn't seem reasonable, then I wonder if the > whole protocol is still too fragile. I've just done what I should have done in the first place and checked the documentation. From the Library Reference, section 3.5, Iterator Types: "The intention of the protocol is that once an iterator's next() method raises StopIteration, it will continue to do so on subsequent calls. Implementations that do not obey this property are deemed broken. (This constraint was added in Python 2.3; in Python 2.2, various iterators are broken according to this rule.)" So according to modern-day rules at least, Jacob's example will work fine. -- Greg From greg.ewing at canterbury.ac.nz Mon Mar 2 05:37:24 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 02 Mar 2009 17:37:24 +1300 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: References: <499DDA4C.8090906@canterbury.ac.nz> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> <49AB4038.6070205@improva.dk> Message-ID: <49AB6284.7020202@canterbury.ac.nz> Jim Jewett wrote: > how often will there be long chains, My suspicion is not very often. 
The timing tests I did suggest that the biggest benefit will be from simply removing most of the delegation overhead from a single level of delegation, and you don't need any fancy algorithm for that. My experiments with traversing binary trees suggest that the delegation overhead isn't all that great a problem until the tree is about 20 levels deep, or 1e6 nodes. Even then it doesn't exactly kill you, but even so, my naive implementation shows a clear improvement. -- Greg From denis.spir at free.fr Mon Mar 2 08:16:28 2009 From: denis.spir at free.fr (spir) Date: Mon, 2 Mar 2009 08:16:28 +0100 Subject: [Python-ideas] "try with" syntactic sugar In-Reply-To: References: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com> Message-ID: <20090302081628.502bc1ee@o> Le Sun, 1 Mar 2009 14:31:58 -0800, Guido van Rossum s'exprima ainsi: > Programming language > design is not a rational science. Most reasoning about it is at best > rationalization of gut feelings, and at worst plain wrong. ! ;-) Denis ------ la vita e estrany PS: Time for starting a "Quotes" section at http://en.wikipedia.org/wiki/Guido_van_Rossum ? From pyideas at rebertia.com Mon Mar 2 08:18:48 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Sun, 1 Mar 2009 23:18:48 -0800 Subject: [Python-ideas] "try with" syntactic sugar In-Reply-To: <20090302081628.502bc1ee@o> References: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com> <20090302081628.502bc1ee@o> Message-ID: <50697b2c0903012318t2fc80318vf6d14bfefa5dc125@mail.gmail.com> On Sun, Mar 1, 2009 at 11:16 PM, spir wrote: > Le Sun, 1 Mar 2009 14:31:58 -0800, > Guido van Rossum s'exprima ainsi: > >> Programming language >> design is not a rational science. Most reasoning about it is at best >> rationalization of gut feelings, and at worst plain wrong. +1 QOTW!
- Chris -- Shameless self-promotion: http://blog.rebertia.com From g.brandl at gmx.net Mon Mar 2 09:10:41 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 02 Mar 2009 09:10:41 +0100 Subject: [Python-ideas] with statement: multiple context manager In-Reply-To: References: <52dc1c820903011301p631cefc2l2300912b88b2e890@mail.gmail.com> <76fd5acf0903011315l4309ee06xb65c303eea44a828@mail.gmail.com> Message-ID: Guido van Rossum schrieb: > On Sun, Mar 1, 2009 at 1:15 PM, Calvin Spealman wrote: >> On Sun, Mar 1, 2009 at 4:01 PM, Gregory P. Smith wrote: >>> Alternatively if closer conformity with for loop syntax is desirable >>> consider this: >>> with lock, open(infile), open(outfile) as lock, fin, fout: >>> fout.fwrite(fin.read()) >> >> +1 >> >> We don't have multi-assignment statements in favor of the unpacking >> concept, and I think it carries over here. Also, as mentioned, this >> goes along with the lack of any multi-for statement. The `x as y` part >> of the with statement is basically an assignment with extras, and the >> original suggestion then combines multiple assignments on one line. >> This option, I think, is more concise and readable. > > -1 for this variant. The syntactic model is import: import foo as bar, > bletch, quuz as frobl. If we're doing this it should be like this. Also, the "a as b, c as d" syntax better conveys the fact that the managers are called sequentially, and not somehow "in parallel". Georg From clockworksaint at gmail.com Mon Mar 2 10:22:10 2009 From: clockworksaint at gmail.com (Weeble) Date: Mon, 2 Mar 2009 09:22:10 +0000 Subject: [Python-ideas] Syntax for curried functions Message-ID: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Would this be crazy? Make this: def spam(a,b,c)(d,e,f): [BODY] a synonym for: def spam(a,b,c): def spam2(d,e,f): [BODY] return spam2 As far as I can tell, it would always be illegal syntax right now, so it seems a safe change. 
And it would make it cleaner to write higher order functions. While I had the idea bouncing round my head, I realised I would quite like the syntax for writing methods too. E.g.: class Eggs(object): def spam(self)(a,b,c): [BODY] Of course, that wouldn't work unless you wrote and used a special meta-class to change how bound methods work (instead of creating a bound method with the instance and function, you'd call the "method function" with the instance and get back the "bound method"), and I have no idea, but I guess it might slow things down. But I thought it was worth mentioning because it looked cool. Anyway, that's tangential to the idea. The idea is that any (positive) number of parameter lists could follow the method name in its declaration. The function takes as input its first parameter list and returns a function that takes the second parameter list, etc., until there are no more method lists and the last level of function executes and returns the result of the function body. From steve at pearwood.info Mon Mar 2 13:09:57 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 02 Mar 2009 23:09:57 +1100 Subject: [Python-ideas] Syntax for curried functions In-Reply-To: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Message-ID: <49ABCC95.2050503@pearwood.info> Weeble wrote: > Would this be crazy? Make this: > > def spam(a,b,c)(d,e,f): > [BODY] > > a synonym for: > > def spam(a,b,c): > def spam2(d,e,f): > [BODY] > return spam2 > > As far as I can tell, it would always be illegal syntax right now, so > it seems a safe change. And it would make it cleaner to write higher > order functions. I don't think it would. Your proposed syntax is only useful in the case that the outer function has no doc string and there's no pre-processing or post-processing of the inner function, including decorators. 
In practice, when I write higher order functions, I often do something like this example:

# untested
import functools

def spam_factory(n):
    """Factory returning decorators that print 'spam' n times."""
    msg = "spam " * n
    if n > 5:
        msg += 'WONDERFUL SPAM!!!'
    def decorator(func):
        """Print 'spam' %d times."""
        @functools.wraps(func)
        def f(*args, **kwargs):
            print msg
            return func(*args, **kwargs)
        return f
    decorator.__doc__ = decorator.__doc__ % n
    return decorator

Although the function as given is contrived, the basic structure (including doc strings, decorators, pre-processing and post-processing) is realistic, and your suggested syntax wouldn't be useful here. -- Steven From sturla at molden.no Mon Mar 2 13:21:09 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 02 Mar 2009 13:21:09 +0100 Subject: [Python-ideas] with statement: multiple context manager In-Reply-To: References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com> <50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com> Message-ID: <49ABCF35.5030002@molden.no> On 3/1/2009 9:57 PM, Christian Heimes wrote: > with a, b as x, d as y: I'd like to add that parentheses improve readability here:

with a, (b as x), (d as y):

I am worried the proposed syntax could be a source of confusion and errors. E.g. when looking at

with a,b as c,d:

my eyes read

with nested(a,b) as c,d:

when Python would read

with a,(b as c),d:

It may actually be better to keep the current implementation with contextlib.nested. If contextlib.nested is not well known (I only learned of its existence recently), maybe it should be better documented? Tutorial examples of the with statement should cover contextlib.nested as well.
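As a historical footnote: the multi-manager "with" discussed in this thread did land in Python 2.7/3.1, contextlib.nested was deprecated (it cannot clean up correctly if a later manager's constructor raises), and contextlib.ExitStack (3.3) became the tool for a variable number of managers. A sketch with made-up temp-file names:

```python
import contextlib
import os
import tempfile

tmp = tempfile.mkdtemp()
src = os.path.join(tmp, 'in.txt')
dst = os.path.join(tmp, 'out.txt')
with open(src, 'w') as f:
    f.write('spam')

# The multi-manager form this thread anticipated (2.7/3.1+):
with open(src) as fin, open(dst, 'w') as fout:
    fout.write(fin.read())

# ExitStack handles a variable number of managers safely:
with contextlib.ExitStack() as stack:
    files = [stack.enter_context(open(p)) for p in (src, dst)]
    print([f.read() for f in files])  # ['spam', 'spam']
```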
Sturla Molden From dangyogi at gmail.com Mon Mar 2 15:45:56 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Mon, 02 Mar 2009 09:45:56 -0500 Subject: [Python-ideas] Syntax for curried functions In-Reply-To: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Message-ID: <49ABF124.2000907@gmail.com> Weeble wrote: > Would this be crazy? Make this: > > def spam(a,b,c)(d,e,f): > [BODY] > > a synonym for: > > def spam(a,b,c): > def spam2(d,e,f): > [BODY] > return spam2 > Why not:

import functools

def curry(n):
    def decorator(fn):
        @functools.wraps(fn)
        def surrogate(*args):
            if len(args) != n:
                raise TypeError("%s() takes exactly %d arguments (%d given)"
                                % (fn.__name__, n, len(args)))
            def curried(*rest, **kws):
                return fn(*(args + rest), **kws)
            return curried
        return surrogate
    return decorator

@curry(3)
def spam(a,b,c,d,e,f):
    print a,b,c,d,e,f

spam(1,2,3)(4,5,6)

-bruce frederiksen From guido at python.org Mon Mar 2 16:25:03 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 2 Mar 2009 07:25:03 -0800 Subject: [Python-ideas] "try with" syntactic sugar In-Reply-To: <50697b2c0903012318t2fc80318vf6d14bfefa5dc125@mail.gmail.com> References: <9bfc700a0902270249w70ae5740yeec1259728867d03@mail.gmail.com> <20090302081628.502bc1ee@o> <50697b2c0903012318t2fc80318vf6d14bfefa5dc125@mail.gmail.com> Message-ID: On Sun, Mar 1, 2009 at 11:18 PM, Chris Rebert wrote: > On Sun, Mar 1, 2009 at 11:16 PM, spir wrote: >> Le Sun, 1 Mar 2009 14:31:58 -0800, >> Guido van Rossum s'exprima ainsi: >> >>> Programming language >>> design is not a rational science. Most reasoning about it is at best >>> rationalization of gut feelings, and at worst plain wrong. > > +1 QOTW! Let me add that this formulation was at least in part inspired by reading "How we decide" by Jonah Lehrer.
http://www.amazon.com/How-We-Decide-Jonah-Lehrer/dp/0618620117 -- --Guido van Rossum (home page: http://www.python.org/~guido/) From Scott.Daniels at Acm.Org Mon Mar 2 18:05:54 2009 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Mon, 02 Mar 2009 09:05:54 -0800 Subject: [Python-ideas] Syntax for curried functions In-Reply-To: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Message-ID: Weeble wrote: > Would this be crazy? Make this: > > def spam(a,b,c)(d,e,f): > [BODY] > > a synonym for: > > def spam(a,b,c): > def spam2(d,e,f): > [BODY] > return spam2 What advantage does this style have over: def spam(a, b, c, d, e, f): [BODY] spam3 = functools.partial(spam, 'initial', 'three', 'args') --Scott David Daniels Scott.Daniels at Acm.Org From sturla at molden.no Mon Mar 2 18:06:55 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 02 Mar 2009 18:06:55 +0100 Subject: [Python-ideas] Syntax for curried functions In-Reply-To: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Message-ID: <49AC122F.4090403@molden.no> On 3/2/2009 10:22 AM, Weeble wrote: > Would this be crazy? Make this: > > def spam(a,b,c)(d,e,f): > [BODY] > > a synonym for: > > def spam(a,b,c): > def spam2(d,e,f): > [BODY] > return spam2 This is atrocious. From clockworksaint at gmail.com Mon Mar 2 20:07:14 2009 From: clockworksaint at gmail.com (Weeble) Date: Mon, 2 Mar 2009 19:07:14 +0000 Subject: [Python-ideas] Syntax for curried functions Message-ID: <13e3f9930903021107n5deac6e8lfcf3574e9c9e3a4f@mail.gmail.com> Steven D'Aprano wrote: > Weeble wrote: > > Would this be crazy? 
Make this: > > def spam(a,b,c)(d,e,f): > > [BODY] > > a synonym for: > > def spam(a,b,c): > > def spam2(d,e,f): > > [BODY] > > return spam2 > > > > As far as I can tell, it would always be illegal syntax right now, so > > it seems a safe change. And it would make it cleaner to write higher > > order functions. > > I don't think it would. Your proposed syntax is only useful in the case > that the outer function has no doc string and there's no pre-processing > or post-processing of the inner function, including decorators. Fair enough. I knew I must be overlooking something. Sturla Molden wrote: > This is atrocious. :'( Sorry. I thought it might be a bit of a crazy idea. I didn't think it was that bad! From adam at atlas.st Tue Mar 3 00:05:25 2009 From: adam at atlas.st (Adam Atlas) Date: Mon, 2 Mar 2009 18:05:25 -0500 Subject: [Python-ideas] Syntax for curried functions In-Reply-To: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Message-ID: On 2 Mar 2009, at 04:22, Weeble wrote: > Would this be crazy? Make this: > > def spam(a,b,c)(d,e,f): > [BODY] > > a synonym for: > > def spam(a,b,c): > def spam2(d,e,f): > [BODY] > return spam2 I once proposed (as far as I can tell) the exact same thing. Here's the discussion that took place -- http://markmail.org/message/aa22tnx2vog3rnin I still like the idea, but it doesn't appear to be very popular. From clockworksaint at gmail.com Tue Mar 3 01:22:40 2009 From: clockworksaint at gmail.com (Weeble) Date: Mon, 2 Mar 2009 16:22:40 -0800 (PST) Subject: [Python-ideas] Syntax for curried functions In-Reply-To: References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Message-ID: On Mar 2, 11:05 pm, Adam Atlas wrote: > I once proposed (as far as I can tell) the exact same thing. Here's
> the discussion that took place --http://markmail.org/message/aa22tnx2vog3rnin > > I still like the idea, but it doesn't appear to be very popular. Thank you, at least I feel a little less foolish now. I did try to search for such a proposal, but it's hard to know what search terms to use. To be honest I don't think it's generally useful enough to merit a change to the language, but somehow the idea floated into my head and it just seemed so *neat* that I had to tell somebody. It may be that it just appeals to my sense of aesthetics. From greg.ewing at canterbury.ac.nz Tue Mar 3 21:56:56 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 04 Mar 2009 09:56:56 +1300 Subject: [Python-ideas] Syntax for curried functions In-Reply-To: References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Message-ID: <49AD9998.7050009@canterbury.ac.nz> Weeble wrote: > I don't think it's generally useful enough to merit > a change to the language, but somehow the idea floated into my head > and it just seemed so *neat* that I had to tell somebody. The designers of at least one other language seem to think it's neat, too. Scheme has an exactly analogous construct: (define ((f x) y) ...) which is shorthand for (define f (lambda (x) (lambda (y) ...))) [The *really* neat thing about the Scheme version is the way it just naturally falls out of the macro expansion of 'define' in terms of 'lambda' -- so it's not really a separate language feature at all!] -- Greg From guido at python.org Tue Mar 3 23:22:25 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 3 Mar 2009 14:22:25 -0800 Subject: [Python-ideas] Syntax for curried functions In-Reply-To: <49AD9998.7050009@canterbury.ac.nz> References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> <49AD9998.7050009@canterbury.ac.nz> Message-ID: Haskell has this too, perhaps even more extreme: there's not really such a thing in Haskell as a function of N arguments (N > 1). "f a b = ..." 
defines a function f of one argument a which returns another function ("f a") of one argument b. And so on. That doesn't mean we need to copy this idea in Python. On Tue, Mar 3, 2009 at 12:56 PM, Greg Ewing wrote: > Weeble wrote: > >> I don't think it's generally useful enough to merit >> a change to the language, but somehow the idea floated into my head >> and it just seemed so *neat* that I had to tell somebody. > > The designers of at least one other language seem to > think it's neat, too. Scheme has an exactly analogous > construct: > > ?(define ((f x) y) > ? ?...) > > which is shorthand for > > ?(define f > ? ?(lambda (x) > ? ? ?(lambda (y) ...))) > > [The *really* neat thing about the Scheme version is > the way it just naturally falls out of the macro expansion > of 'define' in terms of 'lambda' -- so it's not really > a separate language feature at all!] > > -- > Greg > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Wed Mar 4 00:06:50 2009 From: aahz at pythoncraft.com (Aahz) Date: Tue, 3 Mar 2009 15:06:50 -0800 Subject: [Python-ideas] Syntax for curried functions In-Reply-To: References: <13e3f9930903020122p744952a3v7cbc3a3446cf27f4@mail.gmail.com> Message-ID: <20090303230650.GA10398@panix.com> On Mon, Mar 02, 2009, Weeble wrote: > On Mar 2, 11:05?pm, Adam Atlas wrote: >> >> I once proposed (as far as I can tell) the exact same thing. Here's ? >> the discussion that took place --http://markmail.org/message/aa22tnx2vog3rnin >> >> I still like the idea, but it doesn't appear to be very popular. > > Thank you, at least I feel a little less foolish now. I did try to > search for such a proposal, but it's hard to know what search terms to > use. 
To be honest I don't think it's generally useful enough to merit > a change to the language, but somehow the idea floated into my head > and it just seemed so *neat* that I had to tell somebody. It may be > that it just appeals to my sense of aesthetics. Don't worry too much about it -- part of the point of python-ideas is to provide an appropriate forum for trial balloons like this. It's only foolish when you haven't done your homework, especially for issues that have previously been brought up ad nauseam. It's also a good idea to keep in mind that even when people respond with "ewww, yuck!" they're attacking your idea, not you (yes, I know how difficult that can be ;-). -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. From leif.walsh at gmail.com Wed Mar 4 01:37:33 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Tue, 3 Mar 2009 19:37:33 -0500 (EST) Subject: [Python-ideas] Syntax for curried functions In-Reply-To: Message-ID: On Mon, Mar 2, 2009 at 7:22 PM, Weeble wrote: > Thank you, at least I feel a little less foolish now. I did try to > search for such a proposal, but it's hard to know what search terms to > use. To be honest I don't think it's generally useful enough to merit > a change to the language, but somehow the idea floated into my head > and it just seemed so *neat* that I had to tell somebody. It may be > that it just appeals to my sense of aesthetics. If it makes you feel better, you can still get almost all of the way there with decorators, and it's even "officially" documented: http://wiki.python.org/moin/PythonDecoratorLibrary#Pseudo-currying -- Cheers, Leif -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 270 bytes Desc: OpenPGP digital signature URL: From kornelpal at gmail.com Wed Mar 4 10:21:33 2009 From: kornelpal at gmail.com (=?UTF-8?B?S29ybsOpbCBQw6Fs?=) Date: Wed, 4 Mar 2009 10:21:33 +0100 Subject: [Python-ideas] Python Bytecode Verifier Message-ID: <9440ace50903040121o22e86ad4sd354af4030d2d922@mail.gmail.com> Hi, I've created a Python Bytecode Verifier in Python. I'm not a Python guru so I borrowed coding patterns from C/C++. I also created this with C portability in mind. The only reason I used Python was to experiment with Python, and it was easier to morph code during development. If this program were ported to C it would only need 8 bytes per opcode (some additional storage to track blocks) and a single pass. I haven't found backward jumps to previously unused code in any compiled Python code, but they can easily be supported. In that case some more partial passes are required. I was also able to successfully verify all Python files in the Python-2.5.2 source package. The verification algorithm should be quite complete, but I may have missed some limitations of the interpreter that could be checked by the verifier as well. The ability to create this verifier proved that although Python bytecode is designed for a dynamically typed interpreter, it is still easily verifiable. I am willing to port this code to C, but only if there is a chance of it being included in Python. I believe that Python in general would benefit from having the ability to safely load .pyc files and create code objects on the fly. Both Java and .NET have the ability to safely load compiled byte code. The .NET Framework, just like Python, also has the ability to create and execute new code at run-time. You may feel that enabling closed-source applications and/or creating a multi-language runtime would hurt Python, but both of these have contributed to the success of Java (both the language and the runtime). 
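[As a rough illustration of the kind of checks a bytecode verifier performs, here is a small sketch using the stdlib ``dis`` module in present-day Python. This is not Kornél's verifier.py -- the function and the two checks shown (unknown opcodes, out-of-range jump targets) are illustrative only.]

```python
import dis

def check_code(code):
    """Return (offset, problem) pairs for two checks a bytecode
    verifier would perform: unknown opcodes, and jump targets that
    do not land on an instruction boundary."""
    instructions = list(dis.get_instructions(code))
    offsets = {ins.offset for ins in instructions}
    problems = []
    for ins in instructions:
        # dis renders opcodes it does not recognize as '<NNN>'
        if ins.opname.startswith('<'):
            problems.append((ins.offset, 'unknown opcode'))
        # for jump instructions, argval is the resolved target offset
        elif 'JUMP' in ins.opname and ins.argval not in offsets:
            problems.append((ins.offset, 'bad jump target'))
    return problems

def example(x):
    if x > 0:
        return x
    return -x

print(check_code(example.__code__))  # well-formed bytecode: []
```

A real verifier, like the one attached above, additionally has to model the value stack and block stack per opcode; this sketch only shows the control-flow half of the idea.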
Kornél -------------- next part -------------- A non-text attachment was scrubbed... Name: verifier.py Type: application/octet-stream Size: 22908 bytes Desc: not available URL: From aahz at pythoncraft.com Wed Mar 4 15:49:22 2009 From: aahz at pythoncraft.com (Aahz) Date: Wed, 4 Mar 2009 06:49:22 -0800 Subject: [Python-ideas] Python Bytecode Verifier In-Reply-To: <9440ace50903040121o22e86ad4sd354af4030d2d922@mail.gmail.com> References: <9440ace50903040121o22e86ad4sd354af4030d2d922@mail.gmail.com> Message-ID: <20090304144922.GA2645@panix.com> On Wed, Mar 04, 2009, Kornél Pál wrote: > > I've created a Python Bytecode Verifier in Python. I'm not a Python > guru so I borrowed coding patterns from C/C++. I also created this > with C portability in mind. The only reason I used Python was to > experiment with Python and was easier to morph code during > development. You should upload this to PyPI. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Weinberg's Second Law: If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization. From rrr at ronadam.com Thu Mar 5 19:31:38 2009 From: rrr at ronadam.com (Ron Adam) Date: Thu, 05 Mar 2009 12:31:38 -0600 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49AB5BF4.6020203@canterbury.ac.nz> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> <49AB4038.6070205@improva.dk> <49AB5BF4.6020203@canterbury.ac.nz> Message-ID: <49B01A8A.1010907@ronadam.com> Greg Ewing wrote: > This is only a speed issue if the time taken to find the > "top" starting from one of the "bottoms" is a significant > component of the total running time. 
My conjecture is that > it won't be, especially if you do it iteratively in a > tight C loop. Could it be possible to design it so that the yield path of a generator is passed down or forwarded to sub-generators when they are called? If that can be done then no matter how deep they get, it works as if it is always only one generator deep. So... result = yield from sub_generator Might be sugar for ... (very rough example) sub_generator.forward(this_generator_yield_path) # set yield path sub_generator.run() # pass control to sub-generator result = sub_generator.return_value # get return value if set Of course the plumbing in this may take some creative rewriting of generators so the yield path can be passed around. Being able to get the yield path might also be useful in other ways, such as error reporting. Cheers, Ron From greg.ewing at canterbury.ac.nz Thu Mar 5 21:34:32 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 06 Mar 2009 09:34:32 +1300 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49B01A8A.1010907@ronadam.com> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> <49AB4038.6070205@improva.dk> <49AB5BF4.6020203@canterbury.ac.nz> <49B01A8A.1010907@ronadam.com> Message-ID: <49B03758.6090703@canterbury.ac.nz> Ron Adam wrote: > Could it be possible to design it so that the yield path of generators > are passed down or forwarded to sub-generators when they are called? It's really the other way around -- you would need some way of passing the yield path *up* to a generator's caller. E.g. if A is yielding from B is yielding from C, then A needs a direct path to C. 
Possibly some scheme could be devised to do this, but I don't feel like spending any brain cycles on it until the current scheme is shown to be too slow in practice. Premature optimisation and all that. -- Greg From rrr at ronadam.com Fri Mar 6 03:58:17 2009 From: rrr at ronadam.com (Ron Adam) Date: Thu, 05 Mar 2009 20:58:17 -0600 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49B03758.6090703@canterbury.ac.nz> References: <499DDA4C.8090906@canterbury.ac.nz> <499F6053.40407@improva.dk> <499F93A9.9070500@canterbury.ac.nz> <499FC423.6080500@improva.dk> <49A9A488.4070308@improva.dk> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> <49AB4038.6070205@improva.dk> <49AB5BF4.6020203@canterbury.ac.nz> <49B01A8A.1010907@ronadam.com> <49B03758.6090703@canterbury.ac.nz> Message-ID: <49B09149.5020001@ronadam.com> Greg Ewing wrote: > Ron Adam wrote: > >> Could it be possible to design it so that the yield path of generators >> are passed down or forwarded to sub-generators when they are called? > > It's really the other way around -- you would need some way > of passing the yield path *up* to a generator's caller. > > E.g. if A is yielding from B is yielding from C, then A needs > a direct path to C. So when A.next() is called it in effect does a C.next() instead. Is that correct? And when C's yield statement is executed, it needs to return the value to A's caller. So the .next() methods need to be passed up, while the yield return path needs to be passed down? OK, I guess I need to look at some byte/source code. ;-) > Possibly some scheme could be devised to do this, but I don't > feel like spending any brain cycles on it until the current > scheme is shown to be too slow in practice. Premature > optimisation and all that. Right I agree. I have an intuitive feeling that being able to expose and redirect caller and return values may useful in more ways than just generators. 
ie.. general functions, translators, encoders, decorators, and possibly method resolution. then again, there may not be any easy obvious way to do it. Ron From arnodel at googlemail.com Fri Mar 6 11:11:24 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Fri, 6 Mar 2009 10:11:24 +0000 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <49B09149.5020001@ronadam.com> References: <499DDA4C.8090906@canterbury.ac.nz> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> <49AB4038.6070205@improva.dk> <49AB5BF4.6020203@canterbury.ac.nz> <49B01A8A.1010907@ronadam.com> <49B03758.6090703@canterbury.ac.nz> <49B09149.5020001@ronadam.com> Message-ID: <9bfc700a0903060211l64930745t21d130beb7b5a13f@mail.gmail.com> 2009/3/6 Ron Adam : > > > Greg Ewing wrote: [...] > So when A.next() is called it in effect does a C.next() instead. Is that > correct? ?And when C's yield statement is executed, it needs to return the > value to A's caller. This is what the toy python implementation that I posted last month does. I think Jacob Holm did something more sophisticated that does this as well (I haven't seen it!). > So the .next() methods need to be passed up, while the yield return path > needs to be passed down? Can you draw a picture? 
:) -- Arnaud From ben+python at benfinney.id.au Fri Mar 6 11:37:23 2009 From: ben+python at benfinney.id.au (Ben Finney) Date: Fri, 06 Mar 2009 21:37:23 +1100 Subject: [Python-ideas] Cross-platform file locking, PID files , and the "daemon" PEP References: <21108.1233038783@pippin.parc.xerox.com> <87wschyykf.fsf@benfinney.id.au> <20090127090058.GA302@phd.pp.ru> <20090127160643.GC37589@wind.teleri.net> <20090127161933.GB28125@phd.pp.ru> <497F351D.30700@timgolden.me.uk> <4222a8490901270830l275a7e55t1514e51944d86089@mail.gmail.com> <87eiyozbus.fsf@benfinney.id.au> <20090128012906.GH57568@wind.teleri.net> <87ab9cyy42.fsf@benfinney.id.au> <20090128030351.GD64950@wind.teleri.net> <871vuoyuxg.fsf@benfinney.id.au> <87k58gdpgl.fsf@xemacs.org> <874ozhtq3c.fsf_-_@benfinney.id.au> <87ocxpcpxz.fsf@xemacs.org> Message-ID: <87fxhr544c.fsf@benfinney.id.au> "Stephen J. Turnbull" writes: > Ben Finney writes: > > > Skip Montanaro has a PyPI package, ?lockfile? (currently marked > > "Beta") that has such > > ambitions. > > I would send your proposal for PIDlockfile to Skip. [?] For those people who have asked about the status of this: Yes, I'm currently working with Skip on the ?lockfile? package so that it can be used for the ?daemon? implementation. -- \ ?Following fashion and the status quo is easy. Thinking about | `\ your users' lives and creating something practical is much | _o__) harder.? 
?Ryan Singer, 2008-07-09 | Ben Finney From jh at improva.dk Fri Mar 6 13:20:25 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 06 Mar 2009 13:20:25 +0100 Subject: [Python-ideas] Revised**5 PEP on yield-from In-Reply-To: <9bfc700a0903060211l64930745t21d130beb7b5a13f@mail.gmail.com> References: <499DDA4C.8090906@canterbury.ac.nz> <49AA7FEC.4090609@improva.dk> <49AB0CF5.6020900@canterbury.ac.nz> <49AB23AA.4060106@improva.dk> <49AB31C0.5000005@canterbury.ac.nz> <49AB4038.6070205@improva.dk> <49AB5BF4.6020203@canterbury.ac.nz> <49B01A8A.1010907@ronadam.com> <49B03758.6090703@canterbury.ac.nz> <49B09149.5020001@ronadam.com> <9bfc700a0903060211l64930745t21d130beb7b5a13f@mail.gmail.com> Message-ID: <49B11509.10408@improva.dk> Arnaud Delobelle wrote: > [...] > > This is what the toy python implementation that I posted last month > does. I think Jacob Holm did something more sophisticated that does > this as well (I haven't seen it!). > > [...] I do, and here is the most readable version so far. If you look very closely, you will notice a certain similarity to the version you posted. It turned out there was an easy fix for the 'closing' problem i mentioned, and a hack based on gi_frame that allowed me to use a generator function for the wrapper after all, so the _generator_iterator_wrapper below is heavily based on your code. The rest is there to implement the tree structure I mentioned, instead of just a simple stack. The speed of this version on simple long chains is still about 6 times slower than your code. This can be improved to about 1.5 times by inlining most of the function calls and eliminating the RootPath class, but the resulting code is close to unreadable and therefore not well suited to explain the idea. Most of the remaining slowdown compared to "for v in it: yield v" is due to attribute lookups and simple integer computations. In fact, I would be surprised if a C version of this wasn't going to be faster even for just a single "yield from". 
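[For contrast with the tree-based code that follows, the simple chain-based approach the thread keeps comparing against can be sketched as a trampoline that keeps the delegation chain on an explicit stack, so each next() does O(1) amortized work however deep the chain of delegating generators grows. This is illustrative code, not Arnaud's or Jacob's actual implementation; ``YieldFrom`` and ``flatten`` are made-up names, and the hard parts discussed in the thread -- send(), throw(), and StopIteration return values -- are omitted.]

```python
class YieldFrom:
    """Marker yielded by a generator to request delegation
    (the real proposal is syntax, not a wrapper class)."""
    def __init__(self, gen):
        self.gen = gen

def flatten(gen):
    """Trampoline driver: descend into sub-generators via an explicit
    stack instead of re-yielding values up through every level."""
    stack = [gen]
    while stack:
        try:
            item = next(stack[-1])
        except StopIteration:
            stack.pop()  # sub-generator exhausted, resume its caller
            continue
        if isinstance(item, YieldFrom):
            stack.append(item.gen)  # delegate to the sub-generator
        else:
            yield item

def inner():
    yield 1
    yield 2

def outer():
    yield 0
    yield YieldFrom(inner())
    yield 3

print(list(flatten(outer())))  # [0, 1, 2, 3]
```

The stack is exactly the "delegation chain" of the thread; the tree structure below generalizes it to the case where several generators delegate to the same parent.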
Anyway, here it is Jacob """Tree structure implementing the operations needed by yield-from. """ class Node(object): """A Node object represents a single node in the tree. """ __slots__ = ('parent', # the parent of this node (if any) 'chain', # the chain this node belongs to (if any) 'child', # the child of this node in the chain (if any) 'size', # The number of descendants of this node # (including the node itself) that are not # descendants of 'child', plus the number of # times any of these nodes have been accessed. ) def __init__(self): self.parent = None # We are not actually using this one, but # it is cheap to maintain. self.chain = None self.child = None self.size = 0 class Chain(object): """A Chain object represents a fragment of the path from some node towards the root. Chains are long-lived, and are used to make shortcuts in the tree enabling operations to take O(logN) rather than O(N) time. (And even enabling O(1) for certain usage patterns). """ __slots__ = ('top', # topmost node in the chain 'size', # The sum of sizes of all nodes in the chain 'parent', # the parent node of top in the tree 'rp_child', # the child of this chain in the current # root path ) def __init__(self, *nodes): """Construct a chain from the given nodes. The nodes must have a size>0 assigned before constructing the chain, and must have their 'chain' pointer set to None. The nodes are linked together using their 'child' pointers, and get their 'chain' pointers set to the new chain. First node in the list will be at the bottom of the chain, last node in the list becomes the value of the 'top' field. The size of the new chain is the sum of the sizes of the nodes. 
""" top = None size = 0 for node in nodes: assert node.chain is None assert node.child is None assert node.size > 0 node.chain = self node.child = top size += node.size top = node parent = None for node in reversed(nodes): node.parent = parent parent = node self.top = top self.size = size self.parent = None self.rp_child = None class RootPath(object): """A RootPath represents the whole path from the root to some given node. RootPaths need to be 'acquired' before use and 'released' after. Acquiring the path establishes the necessary down-links, ensures the path has length O(logN), and gives an efficient way of detecting and avoiding loops. """ __slots__ = ('base', # The node defining the path 'root', # The root node of the tree ) def __init__(self, base): """Construct the RootPath representing the path from base to its root. """ assert isinstance(base, Node) self.base = base self.root = None def acquire(self, sizedelta=0): """Find the root and take ownership of the tree containing self.base. Create a linked list from the root to base using the Chain.rp_child pointers. (Using self as sentinel in self.base.rp_child). Optionally adds 'sizedelta' to the size of the base node for the path and updates the rest of the sizes accordingly. If the tree was already marked (one of the chains had a non-None rp_child pointer), back out all changes and return None, else return the root node of the tree. """ assert self.root is None node = self.base assert node is not None rp_child = self while True: chain = node.chain assert chain is not None if chain.rp_child is not None: # Some other rootpath already owns this tree. 
Undo the # changes so far and raise a RuntimeError if rp_child is not self: while True: chain = rp_child rp_child = chain.rp_child chain.rp_child = None if rp_child is self: break node = rp_child.parent node.size -= sizedelta chain.size -= sizedelta return None assert chain.rp_child is None node.size += sizedelta chain.size += sizedelta chain.rp_child = rp_child rp_child = chain node = rp_child.parent if node is None: break self.root = root = rp_child.top assert root.chain is not None assert root.chain.parent is None # Tricky, this rebalancing is needed because cut_root # cannot do a full rebalancing fast enough without maintaining # a lot more information. We may actually have traversed a # (slightly) unbalanced path above. Rebalancing here makes # sure that it is balanced when we return, and the cost of # traversing the unbalanced path can be included in the cost # of rebalancing it. The amortized cost is still O(logn) per # operation as it should be. self._rebalance(self, False) return root def release(self): """Release the tree containing this RootPath. """ assert self.root is not None chain = self.root.chain assert chain is not None while chain is not self: child = chain.rp_child chain.rp_child = None chain = child self.root = None def cut_root(self): """Cut the link between the root and its child on the root path. Release the tree containing the root. If the root was the only node on the path, return None. Else return the new root. """ root = self.root assert root is not None chain = root.chain assert chain.parent is None assert root is chain.top child = chain.rp_child assert child is not None if child is self: # only one chain if root is self.base: # only one node, release tree chain.rp_child = None self.root = None return None else: # multiple chains if root is child.parent: # only one node from this chain on the path. 
root.size -= child.size chain.size -= child.size child.parent = None chain.rp_child = None self.root = newroot = child.top newroot.parent = None return newroot # Multiple nodes on the chain. Cut the topmost off and put it # into its own chain if necessary. (This is needed when the # node has other children) newroot = root.child assert newroot newroot.parent = None self.root = chain.top = newroot chain.size -= root.size root.child = None root.chain = None Chain(root) self._rebalance(self, True) return newroot def link(self, node): """Make the root of this tree a child of a node from another tree. Return the root of the resulting tree on succes, or None if the tree for the parent node couldn't be acquired. """ root = self.root assert root is not None chain = root.chain assert chain.parent is None assert isinstance(node, Node) rp = RootPath(node) newroot = rp.acquire(chain.size) if newroot is None: return None self.root = newroot node.chain.rp_child = chain root.parent = chain.parent = node self._rebalance(chain.rp_child, False) return newroot def _rebalance(self, stop, quick): # check and rebalance all the chains starting from the root. # If 'quick' is True, stop the first time no rebalancing took # place, else continue until the child is 'stop'. gpchain = None pchain = self.root.chain chain = pchain.rp_child while chain is not stop: parent = chain.parent if 2*(pchain.size-parent.size) <= chain.size: # Unbalanced chains. Move all ancestors to parent from # pchain into this chain. This may look like an # expensive operation, but the balancing criterion is # chosen such that every time a node is moved from one # chain to another, the sum of the sizes of everything # *after* the node in its chain is at least # doubled. This number is only decreased by cut_root # (where it may be reset to zero), so a node can move # between chains at most log_2(N) times before if # becomes the root in a cut_root. The amortized cost # of keeping the tree balanced is thus O(logN). 
The # purpose of the balancing is of course to keep the # height of the tree down. Any node that is balanced # according to this criterion has # # 2*(pchain.size-self.size) > 2*(pchain.size-self.parent.size) # > self.size # # and so # # pchain.size > 1.5*self.size # # Therefore, the number of chains in any balanced # RootPath is at most log_1.5(N), and so the cost per # operation is O(logN). # Compute size of elements to move and set their 'chain' # pointers. p = pchain.top movedsize = p.size p.chain = chain while p is not parent: p = p.child movedsize += p.size p.chain = chain # update sizes parent.size -= chain.size chain.size = pchain.size pchain.size -= movedsize parent.size += pchain.size # update child, top and parent links pchain.top, parent.child, chain.top \ = parent.child, chain.top, pchain.top chain.parent = pchain.parent pchain.parent = parent # update rp_child links pchain.rp_child = None # pchain is no longer on the path if gpchain is not None: assert gpchain is not None assert gpchain.rp_child is pchain gpchain.rp_child = chain assert (pchain.top is None)==(pchain.size==0) # if pchain.top is None: # # # pchain has become empty. If coding this in C, # # remember to free the memory. elif quick: break else: gpchain = pchain pchain = chain chain = pchain.rp_child ############################################################################### """Yield-from implementation """ import sys # _getframe, used only in an assert import types # GeneratorType import functools # wraps class _iterator_node(Node): # Wrapper for turning an unknown iterator into a node in the tree. __slots__ = ('iterator', # the wrapped iterator ) locals().update((k, Node.__dict__[k]) for k in ('parent', 'chain', 'child', 'size')) def __init__(self, iterator): self.iterator = iterator Node.__init__(self) self.size = 1 Chain(self) class _generator_iterator_wrapper(_iterator_node): # Wrapper for turning a generator using "yield from_(expr)" into a # node in the tree. 
__slots__ = () locals().update((k, _iterator_node.__dict__[k]) for k in ('parent', 'chain', 'child', 'size', 'iterator')) def __new__(cls, iterator): self = _iterator_node.__new__(cls) _iterator_node.__init__(self, iterator) return self.__iter__() # we don't hold on to a reference to # this generator, because a) we don't # need to, and b) when the last # user-code reference to it goes away, # the generator is automatically closed # and we get a chance to clean up the # rest of the cycles created by this # structure. def __iter__(self): val = exc = None rp = RootPath(self) while True: try: if rp.acquire(1) is None: raise ValueError('generator already executing') while True: try: gen = rp.root.iterator try: if exc is not None: throw = getattr(gen, 'throw', None) try: if throw is None: raise exc ret = throw(exc) finally: del throw exc = None elif val is None: ret = gen.next() else: ret = gen.send(val) except: close = getattr(gen, 'close', None) try: if close is not None: close() raise finally: del close finally: del gen except Exception, e: if rp.cut_root() is None: raise if isinstance(e, StopIteration): val, exc = getattr(e, 'val', None), None else: val, exc = None, e else: if type(ret) is from_: if rp.link(ret.node) is not None: val = None else: exc = ValueError('generator already executing') continue break finally: if rp.root is not None: rp.release() try: val = yield ret except Exception, e: exc = e class from_(object): """This class is used together with the uses_from decorator to simulate the proposed 'yield from' statement using existing python features. Use: @uses_from def mygenerator(): ... result = yield _from(expr) ... raise StopIteration(result) To get the equivalent effect of the proposed: def mygenerator(): ... result = yield from expr ... return result Any use other than directly in a yield expression within a generator function decorated with 'uses_from' is unsupported, and could eat your harddrive for all I care. 
Unsupported uses include, but are not limited to: subclassing, calling methods, reading or writing attributes, storing in a variable, and passing as argument to a builtin or other function. """ __slots__ = ('node',) def __init__(self, iterable): # get the code object for the wrapper, for comparison func_code = _generator_iterator_wrapper.__iter__.func_code # verify that from_ is only used in a wrapped generator function if __debug__: frame = sys._getframe(2) assert frame is not None and frame.f_code is func_code, ( "from_ called from outside a @uses_from generator function.") if type(iterable) is types.GeneratorType: frame = iterable.gi_frame if frame is not None and frame.f_code is func_code: # this is a wrapped generator, extract the node for it # by peeking at it's locals. self.node = frame.f_locals['self'] else: # an unwrapped generator, create a new node. self.node = _iterator_node(iterable) else: # some other iterable, create a new node. self.node = _iterator_node(iter(iterable)) def uses_from(func): """Decorator for generator functions/methods that use "yield from_(expr)" This class is used together with the from_ class to simulate the proposed 'yield from' statement using existing python features. Use: @uses_from def mygenerator(): ... result = yield _from(expr) ... raise StopIteration(result) To get the equivalent effect of the proposed: def mygenerator(): ... result = yield from expr ... return result Any other use than as a decorator for a normal generator function/method is at your own risk. I wouldn't do it if I were you. Seriously. 
""" @functools.wraps(func) def wrapper(*args, **kwargs): return _generator_iterator_wrapper(func(*args, **kwargs)) return wrapper From skip at pobox.com Fri Mar 6 14:55:00 2009 From: skip at pobox.com (skip at pobox.com) Date: Fri, 6 Mar 2009 07:55:00 -0600 Subject: [Python-ideas] Cross-platform file locking, PID files , and the "daemon" PEP In-Reply-To: <87fxhr544c.fsf@benfinney.id.au> References: <87fxhr544c.fsf@benfinney.id.au> Message-ID: <18865.11060.602298.698687@montanaro.dyndns.org> Ben> For those people who have asked about the status of this: Yes, I'm Ben> currently working with Skip on the ?lockfile? package so that it Ben> can be used for the ?daemon? implementation. Indeed. I should have my Mercurial repository in a more globally visible place later today so Ben and anyone else interested can bash bits. Skip From eric at trueblade.com Mon Mar 9 13:20:51 2009 From: eric at trueblade.com (Eric Smith) Date: Mon, 09 Mar 2009 08:20:51 -0400 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <499AAF4E.3020506@trueblade.com> References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> <4999D184.3080105@trueblade.com> <499AAF4E.3020506@trueblade.com> Message-ID: <49B509A3.1080404@trueblade.com> I've added a patch to http://bugs.python.org/issue5237 that implements the basic '{}' functionality in str.format. Read the note in the issue; this patch is not ready for production. But it will let you play with the feature. I like it, mainly because it's so much quicker to type '{}' than '{0}' because you don't have to shift-unshift-shift (on my US English keyboard). If we decide to add this feature I hope I can finish it before PyCon, or worst case finish it during the sprints. 
From tav at espians.com Mon Mar 9 14:04:23 2009
From: tav at espians.com (tav)
Date: Mon, 9 Mar 2009 13:04:23 +0000
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
Message-ID:

Hey all,

I've come up with a way to do Ruby-style blocks in what I feel to be a Pythonic way:

    using employees.select do (employee):
        if employee.salary > developer.salary:
            fireEmployee(employee)
        else:
            extendContract(employee)

I originally overloaded the ``with`` keyword, but on Guido's guidance and responses from many others, switched to the ``using`` keyword.

I explain the idea in detail in this blog article: http://tav.espians.com/ruby-style-blocks-in-python.html

It covers everything from why these are useful to a proposal of how the new ``do`` statement and __do__ function could work.

Please note that the intention of this proposal is not to encourage iteration with blocks -- Python already does this rather elegantly -- but blocks are very useful for more than just iteration as various Ruby applications have shown.

Let me know what you think. My apologies if I've missed something obvious. Thanks!

--
love, tav

plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369
http://tav.espians.com | http://twitter.com/tav | skype:tavespian

From sturla at molden.no Mon Mar 9 16:24:23 2009
From: sturla at molden.no (Sturla Molden)
Date: Mon, 09 Mar 2009 16:24:23 +0100
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To:
References:
Message-ID: <49B534A7.2010001@molden.no>

On 3/9/2009 2:04 PM, tav wrote:
> Hey all,
>
> I've come up with a way to do Ruby-style blocks in what I feel to be a
> Pythonic way:
>
>     using employees.select do (employee):
>         if employee.salary > developer.salary:
>             fireEmployee(employee)
>         else:
>             extendContract(employee)

I believe this is just an extension to the lambda keyword. If lambdas could define a block, not just a statement, this would e.g.
be

    employees.select( lambda employee:
        if employee.salary > developer.salary:
            fireEmployee(employee)
        else:
            extendContract(employee)
    )

or

    tmp = lambda employee:
        if employee.salary > developer.salary:
            fireEmployee(employee)
        else:
            extendContract(employee)

    employees.select(tmp)

I see no reason for introducing two new keywords to do this, as you are really just enhancing the current lambda keyword.

On the other hand, turning blocks into anonymous functions would be very useful for functional programming. As such, I like your suggestion.

This also has a great potential for abuse (as in writing unreadable code), just consider how anonymous classes are used in Java's GUI toolkits. I really don't want to see

    self.Bind(wx.BUTTON, lambda: evt, mybutton)

in wxPython code. But Java programmers coming to Python would jump to this, as they have been brainwashed to use anonymous classes for everything (no pun intended).

Sturla Molden

From tav at espians.com Mon Mar 9 16:33:30 2009
From: tav at espians.com (tav)
Date: Mon, 9 Mar 2009 15:33:30 +0000
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To: <49B534A7.2010001@molden.no>
References: <49B534A7.2010001@molden.no>
Message-ID:

Hey Sturla,

> I believe this is just an extension to the lambda keyword. If lambdas could
> define a block, not just a statement, this would e.g. be

Multi-line lambdas would be nice, but I struggle to find a way to do so in a Pythonic manner. See: http://unlimitednovelty.com/2009/03/indentation-sensitivity-post-mortem.html

It would be nice to find a way though...

>     tmp = lambda employee:
>         if employee.salary > developer.salary:
>             fireEmployee(employee)
>         else:
>             extendContract(employee)
>
>     employees.select(tmp)

You might as well just use ``def`` above...

> I see no reason for introducing two new keywords to do this, as you are
> really just enhancing the current lambda keyword.

I agree that two new keywords is a bit much.
I tried to re-use ``with`` initially -- but I guess people would be confused by the conflicting semantics. > On the other hand, turning blocks into anonymous functions would be very > useful for functional programming. As such, I like your suggestion. Thanks =) -- love, tav plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 http://tav.espians.com | http://twitter.com/tav | skype:tavespian From aahz at pythoncraft.com Mon Mar 9 16:45:05 2009 From: aahz at pythoncraft.com (Aahz) Date: Mon, 9 Mar 2009 08:45:05 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <49B534A7.2010001@molden.no> References: <49B534A7.2010001@molden.no> Message-ID: <20090309154505.GA18115@panix.com> On Mon, Mar 09, 2009, Sturla Molden wrote: > > I see no reason for introducing two new keywords to do this, as you are > really just enhancing the current lambda keyword. > > On the other hand, turning blocks into anonymous functions would be very > useful for functional programming. As such, I like your suggestion. There's a substantial minority (possibly even a majority) in the Python community that abhors functional programming. Even among those who like functional programming, there's a substantial population that dislikes extensive use of anonymous functions. The trick to getting features for functional programming accepted is to make them look as Pythonic as possible. Right now, I'm somewhere between -0 and -1 on this proposal, because all the motivation I see looks like it's perfectly satisfied by using ``def`` instead of lambda. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "All problems in computer science can be solved by another level of indirection." 
--Butler Lampson

From tav at espians.com Mon Mar 9 16:53:35 2009
From: tav at espians.com (tav)
Date: Mon, 9 Mar 2009 15:53:35 +0000
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To: <20090309154505.GA18115@panix.com>
References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com>
Message-ID:

Hey Aahz,

> The trick to getting features for functional programming accepted
> is to make them look as Pythonic as possible.

I spent considerable effort to make the using/do statement as Pythonic as possible.

Could you please elaborate on what you don't like about it?

Please note that the lambda thing was Sturla's follow-up comment...

--
Thanks, tav

plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369
http://tav.espians.com | http://twitter.com/tav | skype:tavespian

From sturla at molden.no Mon Mar 9 17:09:02 2009
From: sturla at molden.no (Sturla Molden)
Date: Mon, 09 Mar 2009 17:09:02 +0100
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To:
References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com>
Message-ID: <49B53F1E.30009@molden.no>

On 3/9/2009 4:53 PM, tav wrote:
> I spent considerable effort to make the using/do statement as Pythonic
> as possible.
>
> Could you please elaborate on what you don't like about it?
>
> Please note that the lambda thing was Sturla's follow-up comment...

If I can elaborate as well. There are three things I don't like:

1. You are introducing two new keywords. Solving problems by constantly adding new syntax is how programming languages are designed in Redmond, WA. I don't exactly know what Pythonic means, but bloating the syntax is not.

2. Most of this is covered by 'def'. Python allows functions to be nested. Python does support closures.

3. Anonymous classes in Java have more cases for abuse than use. Just see how they are abused to write callbacks/handlers. They are a notorious source of unreadable and unmaintainable spaghetti code.

S.M.
From aahz at pythoncraft.com Mon Mar 9 17:16:06 2009
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 9 Mar 2009 09:16:06 -0700
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To:
References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com>
Message-ID: <20090309161606.GA19375@panix.com>

On Mon, Mar 09, 2009, tav wrote:
>
> Hey Aahz,
>
>> The trick to getting features for functional programming accepted
>> is to make them look as Pythonic as possible.
>
> I spent considerable effort to make the using/do statement as Pythonic
> as possible.
>
> Could you please elaborate on what you don't like about it?

That was a general point. The specific point was what you cut: your proposal seems to offer little advantage over a ``def``. You need to justify yourself more thoroughly.

Also, because this list is archived, you should probably include your entire argument here rather than referring to an external web page.

--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

"All problems in computer science can be solved by another level of indirection."  --Butler Lampson

From bronger at physik.rwth-aachen.de Mon Mar 09 17:22:18 2009
From: bronger at physik.rwth-aachen.de (Torsten Bronger)
Date: Mon, 09 Mar 2009 17:22:18 +0100
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com>
Message-ID: <87y6veeked.fsf@physik.rwth-aachen.de>

Hallöchen!

tav writes:

> Hey Aahz,
>
>> The trick to getting features for functional programming accepted
>> is to make them look as Pythonic as possible.
>
> I spent considerable effort to make the using/do statement as
> Pythonic as possible.
>
> Could you please elaborate on what you don't like about it?

Two things for me: The "using ..." is not well-readable. The "(employee)" sits clumsily at the end of the line without any connection to the rest. Even worse, the "do" anticipates the ":", which in Python already means "do".
And secondly, I'm not comfortable with the fact that the return value is the first (or last? or all?) expression the interpreter stumbles over. Because Python distinguishes between expressions and statements, you have to look twice to see what actually happens in the block. In other words, expressions work differently in these almost-functions, so we end up with two kinds of functions that have different semantic rules. This makes reading more difficult, as well as code-reuse.

Tschö, Torsten.

--
Torsten Bronger, aquisgrana, europa vetus
Jabber ID: torsten.bronger at jabber.rwth-aachen.de

From sturla at molden.no Mon Mar 9 17:24:36 2009
From: sturla at molden.no (Sturla Molden)
Date: Mon, 09 Mar 2009 17:24:36 +0100
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To: <20090309154505.GA18115@panix.com>
References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com>
Message-ID: <49B542C4.6090702@molden.no>

On 3/9/2009 4:45 PM, Aahz wrote:
> There's a substantial minority (possibly even a majority) in the Python
> community that abhors functional programming.

There is a substantial minority that uses Python for scientific computing (cf. numpy and scipy, the Hubble space telescope, the NEURON simulator, Sage, etc.) For numerical computing, functional programming often leads to code that is shorter and easier to read. That is, equations look like functions, not like classes.

S.M.
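[Editor's note: to make the contrast Sturla is gesturing at concrete, here is a hedged sketch; the Gaussian density is an arbitrary example equation, not something from the thread:]

```python
import math

# Functional style: the equation reads like the formula it implements.
def gaussian(x, mu=0.0, sigma=1.0):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Class-based style: the same equation wrapped in object boilerplate.
class Gaussian:
    def __init__(self, mu=0.0, sigma=1.0):
        self.mu, self.sigma = mu, sigma

    def __call__(self, x):
        return gaussian(x, self.mu, self.sigma)

# Both spellings compute the same value; one of them looks like the math.
assert abs(gaussian(0.0) - Gaussian()(0.0)) < 1e-12
```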
From aahz at pythoncraft.com Mon Mar 9 17:29:43 2009
From: aahz at pythoncraft.com (Aahz)
Date: Mon, 9 Mar 2009 09:29:43 -0700
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To: <49B542C4.6090702@molden.no>
References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> <49B542C4.6090702@molden.no>
Message-ID: <20090309162943.GB19375@panix.com>

On Mon, Mar 09, 2009, Sturla Molden wrote:
> On 3/9/2009 4:45 PM, Aahz wrote:
>>
>> There's a substantial minority (possibly even a majority) in the Python
>> community that abhors functional programming.
>
> There are a substantial minority that use Python for scientific
> computing (cf. numpy and scipy, the Hubble space telescope, the NEURON
> simulator, Sage, etc.) For numerical computing, functional programming
> often leads to code that are shorter and easier to read. That is,
> equations look like functions, not like classes.

Yes, I know; I'm just pointing out that Python is not a pure functional language and that there's a tension within the community about how far Python should go in the functional direction.

--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/

"All problems in computer science can be solved by another level of indirection."  --Butler Lampson

From guido at python.org Mon Mar 9 17:39:30 2009
From: guido at python.org (Guido van Rossum)
Date: Mon, 9 Mar 2009 09:39:30 -0700
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To:
References:
Message-ID:

On Mon, Mar 9, 2009 at 6:04 AM, tav wrote:
> I've come up with a way to do Ruby-style blocks in what I feel to be a
> Pythonic way:
>
>     using employees.select do (employee):
>         if employee.salary > developer.salary:
>             fireEmployee(employee)
>         else:
>             extendContract(employee)

Sounds like you might as well write a decorator named @using:

    @using(employees.select)
    def _(employee):
        if employee.salary > developer.salary:
            fireEmployee(employee)
        else:
            extendContract(employee)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From dreamingforward at gmail.com Mon Mar 9 17:43:03 2009
From: dreamingforward at gmail.com (average)
Date: Mon, 9 Mar 2009 09:43:03 -0700
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
Message-ID: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com>

While there are several complaining that all this can be done with a def, there's a critical distinction being overlooked. Passing around code blocks is a very different style of programming than Python or most languages like it have ever experimented with. The programming art itself hasn't really even explored the different range of thinking this style of programming opens up. Whereas most function definitions are verb-like, this would be a noun-like definition. Akin perhaps to the difference between a hormone in the body and a neurotransmitter, respectively. Setting it off with a new keyword, or expanding the use of lambda is really a mandatory way of signifying this INTENT.

marcos

> I've come up with a way to do Ruby-style blocks in what I feel to be a
> Pythonic way:
>
>     using employees.select do (employee):
>         if employee.salary > developer.salary:
>             fireEmployee(employee)
>         else:
>             extendContract(employee)
>
> I originally overloaded the ``with`` keyword, but on Guido's guidance
> and responses from many others, switched to the ``using`` keyword.
>
> It covers everything from why these are useful to a proposal of how
> the new ``do`` statement and __do__ function could work.
From guido at python.org Mon Mar 9 17:45:07 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 9 Mar 2009 09:45:07 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> Message-ID: On Mon, Mar 9, 2009 at 9:43 AM, average wrote: > While there are several complaining that all this can be done with a > def, there's an critical distinction being overlooked. ?Passing around > code blocks is a very different style of programming that Python or > most languages like it have ever experimented with. ?The programming > art itself hasn't really even explored the different range of thinking > this style of programming opens up. ?Where as most function > definitions are verb-like, this would be a noun-like definition. ?Akin > perhaps to the difference between a hormone in the body and a > neurotransmitter, respectively.Setting it off with a new keyword, or > expanding the use of lambda is really a mandatory way of signifying > this INTENT. Your claim that this is somehow something new seems to be overlooking Lisp and Smalltalk, as well as Ruby which was mentioned in the quoted blog post. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From steve at pearwood.info Mon Mar 9 17:46:46 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 10 Mar 2009 03:46:46 +1100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <49B542C4.6090702@molden.no> References: <20090309154505.GA18115@panix.com> <49B542C4.6090702@molden.no> Message-ID: <200903100346.46452.steve@pearwood.info> On Tue, 10 Mar 2009 03:24:36 am Sturla Molden wrote: > On 3/9/2009 4:45 PM, Aahz wrote: > > There's a substantial minority (possibly even a majority) in the > > Python community that abhors functional programming. > > There are a substantial minority that use Python for scientific > computing (cf. 
numpy and scipy, the Hubble space telescope, the > NEURON simulator, Sage, etc.) For numerical computing, functional > programming often leads to code that are shorter and easier to read. > That is, equations look like functions, not like classes. I don't understand what you mean. As far as I can see, equations never look like classes in Python, regardless of whether you are using functional programming, object-oriented programming or procedural programming. Can you give me an example of what you mean? Secondly, the proposal relates to *anonymous* functions, which is a small part of functional programming. Perhaps they are necessary in a purely functional programming language, but Python is not such a language. Python never *requires* anonymous functions, they are a convenience and that's all. -- Steven D'Aprano From steve at pearwood.info Mon Mar 9 18:03:04 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 10 Mar 2009 04:03:04 +1100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> Message-ID: <200903100403.04733.steve@pearwood.info> On Tue, 10 Mar 2009 03:43:03 am average wrote: > While there are several complaining that all this can be done with a > def, there's an critical distinction being overlooked. Passing > around code blocks is a very different style of programming that > Python or most languages like it have ever experimented with. The > programming art itself hasn't really even explored the different > range of thinking this style of programming opens up. Where as most > function > definitions are verb-like, this would be a noun-like definition. > Akin perhaps to the difference between a hormone in the body and a > neurotransmitter, respectively.Setting it off with a new keyword, or > expanding the use of lambda is really a mandatory way of signifying > this INTENT. 
Marcos,

I'm afraid that I don't understand your analogy, or your argument. An anonymous code block is just like a named function, except it doesn't have a name. Can you explain why:

    func(named_function)

is radically different from:

    func(multi-line-code-block-without-the-name)

please?

We can already do this in Python, using functions made up of a single expression:

    func(lambda: expr)

I use this frequently, because it is sometimes convenient, but there is nothing I can do with a lambda that I can't do with a named function. I don't see that introducing multi-lined lambdas will change that.

I'm trying to keep an open-mind here, but I also don't understand your analogy. In what way are named functions like neurotransmitters, or hormones? Which one is supposed to be verb-like and which one is noun-like? What does that even mean?

--
Steven D'Aprano

From grosser.meister.morti at gmx.net Mon Mar 9 18:11:20 2009
From: grosser.meister.morti at gmx.net (Mathias Panzenböck)
Date: Mon, 09 Mar 2009 18:11:20 +0100
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To:
References:
Message-ID: <49B54DB8.3090101@gmx.net>

tav wrote:
> Hey all,
>
> I've come up with a way to do Ruby-style blocks in what I feel to be a
> Pythonic way:
>
>     using employees.select do (employee):
>         if employee.salary > developer.salary:
>             fireEmployee(employee)
>         else:
>             extendContract(employee)
>

Maybe if you come up with an example that isn't written with already existing python syntax as easy (or even more easily):

    for employee in employees:  # or employees.select() if you like
        if employee.salary > developer.salary:
            fireEmployee(employee)
        else:
            extendContract(employee)

-panzi

From tav at espians.com Mon Mar 9 18:15:13 2009
From: tav at espians.com (tav)
Date: Mon, 9 Mar 2009 17:15:13 +0000
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To: <200903100403.04733.steve@pearwood.info>
References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com>
<200903100403.04733.steve@pearwood.info>
Message-ID:

Hey Steven,

> Can you explain why:
>     func(named_function)
> is radically different from:
>     func(multi-line-code-block-without-the-name)

Hmz, the intention isn't to support multi-line lambdas. It's to make passing in anonymous functions easier.

For precedent, let's take a look at decorators. Fundamentally, decorators save a user nothing more than a single line of code. Why do @foo, when you could just do:

    func = foo(func)

? But saving developers that extra line of typing has obviously been useful -- you can find decorators used pretty heavily in many of the major Python frameworks nowadays... By easing up some of the hassle, we can encourage certain forms of development.

--
love, tav

plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369
http://tav.espians.com | http://twitter.com/tav | skype:tavespian

From guido at python.org Mon Mar 9 18:26:14 2009
From: guido at python.org (Guido van Rossum)
Date: Mon, 9 Mar 2009 10:26:14 -0700
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To:
References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info>
Message-ID:

On Mon, Mar 9, 2009 at 10:15 AM, tav wrote:
>> Can you explain why:
>>     func(named_function)
>> is radically different from:
>>     func(multi-line-code-block-without-the-name)
>
> Hmz, the intention isn't to support multi-line lambdas. It's to make
> passing in anonymous functions easier.

Well, that works only as long as there is only a single anonymous function to pass in and it's the last argument. Plus it doesn't work with an existing function that takes a function argument -- your function (if I understand your proposed __do__ implementation correctly) must really be a generator.
If it was about saving a line of code it would have been booed out of the room. (Especially since the line count is actually the same with or without using the decorator syntax!) The big improvement that decorators offer is to move the "decoration" from the end of the function body, where it is easily missed, to the front of the declaration, where it changes the emphasis for the reader. I don't see a similar advantage in your example; it looks more like "Ruby-envy" to me.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From dreamingforward at gmail.com Mon Mar 9 18:43:43 2009
From: dreamingforward at gmail.com (average)
Date: Mon, 9 Mar 2009 10:43:43 -0700
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To:
References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com>
Message-ID: <913f9f570903091043t70b4ec9dye79ad66433a6fb7f@mail.gmail.com>

>> [RE: "using" keyword] Setting it off with a new keyword, or
>> expanding the use of lambda is really a mandatory way of signifying
>> this INTENT.
>
> Your claim that this is somehow something new seems to be overlooking
> Lisp and Smalltalk, as well as Ruby which was mentioned in the quoted
> blog post.

Acknowledged. However, the power of such a language as Lisp is under-appreciated (as I think all who know it can agree), and the problem with it (and the challenge of language design in general) is how best to *organize* that power; i.e. that generality. In my view, Lisp is like assembly language for the mind. It's powerful, but not easily organized or visualized into a fashion where one can see and evaluate the building up of and into higher-level constructs. My point was really about how the programming art *itself* hasn't fully explored this concept to even be able to *evaluate* the power and usefulness of employing techniques such as code blocks.
To me, Python and Ruby are both exciting and interesting examples of how language design is evolving to find ways to express and evolve that power. In my mind, there is no doubt that languages will have to find elegant ways to express that power. What's cool about Python and Ruby is that it's taking that vast general space of the "mind's assembly language" and distilling it down into nicely manageable and elegant chunks of language syntax. The concept of distinct, passable code blocks is a nice example of that compression, one that certainly has correspondence within our biology.

Thanks for the dialog though...

marcos

From steve at pearwood.info Mon Mar 9 18:50:56 2009
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 10 Mar 2009 04:50:56 +1100
Subject: [Python-ideas] Ruby-style Blocks in Python Idea
In-Reply-To:
References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info>
Message-ID: <200903100450.56903.steve@pearwood.info>

On Tue, 10 Mar 2009 04:15:13 am you wrote:
> Hey Steven,
>
>> Can you explain why:
>>     func(named_function)
>> is radically different from:
>>     func(multi-line-code-block-without-the-name)
>
> Hmz, the intention isn't to support multi-line lambdas. It's to make
> passing in anonymous functions easier.

Lambdas are single-line (technically, single-statement) anonymous functions, and it's already easy to pass them in:

    caller(lambda args: statement)

A multi-line lambda (technically, multi-statement) would also be an anonymous function. Your syntax:

    using caller (args) do:
        multiple
        lines
        making an anonymous function

seems to me to be defining a multi-line lambda. The three lines of the indented block are exactly equivalent to the statement of a lambda. What is the difference that you see?
--
Steven D'Aprano

From tav at espians.com Mon Mar 9 19:47:18 2009
From: tav at espians.com (tav)
Date: Mon, 9 Mar 2009 18:47:18 +0000
Subject: [Python-ideas] Ruby-style Blocks in Python Idea [with examples]
Message-ID:

Dear all,

Here's another stab at making a case for it. I'll avoid referring to Ruby this time round -- I was merely using it as an example of where this approach has been successful. Believe me, there's no Ruby-envy here ;p

The motivation:

1. Having to name a one-off function adds additional cognitive overload to a developer. It doesn't make the code any cleaner and by taking away the burden, we'd have happier developers and cleaner code.

2. This approach is more descriptive and in line with the code flow. With blocks the first line says "I'm about to define a function for use with X" instead of the existing way which says "I'm defining a function. Now I'm using that function with X."

3. DSLs -- whether we like them or not, they are in mainstream use. Python already has beautiful syntax. We should be leveraging that for DSLs instead of forcing framework developers to create their own ugly/buggy mini-DSLs. This will enable that.

The proposed syntax:

    using EXPR do PARAM_LIST:
        FUNCTION_BODY

Examples:

# Django/App Engine Query

Frameworks like Django or App Engine define DSLs to enable easy querying of datastores by users. Wouldn't it be better if this could be done in pure Python syntax?

Compare the current Django:

    q = Entry.objects.filter(headline__startswith="What").filter(pub_date__lte=datetime.now())

with a hypothetical:

    using Entry.filter do (entry):
        if entry.headline.startswith('What') and entry.pub_date <= datetime.now():
            return entry

Wouldn't the latter be easier for a developer to read/maintain?
Let's compare this App Engine:

    composer = "Lennon, John"
    query = GqlQuery("SELECT * FROM Song WHERE composer = :1", composer)

with:

    composer = "Lennon, John"
    using Song.query do (item):
        if item.composer == composer:
            return item

Again, being able to do it in Python syntax will save developers the hassles of having to learn non-Python DSLs.

# Event-driven Programming

Right now, event-driven programming like it's done in Twisted is rather painful for many developers. It's filled with callbacks and the order in which code is written is completely inverted as far as the average developer is concerned.

Let's take Eventlet -- a nice coroutines-based networking library in Python. Their example webcrawler.py currently does:

    urls = ["http://www.google.com/intl/en_ALL/images/logo.gif",
            "http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif"]

    def fetch(url):
        print "%s fetching %s" % (time.asctime(), url)
        httpc.get(url)
        print "%s fetched %s" % (time.asctime(), url)

    pool = coros.CoroutinePool(max_size=4)
    waiters = []

    for url in urls:
        waiters.append(pool.execute(fetch, url))

Wouldn't it be nicer to do this instead:

    pool = coros.CoroutinePool(max_size=4)

    for url in urls:
        using pool.execute do:
            print "%s fetching %s" % (time.asctime(), url)
            httpc.get(url)
            print "%s fetched %s" % (time.asctime(), url)

I'd argue that it is -- but then I have bias =)

# SCons

SCons is a make-esque build tool. In the SConstruct (makefile) for Google Chrome, we find:

    def WantSystemLib(env, lib):
        if lib not in env['all_system_libs']:
            env['all_system_libs'].append(lib)
        return (lib in env['req_system_libs'])
    root_env.AddMethod(WantSystemLib, "WantSystemLib")

Which we could hypothetically do as:

    with root_env.WantSystemLib do (env, lib):
        if lib not in env['all_system_libs']:
            env['all_system_libs'].append(lib)
        return (lib in env['req_system_libs'])

As someone who's used both make and SCons, I found SCons terribly verbose and painful to use.
By using the proposed do statement, SCons could be made extremely pleasant!

# Webapp Configuration

Configuration in web applications is generally a real pain:

    application = webapp([('/profile', ProfileHandler), ('/', MainHandler)],
                         debug=True)
    run(application)

Compare to:

    using webapp.runner do (config, routes):
        routes['/profiles'] = ProfileHandler
        routes['/'] = MainHandler
        config.debug = True

I think the latter is more readable and maintainable.

Please let me know if more examples would help... I really do believe that a block syntax would make developers more productive and lead to cleaner code.

--
love, tav

plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369
http://tav.espians.com | http://twitter.com/tav | skype:tavespian

From guido at python.org Mon Mar 9 20:10:11 2009
From: guido at python.org (Guido van Rossum)
Date: Mon, 9 Mar 2009 12:10:11 -0700
Subject: [Python-ideas] Ruby-style Blocks in Python Idea [with examples]
In-Reply-To:
References:
Message-ID:

On Mon, Mar 9, 2009 at 11:47 AM, tav wrote:
> Here's another stab at making a case for it. I'll avoid referring to
> Ruby this time round -- I was merely using it as an example of where
> this approach has been successful. Believe me, there's no Ruby-envy
> here ;p
>
> The motivation:
>
> 1. Having to name a one-off function adds additional cognitive
> overload to a developer. It doesn't make the code any cleaner and by
> taking away the burden, we'd have happier developers and cleaner code.

I showed an example using "def _(...)".

> 2. This approach is more descriptive and in line with the code flow.
> With blocks the first line says "I'm about to define a function for
> use with X" instead of the existing way which says "I'm defining a
> function. Now I'm using that function with X."

Marginal. The decorator name could clarify this too.

> 3. DSLs -- whether we like them or not, they are in mainstream use.
> Python already has beautiful syntax.
We should be leveraging that for > DSLs instead of forcing framework developers to create their own > ugly/buggy mini-DSLs. This will enable that. This is quite the non-sequitur. Given that what you propose is trivially done using a decorator, what's ugly/buggy about the existing approach? > The proposed syntax: > > ?using EXPR do PARAM_LIST: > ? ?FUNCTION_BODY > > Examples: > > # Django/App Engine Query > > Frameworks like Django or App Engine define DSLs to enable easy > querying of datastores by users. Wouldn't it better if this could be > done in pure Python syntax? > > Compare the current Django: > > ?q = Entry.objects.filter(headline__startswith="What").filter(pub_date__lte=datetime.now()) > > with a hypothetical: > > ?using Entry.filter do (entry): > ? ? ?if entry.headline.startswith('What') and entry.pub_date <= datetime.now(): > ? ? ? ? ?return entry Hmm... where does the 'return' return to? The "current Django" has an assignment to q. How do you set q in this example? > Wouldn't the latter be easier for a developer to read/maintain? But it's not the same. If they had wanted you to be able to write that they could easily have provided an iterator (and actually of course they do -- it's the iterator over all records). But the (admittedly awkward) current syntax *executes the query in the database engine*. There's no way (without proposing a lot of other changes to Python anyway) to translate the Python code in the body of your example to SQL, because Python (the language, anyway -- not all implementations support access to the bytecode) doesn't let you recover the source code of a block at run time. > Let's compare this App Engine: > > ?composer = "Lennon, John" > ?query = GqlQuery("SELECT * FROM Song WHERE composer = :1", composer) > > with: > > ?composer = "Lennon, John" > ?using Song.query do (item): > ? ? ?if item.composer == composer: > ? ? ? ? 
return item > > Again, being able to do it in Python syntax will save developers the > hassles of having to learn non-Python DSLs. Again, it's broken the same way. > # Event-driven Programming > > Right now, event-driven programming like it's done in Twisted is > rather painful for many developers. It's filled with callbacks and the > order in which code is written is completely inverted as far as the > average developer is concerned. > > Let's take Eventlet -- a nice coroutines-based networking library in > Python. Their example webcrawler.py currently does: > > urls = ["http://www.google.com/intl/en_ALL/images/logo.gif", > "http://us.i1.yimg.com/us.yimg.com/i/ww/beta/y3.gif"] > > def fetch(url): > print "%s fetching %s" % (time.asctime(), url) > httpc.get(url) > print "%s fetched %s" % (time.asctime(), url) > > pool = coros.CoroutinePool(max_size=4) > waiters = [] > > for url in urls: > waiters.append(pool.execute(fetch, url)) > > Wouldn't it be nicer to do this instead: > > pool = coros.CoroutinePool(max_size=4) > > for url in urls: > using pool.execute do: > print "%s fetching %s" % (time.asctime(), url) > httpc.get(url) > print "%s fetched %s" % (time.asctime(), url) > > I'd argue that it is -- but then I have bias =) It's also broken -- unless you also have some kind of alternative semantics in mind that does *not* map to existing Python functions and scopes, all callbacks will reference the last url in the list. Compare this classic stumbling block: addN = [(lambda x: x+i) for i in range(10)] add1 = addN[1] print add1(10) # prints 19 > # SCons > > SCons is a make-esque build tool. In the SConstruct (makefile) for > Google Chrome, we find: > > def WantSystemLib(env, lib): > if lib not in env['all_system_libs']: > env['all_system_libs'].append(lib) >
return (lib in env['req_system_libs']) > root_env.AddMethod(WantSystemLib, "WantSystemLib") > > Which we could hypothetically do as: > > with root_env.WantSystemLib do (env, lib): s/with/using/ > if lib not in env['all_system_libs']: > env['all_system_libs'].append(lib) > return (lib in env['req_system_libs']) That's just a matter of API design. They could easily have provided a decorator to register the callback. The name of the decorated function could even serve as the key. > As someone who's used both make and SCons, I found SCons terribly > verbose and painful to use. By using the proposed do statement, SCons > could be made extremely pleasant! Quite the exaggeration. > # Webapp Configuration > > Configuration in web applications is generally a real pain: > > application = webapp([('/profile', ProfileHandler), ('/', MainHandler)], > debug=True) > run(application) > > Compare to: > > using webapp.runner do (config, routes): > routes['/profiles'] = ProfileHandler > routes['/'] = MainHandler > config.debug = True > > I think the latter is more readable and maintainable. Again, there's no need to add new syntax if you wanted to make this API easier on the eyes. > Please let me know if more examples would help... Well, examples that (a) aren't broken and (b) aren't trivially written using decorators might help... > I really do believe that a block syntax would make developers more > productive and lead to cleaner code. I got that.
:-) > -- > love, tav > > plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 > http://tav.espians.com | http://twitter.com/tav | skype:tavespian > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From bruce at leapyear.org Mon Mar 9 20:19:50 2009 From: bruce at leapyear.org (Bruce Leban) Date: Mon, 9 Mar 2009 12:19:50 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea [with examples] In-Reply-To: References: Message-ID: > Again, being able to do it in Python syntax will save developers the > hassles of having to learn non-Python DSLs. Hmm. For your GqlQuery example, the way I see it doing it in Python syntax will save the hassle of GqlQuery optimizing the database access. The truth is that you don't know what happens to the query string but you do know that it can't take your function apart and have the database process it. So this example is useless. I also can't tell if using/do is supposed to be a loop or not. Some of your examples are loopy and some aren't. And besides that your examples are apples and oranges: composer = "Lennon, John" query = GqlQuery("SELECT * FROM Song WHERE composer = :1", composer) with: composer = "Lennon, John" using Song.query do (item): if item.composer == composer: return item The first one produces a result: something bound to a variable named query. What does the second one produce? I see no advantage in a new syntax that is this confusing. --- Bruce -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From denis.spir at free.fr Mon Mar 9 21:22:25 2009 From: denis.spir at free.fr (spir) Date: Mon, 9 Mar 2009 21:22:25 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> Message-ID: <20090309212225.56d18547@o> On Mon, 9 Mar 2009 09:43:03 -0700, average wrote: > While there are several complaining that all this can be done with a > def, there's a critical distinction being overlooked. Passing around > code blocks is a very different style of programming than Python or > most languages like it have ever experimented with. This style of programming is very common in stack-based languages, or rather in concatenative languages in general. :square dup * # def square(x): return x*x :squares [square] map # def squares(l): return [square(x) for x in l] # using equivalent of first class func :squares [dup *] map # using anonymous func def [1 2 3] squares ==> [1 4 9] In the latter form, the func literal expression -- often called 'quotation' because it 'quotes' code without executing it -- can be whatever and as long as needed, which is easy due to the linear style of stack-based programming. "Higher order" functions like map, called combinators, are very frequent and (unlike in other paradigms) make the code clearer. But they are not really higher order functions, rather they take several kinds of data as input, one of which happens to be code that will be 'unquoted', i.e. run -- closer to Lisp's eval.
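[Editorial note: the Joy-style examples above translate almost line-for-line into Python's first-class functions. A minimal sketch, not from the original mail:]

```python
def square(x):
    # Joy:  :square  dup *
    return x * x

def squares(lst):
    # Joy:  :squares  [square] map  -- passing a named function
    return [square(x) for x in lst]

# The anonymous quotation [dup *] becomes an inline lambda.
squares_anon = lambda lst: [(lambda x: x * x)(x) for x in lst]

print(squares([1, 2, 3]))       # [1, 4, 9]
print(squares_anon([1, 2, 3]))  # [1, 4, 9]
```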
see http://www.latrobe.edu.au/philosophy/phimvt/joy/faq.html for a nice introduction (but rather distinctive to FP) esp section #10 denis ------ la vita e estrany From dreamingforward at gmail.com Mon Mar 9 21:22:53 2009 From: dreamingforward at gmail.com (average) Date: Mon, 9 Mar 2009 13:22:53 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <7afdee2f0903091126jba6a941v597b84d9ab40742b@mail.gmail.com> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <7afdee2f0903091126jba6a941v597b84d9ab40742b@mail.gmail.com> Message-ID: <913f9f570903091322s2eb078b2jc0bb93d19d8b3872@mail.gmail.com> > average wrote: > Perhaps I've misunderstood, but both Perl and Javascript (highly > popular languages by any standard) support "passing around code > blocks" by defining anonymous functions. How can you say that most > languages like Python have never experimented with this, when of the > more popular programming languages, Javascript and Perl are the most > obviously similar to Python (besides Ruby)? You're likely right. And I'm probably being sloppier than I should. My point was really more about how the art of programming has yet to really explore the concept and power of code-blocks adequately. Most of us are comfortably stuck in our decades of procedural programming experience. What's misleading about framing this discussion is that all of us are [over]used to the "flatland" of the program editor. Code blocks appear on the screen like any other code, but the *critical* point is that LOGICALLY they are ORTHOGONAL to it. Where most of your code could be organized in a tree-like fashion rooted in your program's "main" node, code blocks are orthogonal and are really like leaves spanning back into the screen to a *different* tree's root (the *application's* surface) hence its subtlety. Hoping that analogy is more useful.... 
marcos From guido at python.org Mon Mar 9 21:29:16 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 9 Mar 2009 13:29:16 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <913f9f570903091322s2eb078b2jc0bb93d19d8b3872@mail.gmail.com> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <7afdee2f0903091126jba6a941v597b84d9ab40742b@mail.gmail.com> <913f9f570903091322s2eb078b2jc0bb93d19d8b3872@mail.gmail.com> Message-ID: On Mon, Mar 9, 2009 at 1:22 PM, average wrote: > What's misleading about framing this discussion is that all of us are > [over]used to the "flatland" of the program editor. Code blocks > appear on the screen like any other code, but the *critical* point is > that LOGICALLY they are ORTHOGONAL to it. Where most of your code > could be organized in a tree-like fashion rooted in your program's > "main" node, code blocks are orthogonal and are really like leaves > spanning back into the screen to a *different* tree's root (the > *application's* surface) hence its subtlety. That's only one use of callbacks. One could claim that a confusing part of anonymous blocks (as used in SmallTalk and Ruby) is that they use the same syntax for both use cases: you can't tell from the syntax whether the block is executed in the place where you see it (perhaps in a loop or with an error handler wrapped around it, or conditionally like in SmallTalk's "if" construct), or squirreled away for later use (once or many times). The good thing about function syntax (as used in JavaScript's anonymous blocks) is that it leaves no doubt about these two different uses: at least by convention, anonymous functions are used for asynchronous programming, while in-line code uses regular block syntax.
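[Editorial note: Guido's distinction between code that runs where you see it and code that is squirreled away for later can be made concrete with a small sketch; the names are illustrative, not from the original mail:]

```python
# Runs in place: ordinary block syntax executes immediately, where you see it.
log = []
for i in range(3):
    log.append(i)

# Squirreled away: function syntax signals deferred execution.
callbacks = []

def register(fn):
    # Store the callback for later; nothing runs at registration time.
    callbacks.append(fn)

register(lambda: log.append(99))   # nothing happens yet
assert log == [0, 1, 2]

for cb in callbacks:               # the deferred code runs only now
    cb()
assert log == [0, 1, 2, 99]
```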
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Mon Mar 9 21:58:43 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 10 Mar 2009 09:58:43 +1300 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <913f9f570903091043t70b4ec9dye79ad66433a6fb7f@mail.gmail.com> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <913f9f570903091043t70b4ec9dye79ad66433a6fb7f@mail.gmail.com> Message-ID: <49B58303.5080802@canterbury.ac.nz> average wrote: > MY point was really about how the programming art *itself* hasn't > fully explored this concept to even be able to *evaluate* the power > and usefulness of employing techniques such as code blocks. On the contrary, I think Smalltalk has explored it very well. Smalltalk implements *all* control structures in terms of code blocks, and does so in a very readable way, without any of the brain-exploding characteristics of Lisp. It works well in Smalltalk because the whole language syntax is designed from the ground up to accommodate it. Ruby inherits the idea, but struggles to fit it into its syntax, leading to a much-weakened form (you can only pass one code block to a given method at a time). It's even harder to fit the idea into Python's syntax. This isn't just because of the indentation issue, but also because Pythonistas tend to have a higher standard of aesthetics when comes to syntax design. Ruby can get away with looking a bit messy and haphazard, but that's not acceptable in the Python community. There are also semantic problems with the idea in Python. Once you're allowed to write the code block in-line, it becomes expected that you can write things like: while some_condition: with flapple() do (arg): if some_other_condition: break and have the 'break' exit from the while-loop. But if the body is actually a separate function, this is not easy to arrange. 
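[Editorial note: the difficulty Greg describes can be sketched directly. If the block body becomes a function, a plain 'break' inside it is a SyntaxError, so it would have to be emulated with a special control-flow exception. 'flapple' here is a hypothetical stand-in for any block-taking API:]

```python
class BreakLoop(Exception):
    """Hypothetical control-flow exception emulating 'break' from a body-as-function."""

def flapple(body):
    # Stand-in for a 'using ... do' construct: calls the body with one argument.
    body(42)

seen = []
while True:
    try:
        def body(arg):
            seen.append(arg)
            raise BreakLoop  # a literal 'break' here would be a SyntaxError
        flapple(body)
    except BreakLoop:
        break

assert seen == [42]
```

This is exactly the "messy and complicated" machinery the with-statement designers rejected.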
Back when the existing with-statement was being designed, there was serious thought put towards implementing it by passing the body as a function. But handling 'break', 'continue', 'return' and 'yield' inside the body would have required raising special control-flow exceptions, and it all got very messy and complicated. In the end it was decided not to be worth the hassle, and the existing generator-based implementation was settled on. -- Greg From tav at espians.com Mon Mar 9 22:20:07 2009 From: tav at espians.com (tav) Date: Mon, 9 Mar 2009 21:20:07 +0000 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <49B58303.5080802@canterbury.ac.nz> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <913f9f570903091043t70b4ec9dye79ad66433a6fb7f@mail.gmail.com> <49B58303.5080802@canterbury.ac.nz> Message-ID: Hey Greg, > It becomes expected that you can write things like: > > while some_condition: > with flapple() do (arg): > if some_other_condition: > break > > and have the 'break' exit from the while-loop. Thanks for this!! It's the only counter-argument that I've seen which demonstrates that my proposal was unpythonic. As such, I'd like to withdraw my proposal -- sorry for taking up everybody's time =( But, hey, live and learn =) -- love, tav plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 http://tav.espians.com | http://twitter.com/tav | skype:tavespian From tjreedy at udel.edu Mon Mar 9 22:26:11 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 09 Mar 2009 17:26:11 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info> Message-ID: Guido van Rossum wrote: >> For precedence let's take a look at decorators. Fundamentally, >> decorators save a user nothing more than a single line of code. > > I guess you weren't there at the time.
If it was about saving a line > of code it would have been boohed out of the room. (Especially since > the line count is actually the same with or without using the > decorator syntax!) The big improvement that decorators offer is to > move the "decoration" from the end of the function body, where it is > easily missed, to the front of the declaration, where it changes the > emphasis for the reader. I don't see a similar advantage in your > example; it looks more like "Ruby-envy" to me. Plus decorators save 2 retypings of the function name, which was especially important for people who use very_long_multi_word_func_names. From greg.ewing at canterbury.ac.nz Mon Mar 9 22:33:04 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 10 Mar 2009 10:33:04 +1300 Subject: [Python-ideas] Ruby-style Blocks in Python Idea [with examples] In-Reply-To: References: Message-ID: <49B58B10.4090806@canterbury.ac.nz> tav wrote: > q = Entry.objects.filter(headline__startswith="What").filter(pub_date__lte=datetime.now()) > > using Entry.filter do (entry): > if entry.headline.startswith('What') and entry.pub_date <= datetime.now(): > return entry How is this any better than [entry for entry in Entry.filter if entry.headline.startswith('What') and entry.pub_date <= datetime.now()] But note that neither of these is an adequate replacement for the Django expression, because Django can generate an SQL query incorporating the filter criteria. Neither a list comprehension nor the proposed using-statement are capable of doing that. The same thing applies to the App Engine example, or any other relational database wrapper. By the way, having the 'return' in there doing something other than return from the function containing the 'using' statement would be confusing and inconsistent with the rest of the language. 
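[Editorial note: the decorator-registration pattern that keeps coming up in this thread -- registering a named function instead of passing an anonymous block, with the function's name serving as the key -- can be sketched as follows. The registry and names are hypothetical, not any real SCons API:]

```python
callbacks = {}

def register(fn):
    # The decorated function's own name serves as the registration key.
    callbacks[fn.__name__] = fn
    return fn

@register
def want_system_lib(env, lib):
    if lib not in env['all_system_libs']:
        env['all_system_libs'].append(lib)
    return lib in env['req_system_libs']

env = {'all_system_libs': [], 'req_system_libs': ['zlib']}
assert callbacks['want_system_lib'](env, 'zlib') is True
assert env['all_system_libs'] == ['zlib']
```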
> # Event-driven Programming I've been exploring this a bit myself in relation to my yield-from proposal, and doing this sort of thing using generators is a much better idea, I think, especially given something like a yield-from construct. -- Greg From g.brandl at gmx.net Mon Mar 9 22:51:50 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 09 Mar 2009 22:51:50 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <200903100450.56903.steve@pearwood.info> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info> <200903100450.56903.steve@pearwood.info> Message-ID: Steven D'Aprano schrieb: > On Tue, 10 Mar 2009 04:15:13 am you wrote: >> Hey Steven, >> >> > Can you explain why: >> > func(named_function) >> > is radically different from: >> > func(multi-line-code-block-without-the-name) >> >> Hmz, the intention isn't to support multi-line lambdas. It's to make >> passing in anonymous functions easier. > > Lambdas are single-line (technically, single-statement) anonymous If you really want to get technical, it's single-expression. nit-pickingly-yrs, Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. 
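[Editorial note: Georg's nit-pick is worth spelling out -- a lambda body is exactly one expression, never a statement, though the expression itself may branch:]

```python
# Fine: the body is a single expression (here, a conditional expression).
classify = lambda x: 'even' if x % 2 == 0 else 'odd'
assert classify(4) == 'even'
assert classify(7) == 'odd'

# Not allowed: statements such as assignment cannot appear in a lambda.
#   broken = lambda x: y = x * 2    # SyntaxError
```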
From tjreedy at udel.edu Mon Mar 9 23:04:32 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 09 Mar 2009 18:04:32 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <200903100450.56903.steve@pearwood.info> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info> <200903100450.56903.steve@pearwood.info> Message-ID: Steven D'Aprano wrote: > Lambdas are single-line (technically, single-statement) anonymous > functions, Lambdas are function-defining expressions used *within* statements that give the resulting function object a stock .__name__ of '<lambda>'. The syntax could have been augmented to include a real name, so the stock-name anonymity is a side-effect of the chosen syntax. Possibilities include lambda name(args): expression lambda args: expression The latter, assuming it is LL(1) parse-able, would even be compatible with existing code and could still be added. Contrarywise, function-defining def statements could have been allowed to omit the name. To be useful, the object (with a .__name__ such as '<lambda>') would have to get a default namespace binding such as to '_', even in batch mode. > caller(lambda args: statement) Change 'statement' to 'expression'. > A multi-line lambda (technically, multi-statement) The problem is that 'multi-statement expression' is an oxymoron in Pythonland. > would also be an anonymous function. Not necessarily, and irrelevant to the essence of lambda expressions, which is that they are expressions that can be used within statements.
Terry Jan Reedy From tjreedy at udel.edu Mon Mar 9 23:31:36 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 09 Mar 2009 18:31:36 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <913f9f570903091043t70b4ec9dye79ad66433a6fb7f@mail.gmail.com> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <913f9f570903091043t70b4ec9dye79ad66433a6fb7f@mail.gmail.com> Message-ID: average wrote: >>> [RE:"using" keyword].Setting it off with a new keyword, or > MY point was really about how the programming art *itself* hasn't > fully explored this concept to even be able to *evaluate* the power > and usefulness of employing techniques such as code blocks. To me, > Python and Ruby are both exciting and interesting examples of how > language design is evolving to find ways to express and evolve that > power. In my mind, there is no doubt that languages will have to find > elegant ways to express that power. What's cool about Python and Ruby > is that it's taking that vast general space of the "mind's assembly > language" and distilling it down into nicely manageble and elegant > chunks of language syntax. The concept of distinct, passable code > blocks is a nice example of that compression, one that certainly has > correspondence within our biology. A function is a possibly parameterized code block that can be passed around and called. Anonymity is a defect, not an advantage. So your attempted differentiation looks a bit like mystical gibberish to me. Sorry. 
From tjreedy at udel.edu Tue Mar 10 00:09:52 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 09 Mar 2009 19:09:52 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <20090309154505.GA18115@panix.com> References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: Aahz wrote: > On Mon, Mar 09, 2009, Sturla Molden wrote: >> I see no reason for introducing two new keywords to do this, as you are >> really just enhancing the current lambda keyword. >> >> On the other hand, turning blocks into anonymous functions would be very >> useful for functional programming. As such, I like your suggestion. > > There's a substantial minority (possibly even a majority) in the Python > community that abhors functional programming. I am not one of them, if there really are such. > Even among those who like functional programming, > there's a substantial population that dislikes > extensive use of anonymous functions. Like many other Pythonistas I recognize that an uninformative stock name of '<lambda>' is defective relative to an informative name that points back to readable code. What I dislike is the anonymity-cult claim that the defect is a virtue. Since I routinely use standard names 'f' and 'g' (from math) to name functions whose name I do not care about, I am baffled (and annoyed) by (repeated) claims such as "Having to name a one-off function adds additional cognitive overload to a developer." (Tav). Golly gee, if one cannot decide on standard one-char name, how can he manage the rest of Python? (I also, like others, routinely use 'C' for class and 'c' for C instance. What next? A demand for anonymous classes? Whoops, I just learned that Java has those.) But I have no problem with the use of lambda expressions as a convenience, where appropriate.
Terry Jan Reedy From steve at pearwood.info Tue Mar 10 00:17:00 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 10 Mar 2009 10:17:00 +1100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100450.56903.steve@pearwood.info> Message-ID: <200903101017.00551.steve@pearwood.info> On Tue, 10 Mar 2009 09:04:32 am Terry Reedy wrote: > Steven D'Aprano wrote: > > Lambdas are single-line (technically, single-statement) anonymous > > functions, > > Lambdas are function-defining expressions used *within* statements [...] Ah, sorry for the brain-o, I was thinking "expression" and typing "statement". But thanks for the detailed explanation anyway, it is helpful. -- Steven D'Aprano From guido at python.org Tue Mar 10 00:55:18 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 9 Mar 2009 16:55:18 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: On Mon, Mar 9, 2009 at 4:09 PM, Terry Reedy wrote: > Like many other Pythonistas I recognize that an uninformative stock > name of '<lambda>' is defective relative to an informative name that points > back to readable code. What I dislike is the anonymity-cult claim that the > defect is a virtue. > > Since I routinely use standard names 'f' and 'g' (from math) to name > functions whose name I do not care about, I am baffled (and annoyed) by > (repeated) claims such as "Having to name a one-off function adds additional > cognitive overload to a developer." (Tav). Golly gee, if one cannot decide > on standard one-char name, how can he manage the rest of Python? > > (I also, like others, routinely use 'C' for class and 'c' for C instance. > What next? A demand for anonymous classes? Whoops, I just learned that > Java has those.)
> > But I have no problem with the use of lambda expressions as a convenience, > where appropriate. Andrew Koenig once gave me a good use case where lambdas are really a lot more convenient than named functions. He was initializing a large data structure that was used by an interpreter for some language. It was a single expression (probably a list of tuples or a dict). Each record contained various bits of information (e.g. the operator symbol and its precedence and associativity) as well as a function (almost always a very simple lambda) that implemented it. Since this table was 100s of records long, it would have been pretty inconvenient to first have to define 100s of small one-line functions and give them names, only to reference them once in the initializer. This use case doesn't have a nice equivalent without anonymous functions (though I'm sure that if there really was no other way it could be done, e.g. using registration-style decorators). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jan.kanis at phil.uu.nl Tue Mar 10 01:13:42 2009 From: jan.kanis at phil.uu.nl (Jan Kanis) Date: Tue, 10 Mar 2009 01:13:42 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea (alternative) Message-ID: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> This being python-ideas, I'll also have a go at it. Being someone who does like functional programming when used in limited quantities, I also think multi-line lambdas (or blocks, whatever you call them) are a good thing if a good way could be found to embed them into Python. But I don't like the part of tav's proposal of handling them with a magic __do__ function.
So what about this slightly modified syntax and semantics: def NAME(ARGS) [as OTHERNAME] with EXPRESSION_CONTAINING_NAME: BODY eg: def callback(param) as result with do_something(with_our(callback), other, args): print("called back with "+param) return foobar(param) this would be equivalent to def callback(param): print("called back with "+param) return foobar(param) result = do_something(with_our(callback), other_args) This example tav gave for url in urls: using pool.execute do: print "%s fetching %s" % (time.asctime(), url) httpc.get(url) print "%s fetched %s" % (time.asctime(), url) would be written as for url in urls: def fetch(url) with pool.execute(fetch): print "%s fetching %s" % (time.asctime(), url) httpc.get(url) print "%s fetched %s" % (time.asctime(), url) Compared to tavs proposal this would: - allow for use in expressions where the block is not the only argument, the block could even be passed in multiple parameter positions - make it clear that a function is being defined, and this syntax even allows for re-using the function later on, including testing etc. - not use any new keywords, as it is both syntactically and semantically an extension of the 'def' keyword. I think it also caters to those circumstances where you'd want to use a multiline lambda, without having the awkward reordering of first having to define the function and then passing it by name where the emphasis has to be on what you do with the function (analogous to decorators). It also does everything (I think) that Rubys blocks do. It does not solve the case where you'd want to pass multiple blocks/multiline lambdas to a function, but hey, you can't solve everything. There are several variations of this syntax that could also be considered, eg having the 'with' clause before the 'as' clause, or making the 'with' clause optional as well. 
Or doing 'using foo(callback, other, args) as answer with callback(param):', that would require a new keyword, but would allow the expression using the new function to be first, as it is presumably the most important part. my €0,02, - Jan From santagada at gmail.com Tue Mar 10 02:04:52 2009 From: santagada at gmail.com (Leonardo Santagada) Date: Mon, 9 Mar 2009 22:04:52 -0300 Subject: [Python-ideas] Ruby-style Blocks in Python Idea (alternative) In-Reply-To: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> References: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> Message-ID: <73022CA5-ED35-4FF9-A4B8-1013816F5288@gmail.com> On Mar 9, 2009, at 9:13 PM, Jan Kanis wrote: > def callback(param) as result with do_something(with_our(callback), > other, args): > print("called back with "+param) > return foobar(param) > > > this would be equivalent to > > def callback(param): > print("called back with "+param) > return foobar(param) > > result = do_something(with_our(callback), other_args) Not only does the equivalent code look much cleaner, the only good thing the proposal actually does (not having to first define a function to then use it) can be accomplished with a decorator. Thanks GvR and all that finally shed some light on Ruby blocks. I never understood what was so special about them, now I know it is nothing really.
:) -- Leonardo Santagada santagada at gmail.com From grosser.meister.morti at gmx.net Tue Mar 10 03:16:48 2009 From: grosser.meister.morti at gmx.net (Mathias Panzenböck) Date: Tue, 10 Mar 2009 03:16:48 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: <49B5CD90.1060205@gmx.net> Guido van Rossum wrote: > On Mon, Mar 9, 2009 at 4:09 PM, Terry Reedy wrote: >> Like many other Pythonistas I recognize that an uninformative stock >> name of '<lambda>' is defective relative to an informative name that points >> back to readable code. What I dislike is the anonymity-cult claim that the >> defect is a virtue. >> >> Since I routinely use standard names 'f' and 'g' (from math) to name >> functions whose name I do not care about, I am baffled (and annoyed) by >> (repeated) claims such as "Having to name a one-off function adds additional >> cognitive overload to a developer." (Tav). Golly gee, if one cannot decide >> on standard one-char name, how can he manage the rest of Python? >> >> (I also, like others, routinely use 'C' for class and 'c' for C instance. >> What next? A demand for anonymous classes? Whoops, I just learned that >> Java has those.) >> >> But I have no problem with the use of lambda expressions as a convenience, >> where appropriate. > > Andrew Koenig once gave me a good use case where lambdas are really a > lot more convenient than named functions. He was initializing a large > data structure that was used by an interpreter for some language. It > was a single expression (probably a list of tuples or a dict). Each > record contained various bits of information (e.g. the operator symbol > and its precedence and associativity) as well as a function (almost > always a very simple lambda) that implemented it.
Since this table was > 100s of records long, it would have been pretty inconvenient to first > have to define 100s of small one-line functions and give them names, > only to reference them once in the initializer. > > This use case doesn't have a nice equivalent without anonymous > functions (though I'm sure that if there really was no other way it > could be done, e.g. using registration-style decorators). > With a small trick you don't need lambdas for this, if the keys are python identifiers. If they aren't you can add the real key to the info tuple and then generate a new dict in a oneliner. But yeah, it's a bit ugly/abusive. But it is possible without defining parts of the tuple at different places! :) def info(*args): def wrapper(f): return args + (f,) return wrapper def mkdict(): @info("a",12,9.9,None) def foo(): print "this is foo" @info("b",23,0.0,foo) def bar(): print "this is bar" @info("c",42,3.1415,None) def baz(): print "this is baz" return locals() print mkdict() -panzi From stephen at xemacs.org Tue Mar 10 05:14:10 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 10 Mar 2009 13:14:10 +0900 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: <87iqmi5819.fsf@xemacs.org> Terry Reedy writes: > Like many other Pythonistas I recognize that an uninformative stock > name of '<lambda>' is defective relative to an informative name that > points back to readable code. What I dislike is the anonymity-cult > claim that the defect is a virtue. That's unfair. Python has "anonymous blocks" all over the place, since every control structure controls one or more of them. It simply requires that they be forgotten at the next DEDENT. Surely you don't advocate that each of them should get a name! I think this is a difference of cognition.
Specifically, people who don't want to name blocks as functions may not abstract processes to signatures as easily, and reify whole processes (including all free identifiers!) as objects more easily, as those who don't think naming is a problem. > Since I routinely use standard names 'f' and 'g' (from math) to name > functions whose name I do not care about, I am baffled (and annoyed) by If the cognition hypothesis is correct, of course you're baffled. You "just don't" think that way, while he really does. The annoyance can probably be relieved by s/a developer/some developers/ here: > (repeated) claims such as "Having to name a one-off function adds > additional cognitive overload to a developer." (Tav). I suspect that "overload" is a pun, here. Your rhetorical question > Golly gee, if one cannot decide on standard one-char name, how can > he manage the rest of Python? has an unexpected answer: in the rest of Python name overloading is carefully controlled and scoped into namespaces. If my cognition hypothesis is correct, then a standard one-character name really does bother/confuse his cognition, where maintaining the whole structure of the block from one use to the next somehow does not. (This baffles me, too!) The question then becomes "can Python become more usable to developers unlike you and me without losing some Pythonicity?" Guido seems to think not (I read him as pessimistic on both grounds: the proposed syntax is neither as useful nor as Pythonic as Tav thinks it is). 
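The operator-table use case Guido describes above can be sketched roughly like this — a hypothetical illustration, not Andrew Koenig's actual table (the symbols, precedences, and field layout here are made up):

```python
# Each record: (precedence, associativity, implementation). Writing the
# implementations as lambdas keeps the whole table a single expression;
# with named functions, every one-liner would need a separate def.
OPERATORS = {
    "+":  (1, "left",  lambda a, b: a + b),
    "-":  (1, "left",  lambda a, b: a - b),
    "*":  (2, "left",  lambda a, b: a * b),
    "**": (3, "right", lambda a, b: a ** b),
}

def apply_op(symbol, a, b):
    precedence, associativity, func = OPERATORS[symbol]
    return func(a, b)
```

At hundreds of records, the lambda version stays one initializer; the registration-style alternative Guido mentions would spread the same table over hundreds of small defs.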
From cmjohnson.mailinglist at gmail.com Tue Mar 10 06:59:39 2009 From: cmjohnson.mailinglist at gmail.com (Carl Johnson) Date: Mon, 9 Mar 2009 19:59:39 -1000 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: Message-ID: <3bdda690903092259h241916a5p862cdf5cf1b00de6@mail.gmail.com>

> Sounds like you might as well write a decorator named @using:
>
> @using(employees.select)
> def _(employee):
>     if employee.salary > developer.salary:
>         fireEmployee(employee)
>     else:
>         extendContract(employee)

I like that (which is why I proposed allowing lambda decorators last month), but I'm uncomfortable with how after all is said and done, _ will either be set to something that's not a callable or to a callable that no one is ever supposed to call. Perhaps if we allowed for this:

@using(employees.select) as results
def (employee):  # It is mandatory that no name be used here and
                 # that parentheses are included
    if employee.salary > developer.salary:
        fireEmployee(employee)
    else:
        extendContract(employee)
# results = [ list of employees ]

to be syntactic sugar for this:

def callback(employee):
    if employee.salary > developer.salary:
        fireEmployee(employee)
    else:
        extendContract(employee)
results = using(employees.select)(callback)
# results = [ list of employees ]

Similarly, the horrible Java-style callback mentioned earlier in the thread, self.Bind(wx.BUTTON, lambda: evt, mybutton), might with a change in API become something like

@Bind(wx.BUTTON, mybutton) as connected_button_object
def (event):
    # do event handling stuff?
    return

Ruby blocks let them write,

>> (1..10).map { |x| 2 * x }
=> [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]

which for us would be a simple [2 * x for x in range(1, 11)], but their version has the advantage of being able to be expanded to a series of expressions and statements instead of a single expression if need be. With my proposal, someone could write:

>>> def map_dec(l):
...     def f(callback):
...         return [callback(item) for item in l]
...     return f
...
>>> @map_dec(range(10)) as results
... def (item):
...     return 2 * item
...
>>> results
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]

Basically, the "as" would be used to indicate "no one cares about the following function by itself, they just want to use it as a callback to get some result". My-doomed-proposally-yours, -- Carl From stephen at xemacs.org Tue Mar 10 08:38:18 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 10 Mar 2009 16:38:18 +0900 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <87iqmi5819.fsf@xemacs.org> References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> <87iqmi5819.fsf@xemacs.org> Message-ID: <87d4cp6d5h.fsf@xemacs.org> Stephen J. Turnbull writes: > If my cognition hypothesis is correct, then a standard > one-character name really does bother/confuse his cognition, where > maintaining the whole structure of the block from one use to the > next somehow does not. (This baffles me, too!) This is mis-written. Since the block is "one-off", there is no "next use". So I guess the thought process is "I'm in a context, and I need to operate on it, so now I define the process: ." Not baffling, but still foreign to me personally. From denis.spir at free.fr Tue Mar 10 09:19:36 2009 From: denis.spir at free.fr (spir) Date: Tue, 10 Mar 2009 09:19:36 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info> <200903100450.56903.steve@pearwood.info> Message-ID: <20090310091936.6e782ca3@o> Le Mon, 09 Mar 2009 18:04:32 -0400, Terry Reedy s'exprima ainsi: > Lambdas are function-defining expressions used *within* statements that > give the resulting function object a stock .__name__ of '<lambda>'. The > syntax could have been augmented to include a real name, so the > stock-name anonymity is a side-effect of the chosen syntax.
> Possibilities include > lambda name(args): expression > lambda args: expression > The latter, assuming it is LL(1) parse-able, would even be compatible > with existing code and could still be added. > > Contrarywise, function-defining def statements could have been allowed > to omit the name. To be useful, the object (with a .__name__ such as > '', would have to get a default namespace binding such as to '_', > even in batch mode. I do not agree with that. It is missing the point of lambdas. Lambdas are snippets of code equivalent to expressions to be used in place. Lambdas are *not* called, need not beeing callable, rather they are *used* by higher order functions like map. The fact that they do not have any name in syntax thus properly matches their semantic "anonymousity" ;-) > > A multi-line lambda (technically, multi-statement) > > The problem is that 'multi-statement expression' is an oxymoron in > Pythonland. > > > would also be an anonymous function. > > Not necessarily, and irrelevant to the essence of lambda expressions, > which is that they are expressions that can be used within statements. I do not see any contradiction with the "essence of lambda expressions" here. We could have a syntax for multi-statement lambdas without any semantic contradiction. The issue is more probably that it does not fit well python's style and syntax (esp. indent) and would hinder legibility and simplicity. print map(lambda x: {fact = None if x<0 else factorial(x); return "%s: %s" %(x,fact)}, seq) It's ugly, sure. Still, I do not see the "essence of lambda expressions" twisted. I vote -1 to "block-lambdas" for the sake of clarity only. 
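Denis's one-liner above works out cleanly if the two statements are moved into a small named helper — a sketch for comparison (`factorial` comes from the stdlib `math` module; `seq` is a made-up sample sequence):

```python
from math import factorial

def describe(x):
    # The two statements the hypothetical block-lambda would hold.
    fact = None if x < 0 else factorial(x)
    return "%s: %s" % (x, fact)

seq = [-1, 0, 3]
print(list(map(describe, seq)))
```

The helper costs one name, but each statement gets its own line and the `map` call stays readable.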
Denis ------ la vita e estrany From denis.spir at free.fr Tue Mar 10 09:58:22 2009 From: denis.spir at free.fr (spir) Date: Tue, 10 Mar 2009 09:58:22 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea (alternative) In-Reply-To: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> References: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> Message-ID: <20090310095822.0f957a4d@o> Le Tue, 10 Mar 2009 01:13:42 +0100, Jan Kanis s'exprima ainsi: > This being python-ideas, I'll also have a go at it. > > Being someone who does like functional programing when used in limited > quantities, I also think multi line lambdas (or blocks, whatever you > call them) are a good thing if a good way could be found to embed them > into Python. But I don't like the part of tavs proposal of handling > them with a magic __do__ function. So what about this slightly > modified syntax and semantics: > > def NAME(ARGS) [as OTHERNAME] with EXPRESSION_CONTAINING_NAME: > BODY > > eg: > > def callback(param) as result with do_something(with_our(callback), > other, args): > print("called back with "+param) > return foobar(param) I like this proposal much more than all previous ones. Still, how would you (or anybody else) introduce the purpose, meaning, use of this construct, and its language-level semantics? [This is not disguised critics, neither rethoric question: I'm really interested in answers.] Denis ------ la vita e estrany From sturla at molden.no Tue Mar 10 15:51:24 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 10 Mar 2009 15:51:24 +0100 Subject: [Python-ideas] cd statement? Message-ID: <49B67E6C.6020206@molden.no> When working with Python interactive shell (particularly IDLE started from Windows start menu), one thing I miss is a cd statement. 
Ok, I can do

>>> import os
>>> os.chdir('e:\\work')

But I keep feeling that Matlab's cd statement is more handy:

>> cd e:\work

One other feature that makes Matlab's shell more handy is the whos statement. It lists all variables created from the shell, types, etc. Yes, it is possible to get all local and global names in Python/IDLE, but that is not the same. The variables created interactively get hidden in the clutter. IDLE also lacks a command history. If I e.g. make a typo, why do I have to copy and paste, instead of just hitting the arrow button? Although cosmetically, these three small things keep annoying me. :-( Sturla Molden From phd at phd.pp.ru Tue Mar 10 15:58:59 2009 From: phd at phd.pp.ru (Oleg Broytmann) Date: Tue, 10 Mar 2009 17:58:59 +0300 Subject: [Python-ideas] cd statement? In-Reply-To: <49B67E6C.6020206@molden.no> References: <49B67E6C.6020206@molden.no> Message-ID: <20090310145859.GA20242@phd.pp.ru> On Tue, Mar 10, 2009 at 03:51:24PM +0100, Sturla Molden wrote: > >> cd e:\work

See DirChanger at http://phd.pp.ru/Software/dotfiles/init.py.html . With it you can do

>>> cd('work')
>>> cd
/home/phd/work

Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From 8mayday at gmail.com Tue Mar 10 16:07:01 2009 From: 8mayday at gmail.com (Andrey Popp) Date: Tue, 10 Mar 2009 18:07:01 +0300 Subject: [Python-ideas] cd statement? In-Reply-To: <20090310145859.GA20242@phd.pp.ru> References: <49B67E6C.6020206@molden.no> <20090310145859.GA20242@phd.pp.ru> Message-ID: Why not use IPython? On Tue, Mar 10, 2009 at 5:58 PM, Oleg Broytmann wrote:
> On Tue, Mar 10, 2009 at 03:51:24PM +0100, Sturla Molden wrote:
>> >> cd e:\work
>
> See DirChanger at http://phd.pp.ru/Software/dotfiles/init.py.html . With
> it you can do
>
> >>> cd('work')
> >>> cd
> /home/phd/work
>
> Oleg.
> --
> Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru
> 
Programmers don't die, they just GOSUB without RETURN.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

-- +7 911 740 24 91 From malaclypse2 at gmail.com Tue Mar 10 16:08:27 2009 From: malaclypse2 at gmail.com (Jerry Hill) Date: Tue, 10 Mar 2009 11:08:27 -0400 Subject: [Python-ideas] cd statement? In-Reply-To: <49B67E6C.6020206@molden.no> References: <49B67E6C.6020206@molden.no> Message-ID: <16651e80903100808t6abc47at329fec2df7ee4ae@mail.gmail.com> On Tue, Mar 10, 2009 at 10:51 AM, Sturla Molden wrote: > IDLE also lacks a command history. If I e.g. make a typo, why do I have to > copy and paste, instead of just hitting the arrow button? Command history in IDLE is bound to alt-n (next) and alt-p (previous) by default. -- Jerry From jjb5 at cornell.edu Tue Mar 10 15:56:01 2009 From: jjb5 at cornell.edu (Joel Bender) Date: Tue, 10 Mar 2009 10:56:01 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: Message-ID: <49B67F81.5030305@cornell.edu> Guido van Rossum wrote:

> @using(employees.select)
> def _(employee):
>     if employee.salary > developer.salary:
>         fireEmployee(employee)
>     else:
>         extendContract(employee)

I personally don't mind anonymous functions, I use them when I can fit everything on mostly one line and they don't have any side effects. None of my decorators that I write actually call the function they are passed either. So there could be a lambda statement...

@using(employees.select)
lambda employee:
    if employee.developer and employee.not_using_python:
        fireDeveloper(employee)

...which would make a function that isn't named, purely for its side effects. -$0.02 Joel From arnodel at googlemail.com Tue Mar 10 17:17:25 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Tue, 10 Mar 2009 16:17:25 +0000 Subject: [Python-ideas] cd statement?
In-Reply-To: <49B67E6C.6020206@molden.no> References: <49B67E6C.6020206@molden.no> Message-ID: <9bfc700a0903100917w863e358l7bccf5513283d451@mail.gmail.com> 2009/3/10 Sturla Molden :
>
> When working with Python interactive shell (particularly IDLE started from
> Windows start menu), one thing I miss is a cd statement. Ok, I can do
>
> >>> import os
> >>> os.chdir('e:\\work')
>
> But I keep feeling that Matlab's cd statement is more handy:
>
> >> cd e:\work
>
> One other feature that makes Matlab's shell more handy is the whos
> statement. It lists all variables created from the shell, types, etc. Yes, it
> is possible to get all local and global names in Python/IDLE, but that is
> not the same. The variables created interactively get hidden in the clutter.
>
> IDLE also lacks a command history. If I e.g. make a typo, why do I have to
> copy and paste, instead of just hitting the arrow button?
>
> Although cosmetically, these three small things keep annoying me. :-(

Have you tried IPython? -- Arnaud From sturla at molden.no Tue Mar 10 17:29:12 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 10 Mar 2009 17:29:12 +0100 Subject: [Python-ideas] cd statement? In-Reply-To: <9bfc700a0903100917w863e358l7bccf5513283d451@mail.gmail.com> References: <49B67E6C.6020206@molden.no> <9bfc700a0903100917w863e358l7bccf5513283d451@mail.gmail.com> Message-ID: <49B69558.3090000@molden.no> Arnaud Delobelle wrote: > Have you tried IPython? Yes, it has all that I miss, but it's ugly (at least on Windows, where it runs in a DOS shell). S.M.
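Oleg's DirChanger (linked above) can be approximated like this — a sketch reconstructed from the behaviour he describes, not his actual code. The trick is that the interactive prompt calls `repr()` to display a bare name, so `cd` with no arguments can also do something useful:

```python
import os

class DirChanger(object):
    """Interactive chdir helper: cd('path') changes directory;
    a bare `cd` at the prompt jumps to the home directory."""

    def __call__(self, path):
        os.chdir(os.path.expanduser(path))
        return os.getcwd()

    def __repr__(self):
        # The REPL calls repr() to display a bare `cd`, so use that
        # hook to mimic the shell's argument-less cd.
        os.chdir(os.path.expanduser("~"))
        return os.getcwd()

cd = DirChanger()
```

Dropping something like this into the file named by the PYTHONSTARTUP environment variable makes it available in every interactive session.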
From guido at python.org Tue Mar 10 17:33:27 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Mar 2009 09:33:27 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea (alternative) In-Reply-To: <20090310095822.0f957a4d@o> References: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> <20090310095822.0f957a4d@o> Message-ID: On Tue, Mar 10, 2009 at 1:58 AM, spir wrote: > Le Tue, 10 Mar 2009 01:13:42 +0100, > Jan Kanis s'exprima ainsi: > >> This being python-ideas, I'll also have a go at it. >> >> Being someone who does like functional programing when used in limited >> quantities, I also think multi line lambdas (or blocks, whatever you >> call them) are a good thing if a good way could be found to embed them >> into Python. But I don't like the part of tavs proposal of handling >> them with a magic __do__ function. So what about this slightly >> modified syntax and semantics: >> >> def NAME(ARGS) [as OTHERNAME] with EXPRESSION_CONTAINING_NAME: >> ? ? ? BODY >> >> eg: >> >> ?def callback(param) as result with do_something(with_our(callback), >> other, args): >> ? ? ? print("called back with "+param) >> ? ? ? return foobar(param) > > I like this proposal much more than all previous ones. Just to avoid getting your hopes up too high, this gets a solid -1 from me, since it just introduces unwieldy new additions to the currently clean 'def' syntax, to accomplish something you can already do just as easily with a decorator. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From taleinat at gmail.com Tue Mar 10 17:52:48 2009 From: taleinat at gmail.com (Tal Einat) Date: Tue, 10 Mar 2009 18:52:48 +0200 Subject: [Python-ideas] cd statement? 
In-Reply-To: <49B69558.3090000@molden.no> References: <49B67E6C.6020206@molden.no> <9bfc700a0903100917w863e358l7bccf5513283d451@mail.gmail.com> <49B69558.3090000@molden.no> Message-ID: <7afdee2f0903100952s15e0a836l366bd2158b13c125@mail.gmail.com> Sturla Molden wrote: > Arnaud Delobelle wrote: >> >> Have you tried IPython? > > Yes, it has all that I miss, but it's ugly (at least on Windows, where it > runs in a DOS shell). > > S.M. > Hear, hear! GUI interactive prompts FTW! In IDLE you can also just move the cursor to a previous line of code (or code block), hit Return and you'll have that code on your current command line, ready to be edited and executed. As for changing directories, I find "from os import chdir as cd, getcwd as cwd" satisfactory. I have it in the python file referenced by the PYTHONSTARTUP environment variable, and I've changed all the relevant shortcuts (on Windows) to run IDLE with the -s flag (which causes the PYTHONSTARTUP file to be imported before anything else). - Tal From guido at python.org Tue Mar 10 18:05:30 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Mar 2009 10:05:30 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <20090310091936.6e782ca3@o> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info> <200903100450.56903.steve@pearwood.info> <20090310091936.6e782ca3@o> Message-ID: On Tue, Mar 10, 2009 at 1:19 AM, spir wrote: > I do not agree with that. It is missing the point of lambdas. Lambdas are snippets of code equivalent to expressions to be used in place. Lambdas are *not* called, need not beeing callable, rather they are *used* by higher order functions like map. The fact that they do not have any name in syntax thus properly matches their semantic "anonymousity" ;-) Eh? On what planet do you live? What use is a lambda if it is never called? 
It will *eventually* be called -- if it is never called you might as well substitute None and your program would run the same. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Tue Mar 10 18:07:58 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Mar 2009 10:07:58 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <3bdda690903092259h241916a5p862cdf5cf1b00de6@mail.gmail.com> References: <3bdda690903092259h241916a5p862cdf5cf1b00de6@mail.gmail.com> Message-ID: On Mon, Mar 9, 2009 at 10:59 PM, Carl Johnson wrote: [Guido] >> Sounds like you might as well write a decorator named @using: >> >> ?@using(employees.select) >> ?def _(employee): >> ? ? if employee.salary > developer.salary: >> ? ? ? ? fireEmployee(employee) >> ? ? else: >> ? ? ? ? extendContract(employee) > > I like that (which is why I proposed allowing lambda decorators last > month), but I'm uncomfortable with how after all is said and done, _ > will either be set to something that's not a callable or to a callable > that no one is ever supposed to call. Well, _ is by convention often used as a "throw-away" result. So I am totally comfortable with this. Or, at least, I am as comfortable with it as I am with writing name, phone, _, _ = record to unpack a 4-tuple when the last two elements are not needed. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ironfroggy at gmail.com Tue Mar 10 18:13:09 2009 From: ironfroggy at gmail.com (Calvin Spealman) Date: Tue, 10 Mar 2009 13:13:09 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: Message-ID: <76fd5acf0903101013p1d8dd0f6m441d4277f3626523@mail.gmail.com> On Mon, Mar 9, 2009 at 12:39 PM, Guido van Rossum wrote: > On Mon, Mar 9, 2009 at 6:04 AM, tav wrote: >> I've come up with a way to do Ruby-style blocks in what I feel to be a >> Pythonic way: >> >> ?using employees.select do (employee): >> ? ? 
?if employee.salary > developer.salary: >> ? ? ? ? ?fireEmployee(employee) >> ? ? ?else: >> ? ? ? ? ?extendContract(employee) > > Sounds like you might as well write a decorator named @using: > > ?@using(employees.select) > ?def _(employee): > ? ? if employee.salary > developer.salary: > ? ? ? ? fireEmployee(employee) > ? ? else: > ? ? ? ? extendContract(employee) What would `using` here do that decorating the temp function with employees.select itself wouldn't do? All you are doing is saying "Pass this function to this function" which is exactly what decorators already do. > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy From guido at python.org Tue Mar 10 18:24:59 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Mar 2009 10:24:59 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <76fd5acf0903101013p1d8dd0f6m441d4277f3626523@mail.gmail.com> References: <76fd5acf0903101013p1d8dd0f6m441d4277f3626523@mail.gmail.com> Message-ID: On Tue, Mar 10, 2009 at 10:13 AM, Calvin Spealman wrote: > On Mon, Mar 9, 2009 at 12:39 PM, Guido van Rossum wrote: >> On Mon, Mar 9, 2009 at 6:04 AM, tav wrote: >>> I've come up with a way to do Ruby-style blocks in what I feel to be a >>> Pythonic way: >>> >>> ?using employees.select do (employee): >>> ? ? ?if employee.salary > developer.salary: >>> ? ? ? ? ?fireEmployee(employee) >>> ? ? ?else: >>> ? ? ? ? ?extendContract(employee) >> >> Sounds like you might as well write a decorator named @using: >> >> ?@using(employees.select) >> ?def _(employee): >> ? ? if employee.salary > developer.salary: >> ? ? ? ? fireEmployee(employee) >> ? 
? else: >> ? ? ? ? extendContract(employee) > > What would `using` here do that decorating the temp function with > employees.select itself wouldn't do? All you are doing is saying "Pass > this function to this function" which is exactly what decorators > already do. I think the original proposal was implying some kind of loop over the values returned by employees.select(). But it's really irrelevant for the equivalency. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From denis.spir at free.fr Tue Mar 10 19:38:15 2009 From: denis.spir at free.fr (spir) Date: Tue, 10 Mar 2009 19:38:15 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info> <200903100450.56903.steve@pearwood.info> <20090310091936.6e782ca3@o> Message-ID: <20090310193815.454f1975@o> Le Tue, 10 Mar 2009 10:05:30 -0700, Guido van Rossum s'exprima ainsi: > On Tue, Mar 10, 2009 at 1:19 AM, spir wrote: > > I do not agree with that. It is missing the point of lambdas. Lambdas are > > snippets of code equivalent to expressions to be used in place. Lambdas > > are *not* called, need not beeing callable, rather they are *used* by > > higher order functions like map. The fact that they do not have any name > > in syntax thus properly matches their semantic "anonymousity" ;-) > > Eh? On what planet do you live? What use is a lambda if it is never > called? It will *eventually* be called -- if it is never called you > might as well substitute None and your program would run the same. > Should have written: "called in code" (!). Thought it was obvious, sorry. It's executed indeed through the func it is passed to, but not (explicitely) called. The point I meant is: it is locally used -- not called from anywhere else -- the reason why it needs no name. 
denis ------ la vita e estrany From guido at python.org Tue Mar 10 19:47:21 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Mar 2009 11:47:21 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <20090310193815.454f1975@o> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info> <200903100450.56903.steve@pearwood.info> <20090310091936.6e782ca3@o> <20090310193815.454f1975@o> Message-ID: On Tue, Mar 10, 2009 at 11:38 AM, spir wrote: > Le Tue, 10 Mar 2009 10:05:30 -0700, > Guido van Rossum s'exprima ainsi: > >> On Tue, Mar 10, 2009 at 1:19 AM, spir wrote: >> > I do not agree with that. It is missing the point of lambdas. Lambdas are >> > snippets of code equivalent to expressions to be used in place. Lambdas >> > are *not* called, need not beeing callable, rather they are *used* by >> > higher order functions like map. The fact that they do not have any name >> > in syntax thus properly matches their semantic "anonymousity" ;-) >> >> Eh? On what planet do you live? What use is a lambda if it is never >> called? It will *eventually* be called -- if it is never called you >> might as well substitute None and your program would run the same. >> > > Should have written: "called in code" (!). Thought it was obvious, sorry. It's executed indeed through the func it is passed to, but not (explicitely) called. The point I meant is: it is locally used -- not called from anywhere else -- the reason why it needs no name. OK, I understand what you're saying now. But I don't agree with your position that if it isn't used locally it doesn't deserve a name. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Tue Mar 10 20:40:33 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 10 Mar 2009 15:40:33 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <20090310091936.6e782ca3@o> References: <913f9f570903090943t543a2946m3be2dfff9bc65fbe@mail.gmail.com> <200903100403.04733.steve@pearwood.info> <200903100450.56903.steve@pearwood.info> <20090310091936.6e782ca3@o> Message-ID: spir wrote:
> Le Mon, 09 Mar 2009 18:04:32 -0400, Terry Reedy
> s'exprima ainsi:
>
>> Lambdas are function-defining expressions used *within* statements
>> that give the resulting function object a stock .__name__ of
>> '<lambda>'. The syntax could have been augmented to include a real
>> name, so the stock-name anonymity is a side-effect of the chosen
>> syntax. Possibilities include
>> lambda name(args): expression
>> lambda args: expression
>> The latter, assuming it is LL(1) parse-able, would even be compatible
>> with existing code and could still be added.
>>
>> Contrarywise, function-defining def statements could have been
>> allowed to omit the name. To be useful, the object (with a
>> .__name__ such as '', would have to get a default namespace
>> binding such as to '_', even in batch mode.
>
> I do not agree with that. It is missing the point of lambdas. Lambdas
> are snippets of code equivalent to expressions to be used in place.
> Lambdas are *not* called, need not beeing callable, rather they are
> *used* by higher order functions like map. The fact that they do not
> have any name in syntax thus properly matches their semantic
> "anonymousity" ;-)

??? A lambda expression and a def statement both produce function objects. The *only* difference between the objects produced by a lambda expression and the equivalent def statement is that the former gets the stock .__name__ of '<lambda>', which is less useful than a specific name should there be a traceback.
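Terry's point is easy to check directly — the two objects differ only in their `__name__` (and hence in tracebacks):

```python
f = lambda x: x + 1

def g(x):
    return x + 1

# Both are ordinary function objects of the same type...
assert type(f) is type(g)
# ...but only the def carries an informative name for tracebacks.
assert f.__name__ == "<lambda>"
assert g.__name__ == "g"
```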
The important difference between a function-defining expression and statement is that the former can be used in expression context within statements and the latter cannot. tjr From tjreedy at udel.edu Tue Mar 10 21:00:50 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 10 Mar 2009 16:00:50 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <87iqmi5819.fsf@xemacs.org> References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> <87iqmi5819.fsf@xemacs.org> Message-ID: Stephen J. Turnbull wrote:
> Terry Reedy writes:
>
> > Like many other Pythonistas I recognize that that an uninformative stock
> > name of '<lambda>' is defective relative to an informative name that
> > points back to readable code. What I dislike is the anonymity-cult
> > claim that the defect is a virtue.
>
> That's unfair.

It is unfair to dislike false statements? I think that *that* is unfair ;-)

> Python has "anonymous blocks" all over the place,
> since every control structure controls one or more of them. It simply
> requires that they be forgotten at the next DEDENT. Surely you don't
> advocate that each of them should get a name!

Surely, I did not. And surely you cannot really think I suggested such. Every expression and every statement or group of statements defines a function on the current namespaces, but I was talking about Python function objects. And I never said that they necessarily should get an individual name (and indeed I went on to say that I too use 'f' and 'g' as stock, don't-care names) but only that I dislike the silly claim that being named '<lambda>' is a virtue. And this was in the context you snipped of Aahz saying that some disliked the *use* of lambda expressions (as opposed to the promotion of their result as superior).

> I think this is a difference of cognition.
I do not think it a 'difference of cognition', in the usual sense of the term, to think that a more informative traceback is a teeny bit superior, and certainly not inferior, to a less informative traceback. Unless of course you mean that all disagreements are such. Terry Jan Reedy From tjreedy at udel.edu Tue Mar 10 21:15:01 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 10 Mar 2009 16:15:01 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: Guido van Rossum wrote: > On Mon, Mar 9, 2009 at 4:09 PM, Terry Reedy wrote: >> Like many other Pythonistas I recognize that that an uninformative stock >> name of '' is defective relative to an informative name that points >> back to readable code. What I dislike is the anonymity-cult claim that the >> defect is a virtue. >> >> Since I routinely use standard names 'f' and 'g' (from math) to name >> functions whose name I do not care about, I am baffled (and annoyed) by >> (repeated) claims such as "Having to name a one-off function adds additional >> cognitive overload to a developer." (Tav). Golly gee, if one cannot decide >> on standard one-char name, how can he manage the rest of Python? >> >> (I also, like others, routinely use 'C' for class and 'c' for C instance. >> What next? A demand for anonymous classes? Whoops, I just learned that >> Java has those.) >> >> But I have no problem with the use of lambda expressions as a convenience, >> where appropriate. > > Andrew Koening once gave me a good use case where lambdas are really a > lot more convenient than named functions. He was initializing a large > data structure that was used by an interpreter for some language. It > was a single expression (probably a list of tuples or a dict). Each > record contained various bits of information (e.g. 
the operator symbol > and its precedence and associativity) as well as a function (almost > always a very simple lambda) that implemented it. Since this table was > 100s of records long, it would have been pretty inconvenient to first > have to define 100s of small one-line functions and give them names, > only to reference them once in the initializer. Initializing such structures is one of the use cases I intended under 'where appropriate'. Adding more powerful expressions, like comprehensions (and g.e's) that do not break Python's basic syntactic model of mixed expressions and indented statements has added to the convenience. > This use case doesn't have a nice equivalent without anonymous > functions (though I'm sure that if there really was no other way it > could be done, e.g. using registration=style decorators). The convenience is from having function expressions. If the expression syntax allowed the optional attachment of a name, it would be just as convenient. In some cases, I am sure people would find it even more convenient if they could add in a name, especially when there is nothing else in the structure to serve as a substitute. 'Anonymous' is a different concept from 'expression-defined' despite the tendency to conflate the two. Terry Jan Reedy From guido at python.org Tue Mar 10 21:25:33 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Mar 2009 13:25:33 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: On Tue, Mar 10, 2009 at 1:15 PM, Terry Reedy wrote: > Guido van Rossum wrote: >> >> On Mon, Mar 9, 2009 at 4:09 PM, Terry Reedy wrote: >>> >>> Like many other Pythonistas I recognize that that an uninformative stock >>> name of '' is defective relative to an informative name that >>> points >>> back to readable code. ?What I dislike is the anonymity-cult claim that >>> the >>> defect is a virtue. 
>>> >>> Since I routinely use standard names 'f' and 'g' (from math) to name >>> functions whose name I do not care about, I am baffled (and annoyed) by >>> (repeated) claims such as "Having to name a one-off function adds >>> additional >>> cognitive overload to a developer." (Tav). Golly gee, if one cannot >>> decide >>> on a standard one-char name, how can he manage the rest of Python? >>> >>> (I also, like others, routinely use 'C' for class and 'c' for C instance. >>> What next? A demand for anonymous classes? Whoops, I just learned that >>> Java has those.) >>> >>> But I have no problem with the use of lambda expressions as a >>> convenience, >>> where appropriate. >> >> Andrew Koenig once gave me a good use case where lambdas are really a >> lot more convenient than named functions. He was initializing a large >> data structure that was used by an interpreter for some language. It >> was a single expression (probably a list of tuples or a dict). Each >> record contained various bits of information (e.g. the operator symbol >> and its precedence and associativity) as well as a function (almost >> always a very simple lambda) that implemented it. Since this table was >> 100s of records long, it would have been pretty inconvenient to first >> have to define 100s of small one-line functions and give them names, >> only to reference them once in the initializer. > > Initializing such structures is one of the use cases I intended under 'where > appropriate'. Adding more powerful expressions, like comprehensions (and > g.e's) that do not break Python's basic syntactic model of mixed expressions > and indented statements has added to the convenience. > >> This use case doesn't have a nice equivalent without anonymous >> functions (though I'm sure that if there really was no other way it >> could be done, e.g. using registration-style decorators). > > The convenience is from having function expressions.
If the expression > syntax allowed the optional attachment of a name, it would be just as > convenient. In some cases, I am sure people would find it even more > convenient if they could add in a name, especially when there is nothing > else in the structure to serve as a substitute. > > 'Anonymous' is a different concept from 'expression-defined' despite the > tendency to conflate the two. If I read you correctly you're saying that having an expression that returns a function (other than referencing it by name) is not the same as having anonymous functions. This sounds like quite the hairsplitting argument. Why is it important to you to split this particular hair? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Tue Mar 10 21:29:42 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 10 Mar 2009 16:29:42 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea (alternative) In-Reply-To: <73022CA5-ED35-4FF9-A4B8-1013816F5288@gmail.com> References: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> <73022CA5-ED35-4FF9-A4B8-1013816F5288@gmail.com> Message-ID: Leonardo Santagada wrote: > > On Mar 9, 2009, at 9:13 PM, Jan Kanis wrote: > >> def callback(param) as result with do_something(with_our(callback), >> other, args): >> print("called back with "+param) >> return foobar(param) >> >> >> this would be equivalent to >> >> def callback(param): >> print("called back with "+param) >> return foobar(param) >> >> result = do_something(with_our(callback), other_args) > > > Not only the equivalent code looks much cleaner, I completely agree. > the only good thing it > actually does (not having to first define a function to then use it) can > be accomplished with a decorator. If a decolib were ever assembled, a callback(receiving_func, args) would be a good one to include.
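Such a registration-style callback decorator is easy to sketch. The following is a minimal illustration of the idea, not an existing library; the names `callback`, `do_something`, and `handler` are all hypothetical:

```python
# Sketch of a registration-style "callback" decorator: it hands the
# decorated function to a receiving callable right after the def, and
# stashes whatever that call returns on the function object.
# All names here are illustrative, not an existing library API.
def callback(receiving_func, *args):
    def decorator(func):
        func.result = receiving_func(func, *args)
        return func
    return decorator

# A stand-in for the code that consumes the callback.
def do_something(cb, extra):
    return cb("hello") + " / " + extra

@callback(do_something, "other")
def handler(param):
    return "called back with " + param

print(handler.result)  # called back with hello / other
```

This keeps the definition cleanly separate from its use while still stating up front what will be done with the function.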
I think I understand now that one of the reasons to use a decorator is to say what you are going to do with a function before you define it so that the person reading the definition can read it in that light. What I like is that the decorator form still leaves the definition cleanly separate from the context. tjr From tjreedy at udel.edu Tue Mar 10 21:34:06 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 10 Mar 2009 16:34:06 -0400 Subject: [Python-ideas] cd statement? In-Reply-To: <49B67E6C.6020206@molden.no> References: <49B67E6C.6020206@molden.no> Message-ID: Sturla Molden wrote: > > IDLE also lacks a command history. If I e.g. make a typo, why do I have > to copy and paste, instead of just hitting the arrow button? There is already a patch on the tracker to make IDLE's history work the same (with arrow keys) as the command window. It is one of about 70 open issues which are languishing due to the lack of a maintainer/committer. Someone perhaps volunteered a few days ago. From jimjjewett at gmail.com Tue Mar 10 23:05:02 2009 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 10 Mar 2009 18:05:02 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> <87iqmi5819.fsf@xemacs.org> Message-ID: On 3/10/09, Terry Reedy wrote: > ... I dislike the silly claim that > being named '<lambda>' is a virtue. And this was in the context you > snipped of Aahz saying that some disliked the *use* of lambda > expressions (as opposed to the promotion of their result as superior). > Stephen J. Turnbull wrote: >> I think this is a difference of cognition. > I do not think it a 'difference of cognition', in the usual sense of the > term, to think that a more informative traceback is a teeny bit > superior, and certainly not inferior, to a less informative traceback. > Unless of course you mean that all disagreements are such. The question is which traceback will be more informative.
A 50-Meg memory dump will be even more informative, but few people will want to sift through it. I *think* at least some of the lambda lovers are saying something akin to: ''' This little piece of logic isn't worth naming as a section; it is just something that I would do interactively once I got to this point. I don't *want* a debugging pointer right to this line, I *want* to go to the enclosing function to get my bearings. ''' I'm not convinced, because I've seen so many times when a lambda actually is crucial to the bug. That said, I'm the sort of person who will break up and name subunits even if I have to resort to names like fn_XXX_slave_1. And I will certainly admit that there are times when it would be more useful if the traceback showed the source code of the lambda instead of showing or skipping the frame. -jJ From jimjjewett at gmail.com Tue Mar 10 23:22:03 2009 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 10 Mar 2009 18:22:03 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: On 3/10/09, Guido van Rossum wrote: > On Tue, Mar 10, 2009 at 1:15 PM, Terry Reedy wrote: >>> On Mon, Mar 9, 2009 at 4:09 PM, Terry Reedy wrote: >>>> ... What I dislike is the anonymity-cult claim that the defect is a virtue. >> The convenience is from having function expressions. If the expression >> syntax allowed the optional attachment of a name, it would be just as >> convenient. In some cases, I am sure people would find it even more >> convenient if they could add in a name, especially when there is nothing >> else in the structure to serve as a substitute. >> 'Anonymous' is a different concept from 'expression-defined' despite the >> tendency to conflate the two. > If I read you correctly you're saying that having an expression that > returns a function (other than referencing it by name) is not the same > as having anonymous functions.
This sounds like quite the > hairsplitting argument. Why is it important to you to split this > particular hair? An expression that *creates* and returns a function is useful. A way to create unnamed functions may or may not be useful. Right now, the two are tied together, as lambda is the best way to do either. Mentally untangling them might lead to better code. If the name in a def were optional, it would meet the perceived need for anonymity, but still wouldn't meet the need for creating and returning a function within a single expression. # Would this really ever be useful? # Not to me, but the anon-lovers suggest yes. # Cognition difference, or just confounding the two uses of lambda? def (a): return a+3 On the other hand, if def became an expression, it would meet the need for function-creating expressions (and would have at least reduced the need for decorators). add_callback(button1, def add3(a): return a+3) (And yes, I understand that there are reasons why class, def, and import do not return values, even if I sometimes wish they did.) -jJ From guido at python.org Tue Mar 10 23:27:28 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Mar 2009 15:27:28 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: On Tue, Mar 10, 2009 at 3:22 PM, Jim Jewett wrote: > On 3/10/09, Guido van Rossum wrote: >> On Tue, Mar 10, 2009 at 1:15 PM, Terry Reedy wrote: >>>> On Mon, Mar 9, 2009 at 4:09 PM, Terry Reedy wrote: > >>>>> ... What I dislike is the anonymity-cult claim that the defect is a virtue. > >>> The convenience is from having function expressions. If the expression >>> syntax allowed the optional attachment of a name, it would be just as >>> convenient. In some cases, I am sure people would find it even more >>> convenient if they could add in a name, especially when there is nothing >>> else in the structure to serve as a substitute.
> >>> 'Anonymous' is a different concept from 'expression-defined' despite the >>> tendency to conflate the two. > >> If I read you correctly you're saying that having an expression that >> returns a function (other than referencing it by name) is not the same >> as having anonymous functions. This sounds like quite the >> hairsplitting argument. Why is it important to you to split this >> particular hair? > > An expression that *creates* and returns a function is useful. > > A way to create unnamed functions may or may not be useful. > > Right now, the two are tied together, as lambda is the best way to do > either. Mentally untangling them might lead to better code. I'm feeling really dense right now -- I still don't see the difference between the two. Are you saying that you would prefer an expression that creates a *named* function? That seems to be really bizarre -- like claiming that you don't like expressions that return anonymous numbers. > If the name in a def were optional, it would meet the perceived need > for anonymity, but still wouldn't meet the need for creating and > returning a function within a single expression. Moreover, unless you used a decorator, there would be no way to do anything with the anonymous function, so it would be useless. >    # Would this really ever be useful? >    # Not to me, but the anon-lovers suggest yes. >    # Cognition difference, or just confounding the two uses of lambda? >    def (a): return a+3 > > On the other hand, if def became an expression, it would meet the need for function-creating expressions (and would have at least reduced the > need for decorators). I don't see the conceptual difference between a "def-expression" (if it were syntactically possible) and a lambda-expression. What is the difference in your view? Are you sure that difference exists? (It wouldn't be the first time that people ascribe powers to lambda that it doesn't have. :-) >
add_callback(button1, def add3(a): return a+3) Two questions about this example: (1) Do you expect the name 'add3' to be bound in the surrounding scope? (2) What is the purpose of the name other than documenting the obvious? -- --Guido van Rossum (home page: http://www.python.org/~guido/) From dreamingforward at gmail.com Wed Mar 11 00:50:13 2009 From: dreamingforward at gmail.com (average) Date: Tue, 10 Mar 2009 16:50:13 -0700 Subject: [Python-ideas] Python-ideas Digest, Vol 28, Issue 19 In-Reply-To: References: Message-ID: <913f9f570903101650g3d21e500q524312f7d60dd368@mail.gmail.com> From: "Stephen J. Turnbull" Terry Reedy writes: > > Like many other Pythonistas I recognize that an uninformative stock > > name of '<lambda>' is defective relative to an informative name that > > points back to readable code. What I dislike is the anonymity-cult > > claim that the defect is a virtue. > > That's unfair. Python has "anonymous blocks" all over the place, > since every control structure controls one or more of them. It simply > requires that they be forgotten at the next DEDENT. Surely you don't > advocate that each of them should get a name! > > I think this is a difference of cognition. Specifically, people who > don't want to name blocks as functions may not abstract processes to > signatures as easily, and reify whole processes (including all free > identifiers!) as objects more easily, as those who don't think naming > is a problem. > > > Since I routinely use standard names 'f' and 'g' (from math) to name > > functions whose name I do not care about, I am baffled (and annoyed) by... > > If the cognition hypothesis is correct, of course you're baffled. You > "just don't" think that way, while he really does.
The annoyance can > probably be relieved by s/a developer/some developers/ here: I'm glad you're bringing out the cognitive aspect of this, because to me, though it may seem "gratuitously mystical or mystifying", there is an essential epistemological component to this issue related to the [bidirectional] cognitive mapping of and between mathematical <-- and --> psychological identity and the confusion stems from the inability to frame the issue around a single linear "classical" construct as generally imposed by typed text (did you follow that?). Even writing this sentence confounds my multidimensional meaning which I'm trying to compress into standard language constructs. Normally, I'd use hand gestures and inflection to split out the two different orthogonal aspects of what I'm trying to convey which simply cannot occupy the same space at the same time, but (obviously) that luxury is unavailable here. So in the end, unless there's sufficient purchase in the listener, I have to be somewhat content sounding like a moron spouting gibberish. What Tav's proposal, in my mind, is aiming to do is provide greater syntactic support within Python so as to minimize cognitive gibberish when the code is reified in the mind of the viewer. Of course, it doesn't help that we're culturally trained into von Neumann architecture-thinking where such conflation of dimensionality is built into the hardware itself. Really, like Stephen is pointing out, "re-ification" *IS* the best analogy to help elucidate this issue (better in German: Verdinglichung). See wikipedia's "Reification (Marxism)" (--though be prepared that, depending on your state of mind, it will either make sense or sound like its logic is [perfectly] backward, like some flipped bit because it borders that special interplay between subject-object.)
These kinds of [Anonymous] functions/code blocks explicitly tell the user that "This is NOT part of my program", yet (due to the classical, flat nature of standard computer programming) I must "include" (in a constrained way since I'm not able to include the context or externalized identity in which this code will be run) it here [in my editor window text] even though its logical geometry is orthogonal to my program. It's like a vortex out of flatland--an interface into a different dimension, hence its difficulty in explaining it to the natives of flatlandia. To put a name on it puts an identity label upon something pointing in the wrong direction (i.e. to the surrounding code) which isn't *meant* to be an independent block of usable code or be part of the social context of its surroundings. It's like seeing your own body's innards mapped inside-out into a computer program and calling it "marcos" while I continue to function normally in some other dimensionality in some mysterious way to magically maintain my normal cognition elsewhere. Better to see those innards as anonymous data (that for whatever reason I'm needing to interface to) even though they are perfectly functioning blocks with an identity elsewhere (i.e.: me). So, yes, "anonymity" can be a virtue from a given perspective. ...Seems to be a parallel to meta-programming but on the other side of the scale--instead of abstracting "upwards" into greater levels of abstraction, it abstracts sideways and downwards into levels of concreteness. Naming in both cases is problematic if you want to avoid the categorical errors easily made by the flatland of the typed text. gibberish?
marcos From dreamingforward at gmail.com Wed Mar 11 01:01:41 2009 From: dreamingforward at gmail.com (average) Date: Tue, 10 Mar 2009 17:01:41 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea Message-ID: <913f9f570903101701m344c702cu317460fe42df6950@mail.gmail.com> FW: Sorry, forgot to change the subject line for the sake of threaded mail readers and archives....
marcos From jan.kanis at phil.uu.nl Wed Mar 11 01:11:34 2009 From: jan.kanis at phil.uu.nl (Jan Kanis) Date: Wed, 11 Mar 2009 01:11:34 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea (alternative) In-Reply-To: <20090310095822.0f957a4d@o> References: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> <20090310095822.0f957a4d@o> Message-ID: <59a221a0903101711o5fb1e32ey4684d8d2f2325902@mail.gmail.com> On Tue, Mar 10, 2009 at 09:58, spir wrote: > Le Tue, 10 Mar 2009 01:13:42 +0100, > Jan Kanis s'exprima ainsi: > >> This being python-ideas, I'll also have a go at it. >> >> Being someone who does like functional programming when used in limited >> quantities, I also think multi line lambdas (or blocks, whatever you >> call them) are a good thing if a good way could be found to embed them >> into Python. But I don't like the part of tav's proposal of handling >> them with a magic __do__ function. So what about this slightly >> modified syntax and semantics: >> >> def NAME(ARGS) [as OTHERNAME] with EXPRESSION_CONTAINING_NAME: >>     BODY >> >> eg: >> >> def callback(param) as result with do_something(with_our(callback), >> other, args): >>     print("called back with "+param) >>     return foobar(param) > > I like this proposal much more than all previous ones. > Still, how would you (or anybody else) introduce the purpose, meaning, use of this construct, and its language-level semantics? [This is not disguised criticism, nor a rhetorical question: I'm really interested in answers.] From my side, this is more of an idea I had than something I've thought all the way through, so while I think it looks nice it's not something I'm 100% committed to. It sprung up in my mind as a slightly better alternative to tav's blocks proposal. The language-level semantics are easily explained by showing the equivalent code I gave in my first mail. The purpose and use (and meaning?)
are the same as with tav's original proposal for blocks, and with those of multistatement lambdas. So I guess a major consideration is whether someone thinks multiline lambdas are in principle a good idea (setting aside that they can't be implemented in full generality non-ugly in the Python grammar). On Tue, Mar 10, 2009 at 17:33, Guido van Rossum wrote: > ... to accomplish something you can already > do just as easily with a decorator. Almost the same thing could be done with decorators, the difference being that the final name the result is assigned to occurs in a weird place, and that decorators are restricted to one-argument functions. The first could be solved by allowing an 'as name' clause on decorators (like Carl Johnson is proposing in the other thread), the second by e.g. allowing lambdas as decorators so one could do @lambda f: do_something(with_our(f)) def result(param): print("called back with "+param) return foobar(param) But I don't really have a specific actual use case for this, so I think I'll just have to go with Guido's gut feeling[1]; they seem to work out, usually. [1] http://www.python.org/dev/peps/pep-0318/#id32 From jimjjewett at gmail.com Wed Mar 11 03:44:43 2009 From: jimjjewett at gmail.com (Jim Jewett) Date: Tue, 10 Mar 2009 22:44:43 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: On 3/10/09, Guido van Rossum wrote: > On Tue, Mar 10, 2009 at 3:22 PM, Jim Jewett wrote: >> On 3/10/09, Guido van Rossum wrote: >>> On Tue, Mar 10, 2009 at 1:15 PM, Terry Reedy wrote: >>>> 'Anonymous' is a different concept from 'expression-defined' despite >>>> the tendency to conflate the two. >>> ... Why is it important to you to split this particular hair? >> An expression that *creates* and returns a function is useful. >> A way to create unnamed functions may or may not be useful.
>> Right now, the two are tied together, as lambda is the best way to do >> either. Mentally untangling them might lead to better code. > I'm feeling really dense right now -- I still don't see the difference > between the two. Are you saying that you would prefer an expression > that creates a *named* function? Yes. The __name__ attribute might never be used, but I personally still prefer that it be meaningful. When someone says they need anonymous functions, I hear: "I really, really *need* the __name__ to be useless!" In the past, I had sometimes just assumed they were wrong. Stephen has given me a glimpse of a mindset which really might need the name to be useless. But neither he nor I normally think that way, so it might still be YAGNI. Terry has pointed out that they may actually mean "I need def to be an expression", with the reference to anonymity being a red herring, because the two concepts are currently confounded. >> If the name in a def were optional, it would meet the perceived need >> for anonymity, but still wouldn't meet the need for creating and >> returning a function within a single expression. > Moreover, unless you used a decorator, there would be no way to do > anything with the anonymous function, so it would be useless. Thus my troubles seeing the point of the people who care about "anonymous functions." >> On the other hand, if def became an expression, it would meet the need >> for function-creating expressions (and would have at least reduced the >> need for decorators). > I don't see the conceptual difference between a "def-expression" (if > it were syntactically possible) and a lambda-expression. What is the > difference in your view? The only differences *I* see are syntactical warts in lambda. That said, I may be missing something myself, as lambda has had passionate defenders. >> add_callback(button1, def add3(a): return a+3) > (1) Do you expect the name 'add3' to be bound in the surrounding scope? No. 
But I agree that expectations would differ, so that either would be acceptable, but either would feel like a wart to at least some people. > (2) What is the purpose of the name other than documenting the obvious? It isn't always quite this obvious. It is more useful in tracebacks. (Though perhaps printing source instead of name would be even better, for functions short enough to be reasonable expressions.) The __name__ is available in case you want to use it for a dispatch table, or to populate fields in an alternative User Interface. (For example, accessibility APIs) -jJ From guido at python.org Wed Mar 11 03:52:26 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 10 Mar 2009 19:52:26 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: On Tue, Mar 10, 2009 at 7:44 PM, Jim Jewett wrote: > When someone says they need anonymous functions, I hear: > >    "I really, really *need* the __name__ to be useless!" I think you need your hearing tuned. By convention anonymous functions and expressions creating functions are almost always synonymous, in almost all languages. So people use the shorter term "anonymous functions" when what they really care about is "expression syntax for creating new functions on the fly". >> I don't see the conceptual difference between a "def-expression" (if >> it were syntactically possible) and a lambda-expression. What is the >> difference in your view? > > The only differences *I* see are syntactical warts in lambda. Well, those have been discussed at length and depth, and nobody has come up with an acceptable syntax to embed a block of statements in the midst of an expression, in Python. That's why they are separate. > The __name__ is available in case you want to use it for a dispatch > table, or to populate fields in an alternative User Interface.
(For > example, accessibility APIs) When people want the name, they can give it a name using a def statement. I don't accept your argument against that which seems to go along the lines of "but maybe they might want the name later". You can write unreadable code without lambda too. And yes, lambda can be abused; for a really evil example see this recipe: http://code.activestate.com/recipes/148061/ (note the mis-use of the term "one liner" :-). -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tav at espians.com Wed Mar 11 04:34:57 2009 From: tav at espians.com (tav) Date: Wed, 11 Mar 2009 03:34:57 +0000 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <20090309154505.GA18115@panix.com> Message-ID: Dearest all, Forgive me if I seem frustrated, but it's been a very long day. After catching up on a lot of the discussion here, I am feeling something between astonishment and disappointment. Forgive me as I am new here, but I had expected python-ideas to be filled with remarkable people. Instead I find that many people don't even understand *why* Python is the way it is!! Never mind understanding what it is that they are talking about. It seems as if jumping in with an opinion counts for more than doing some basic research. Forgive me, but the likes of equating expressions with statements or the desire for anonymous functions with some bizarre dislike of the __name__ attribute just shouts out sheer ignorance/stupidity. I had expected more. From everyone. As for the various lambda proposals I've seen, none of them are anywhere near Pythonic!! Some of you, could you please google "lambda site:mail.python.org" and then read for a little while? For some bizarre reason, I had expected those on this list to be Masters of Python. If not why would they care to improve the language? But some of you clearly are not and should spend a lot of time *reading*.
And, please, get hold of as many Python libraries as you can and read the code until you feel you truly do get the essence of what is Pythonic. I am really sorry if I've offended anyone. Frustrations aside, I do mean this in a constructive way. If we all individually would apply a little bit more effort, then we'd collectively benefit and have a much better language. Thanks for bearing with me. -- love, tav plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 http://tav.espians.com | http://twitter.com/tav | skype:tavespian From stephen at xemacs.org Wed Mar 11 04:43:29 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 11 Mar 2009 12:43:29 +0900 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> <87iqmi5819.fsf@xemacs.org> Message-ID: <871vt467xa.fsf@xemacs.org> Terry Reedy writes: > Stephen J. Turnbull wrote: > > Terry Reedy writes: > > > Like many other Pythonistas I recognize that an > > > uninformative stock name of '<lambda>' is defective relative > > > to an informative name that points back to readable code. > > > What I dislike is the anonymity-cult claim that the defect is > > > a virtue. > > > > That's unfair. > > It is unfair to dislike false statements? No. It is unfair to use factives: the proponents of code blocks don't claim that the uninformative name is a virtue. They claim that the effort required to deal with an unnecessary name is a defect, and so removing that effort is a virtue. You evidently have no answer for that argument, so you reinterpret in precisely the kind of word-twisting way that bothers you when I do it: > > Python has "anonymous blocks" all over the place, since every > > control structure controls one or more of them. It simply > > requires that they be forgotten at the next DEDENT. Surely you > > don't advocate that each of them should get a name! > > Surely, I did not.
And surely you cannot really think I suggested such. That was a rhetorical question, properly marked as such with a exclamation point rather than a question mark. Not to mention being immediately preceded by the obviously correct rationale for the obviously correct answer. How did you miss it? I did have a real point, which was that if the only use of an anonymous block is immediately juxtaposed to that DEDENT, is there really any harm to the lack of the name? In fact, in debugging you have the name of the using function (which should be short and readable, or you're not going to have fun anyway), and a line number, so there is no trouble identifying the problematic code, nor the execution history that led to it. A good debugger might even provide the arguments to the code block as part of the stack trace, which you would have to go to extra effort to get if it were presented merely as a suite. Ie, in this kind of use case a code block could be considered a kind of meta-syntax that tells debuggers "these variables are of interest, so present me, and them, as a pseudo-stack frame". The sticking point, AIUI, is that the code block proponents have not identified a use case where the anonymous block is immediately used, while there is no Pythonic equivalent. So that "harmless" (YMMV) extension is unnecessary, and the extension violates TOOWTDI if that's all it's good for. The real power of "code blocks" (that Python doesn't have) comes when they can be passed around as objects ... but there doesn't seem to be a way to define them so as to exclude the obnoxious uses such as in callbacks, or indeed such a use case other than callbacks. And that violates the Pythonista's sense of good style. So what I want to know (and my question is directed to the code block proponents, not to you) is where is the Pythonic use case? All those Ruby programmers can't be wrong ... can they? Of course they can! 
But even so, I'd like to understand what the code block proponents think they're seeing that isn't there, at least not for you and me. *We* (including the BDFL!?) could be wrong, too, maybe there is something special about code blocks that Python could benefit from incorporating. Or maybe there's a better way to teach Python, so that people will use Pythonic idioms instead of reaching for "code blocks." From ben+python at benfinney.id.au Wed Mar 11 04:52:42 2009 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 11 Mar 2009 14:52:42 +1100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea References: <20090309154505.GA18115@panix.com> Message-ID: <87sklkzpf9.fsf@benfinney.id.au> tav writes: > After catching up on a lot of the discussion here, I am feeling > something between astonishment and disappointment. Forgive me as I > am new here, but I had expected python-ideas to be filled with > remarkable people. I'm glad your expectations have been re-adjusted early on. I don't know what would have led you to such an expectation. > Instead I find that many people don't even understand *why* Python > is the way it is!! That's because there are many people who have yet to find these things out. Such people are not excluded from this list. > Never mind understanding what it is that they are talking about. It > seems as if jumping in with an opinion counts for more than doing > some basic research. Again, that state may be unfortunate, but I don't know why you would have expectations that it would be otherwise. > For some bizarre reason, I had expected those on this list to be > Masters of Python. That is rather bizarre. If you find something that led you to expect that, please let us know so the fallacy can be corrected. > If not why would they care to improve the language? Because they are the Users of Python who are interested in improving the language. 
-- \ “During my service in the United States Congress, I took the | `\ initiative in creating the Internet.” --Al Gore | _o__) | Ben Finney From tav at espians.com Wed Mar 11 06:08:30 2009 From: tav at espians.com (tav) Date: Wed, 11 Mar 2009 05:08:30 +0000 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <87sklkzpf9.fsf@benfinney.id.au> References: <87sklkzpf9.fsf@benfinney.id.au> Message-ID: Hey Ben and others, Sorry if my previous message came across as being rude. It really wasn't meant so. And it definitely wasn't meant as a personal attack on anyone. I just wish that a little bit of time would be taken before jumping in with comments. As for Python-ideas, I had taken as granted that people posting would have more than a passing interest in language design and the nature of Python. Obviously a false assumption. I apologise for this. I think there is a *lot* of value in many of the ideas that float around. On this list, python-dev, irc channels and even in the blogosphere. The problem is that few of these ideas get the real time and attention that they deserve. And seeing as all of our time is limited. And that the resources for the development of Python itself is limited. Ideally we would focus them more constructively and selectively. But then, this is the internet -- I should stop being an idealist ;p Thanks again for bearing with me. -- love, tav plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 http://tav.espians.com | http://twitter.com/tav | skype:tavespian From stephen at xemacs.org Wed Mar 11 06:05:28 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 11 Mar 2009 14:05:28 +0900 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> Message-ID: <87zlfs4pk7.fsf@xemacs.org> Guido van Rossum writes: > I'm feeling really dense right now -- I still don't see the difference
Are you saying that you would prefer an expression > that creates a *named* function? That seems to be really bizarre -- > like claiming that you don't like expressions that return anonymous > numbers. Here's a use-case from Emacs. Various modes have callbacks so that users can customize them. A typical case is that a text-mode hook will turn on auto-fill-mode, which is documented as an example to be done like this: (add-hook 'text-mode-hook (lambda () (auto-fill-mode 1))) where *-mode functions called with nil toggle the mode, positive numbers turn it on, and non-positive numbers turn it off. The add-hook function is supposed to be idempotent: it won't add the same hook function if it is already in the hook. The problem is if you change the lambda and execute that form, the changed lambda is now not identical to the lambda on the hook, so the old version won't be removed. This (add-hook 'text-mode-hook (defun turn-on-auto-fill () (auto-fill-mode 1))) neatly avoids the problem by returning the name of the function, the symbol `turn-on-auto-fill', which is callable and so suitable for hanging on the hook. If you change the definition and execute the above form, add-hook *mutates nothing* (the symbol is already present), but because the hook is indirect through the function's symbol and the defun *is* executed, the definition changes ... which is exactly what you want.[1] AMK's use-case could be post-processed as something like (let ((i 0)) (mapcar (lambda () (let ((name (intern (format "foo-%d-callback" i)))) (define-function name (aref slots i)) (aset slots i name) (setq i (1+ i)) name)) slots)) where slots is the vector of anonymous functions. Providing names in this way costs one indirection per callback invocation in Emacs Lisp, but the benefits in readability of tracebacks are large, especially for compiled code. > I don't see the conceptual difference between a "def-expression" (if > it were syntactically possible) and a lambda-expression. 
What is the > difference in your view? Are you sure that difference exists? (It > wouldn't be the first time that people ascribe powers to lambda that > it doesn't have. :-) AIUI, a def-expression binds a callable to an object, while a lambda expression returns a callable. An anonymous def is just lambda by a different name (and I think the code block proponents agree, based on their willingness to accept syntax using def instead of lambda). I don't see how the kind of thing exemplified above would be useful in Python, and from a parallel reply I just saw, I gather Jim agrees. The point is to show how a function-defining expression can be useful in some contexts. This works in Emacs Lisp because in (setq foo (lambda ...)) tools (including the Lisp interpreter) will not recognize foo as a function identifier although its value is a function, while (defun foo ...) marks foo as a function identifier. But in Python (like Scheme) they're basically the same operation, with a little syntactic sugar. Anything that is based on the separation of variable namespace from function namespace is DOA, right? The renaming mapper is a different issue, I think; it depends on computing object names at runtime (ie, the Lisp `intern' operation), not on separate namespaces. I'm not sure offhand how to do that in Python, or even it it's possible; I've never wanted it. Footnotes: [1] N.B. Of course modern Emacsen define turn-on-auto-fill as a standard function. But this is ugly (because of the single flat namespace of Emacs Lisp), and not all modes have their turn-on, turn-off variants. auto-fill-mode was chosen because (a) the semantics are easy to imagine and (b) the use of lambda in a hook is explained by exactly this example in the Emacs Lisp Manual. From tav at espians.com Wed Mar 11 06:13:11 2009 From: tav at espians.com (tav) Date: Wed, 11 Mar 2009 05:13:11 +0000 Subject: [Python-ideas] cd statement? 
In-Reply-To: <49B69558.3090000@molden.no> References: <49B67E6C.6020206@molden.no> <9bfc700a0903100917w863e358l7bccf5513283d451@mail.gmail.com> <49B69558.3090000@molden.no> Message-ID: Hey Sturla, >> Have you tried IPython? > > Yes, it has all that I miss, but it's ugly (at least on Windows, where it > runs in a DOS shell). Have you tried running it with http://sourceforge.net/projects/console/ ? I found that to be a lot prettier than the standard DOS prompt. And IPython really is great -- it increases your productivity in Python dramatically. Especially with its ? and ?? commands. I would heartily recommend it. -- love, tav plex:espians/tav | tav at espians.com | +44 (0) 7809 569 369 http://tav.espians.com | http://twitter.com/tav | skype:tavespian From ben+python at benfinney.id.au Wed Mar 11 06:29:05 2009 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 11 Mar 2009 16:29:05 +1100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea References: <20090309154505.GA18115@panix.com> <87sklkzpf9.fsf@benfinney.id.au> Message-ID: <87ocw8zkym.fsf@benfinney.id.au> Ben Finney writes: > tav writes: > > For some bizarre reason, I had expected those on this list to be > > Masters of Python. > > That is rather bizarre. If you find something that led you to expect > that, please let us know so the fallacy can be corrected. My message had rather an imperious tone that was not intended. I hasten to note that I'm not claiming any special status for myself with regard to this list, or Python's community. I'm a mere interested party, and my response was not intended to speak authoritatively about How Things Are™. My apologies for any mistaken impressions I might have given. -- \ “Every valuable human being must be a radical and a rebel, for | `\ what he must aim at is to make things better than they are.”
| _o__) --Niels Bohr | Ben Finney From leif.walsh at gmail.com Wed Mar 11 08:25:16 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Wed, 11 Mar 2009 03:25:16 -0400 (EDT) Subject: [Python-ideas] Ruby-style Blocks in Python Idea [with examples] In-Reply-To: Message-ID: Apart from the fact that I don't think blocks are needed for any of the above, I feel compelled to poke a hole in one of your examples: On Mon, Mar 9, 2009 at 2:47 PM, tav wrote: > # Django/App Engine Query > > Frameworks like Django or App Engine define DSLs to enable easy > querying of datastores by users. Wouldn't it be better if this could be > done in pure Python syntax? > > Compare the current Django: > > q = Entry.objects.filter(headline__startswith="What").filter(pub_date__lte=datetime.now()) > > with a hypothetical: > > using Entry.filter do (entry): > if entry.headline.startswith('What') and entry.pub_date <= datetime.now(): > return entry > > Wouldn't the latter be easier for a developer to read/maintain? Probably, but it doesn't matter, since Django lazily evaluates chained queries, and doesn't actually evaluate anything until you tell it you want elements. I don't believe there's any way to do this with blocks. -- Cheers, Leif -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: OpenPGP digital signature URL: From cmjohnson.mailinglist at gmail.com Wed Mar 11 08:36:29 2009 From: cmjohnson.mailinglist at gmail.com (Carl Johnson) Date: Tue, 10 Mar 2009 21:36:29 -1000 Subject: [Python-ideas] [Python-Dev] Deprecated __cmp__ and total ordering In-Reply-To: References: <49B671BB.1030207@voidspace.org.uk> Message-ID: <3bdda690903110036q34529d25va8671874bab9cbb@mail.gmail.com> > The basic idea isn't controversial, but there probably would > be a lengthy discussion on what to call it (total_ordering is one > possibility) and where to put it (functools is a possibility).
It's not really a *func* tool though. Maybe there should be a classtools module? Can anyone think of other things to put into such a module if it were to exist? If nothing else, the existing classmethod, etc. family could be mirrored out there. -- Carl Johnson From denis.spir at free.fr Wed Mar 11 12:28:28 2009 From: denis.spir at free.fr (spir) Date: Wed, 11 Mar 2009 12:28:28 +0100 Subject: [Python-ideas] Python-ideas Digest, Vol 28, Issue 19 In-Reply-To: <913f9f570903101650g3d21e500q524312f7d60dd368@mail.gmail.com> References: <913f9f570903101650g3d21e500q524312f7d60dd368@mail.gmail.com> Message-ID: <20090311122828.37e871ac@o> On Tue, 10 Mar 2009 16:50:13 -0700, average wrote: > What Tav's proposal, in my mind, is aiming to do is provide greater > syntactic support within Python so as to minimize cognitive gibberish > when the code is reified in the mind of the viewer. Of course, it > doesn't help that we're culturally trained into von Neumann > architecture-thinking where such conflation of dimensionality is built > into the hardware itself. Really, like Stephan is pointing out, > "re-ification" *IS* the best analogy to help elucidate this issue > (better in German: Verdinglichung). See wikipedia's "Reification > (Marxism)" (--though be prepared that, depending on your state of > mind, it will either make sense or sound like its logic is [perfectly] > backward, like some flipped bit because it borders that special > interplay between subject-object.) > > These kinds of [Anonymous] functions/code blocks explicitly tell the > user that "This is NOT part of my program", yet (due to the classical, > flat nature of standard computer programming) I must "include" (in a > constrained way since I'm not able to include the context or > externalized identity in which this code will be run) it here [in my > editor window text] even though its logical geometry is orthogonal to > my program.
It's like a vortex out of flatland--an interface into a > different dimension, hence it's difficulty in explaining it to the > natives of flatlandia. To put a name on it puts an identity label > upon something pointing in the wrong direction (i.e. to the > surrounding code) which isn't *meant* to be an an independent block of > usable code or be part of the social context of its surroundings. > It's like seeing your own body's innards mapped inside-out into a > computer program and calling it "marcos" while I continue to function > normally in some other dimensionality in some mysterious way to > magically maintain my normal cognition elsewhere. Better to see those > innards as anonymous data (that for whatever reason I'm needing to > interface to) even though they are perfectly functioning blocks with > an identity elsewhere (i.e.: me). So, yes, "anonymity" can be a > virtue from a given perspective. > > ...Seems to be a parallel to meta-programming [...] Indeed. In the concatenative jargon such code-data constructs are called "quotations". :squares [dup *] map ... [1 2 3] squares yields [1 4 9] [dup *] (dup means duplicate) is sensibly called a quotation, I guess, by analogy to meta-linguistic expressions that "objectify" a snippet of speech. [dup *] holds the literal expression of a valid func def, as illustrated by: :square dup * It is pushed on the data stack that should already hold a sequence ([1 2 3]), then both are data items used by map. Read: the higher order func map takes a func def and a sequence as arguments. Now this is alien (fremd ;-) to anybody used to languages in which code is not, conceptually, *really* data -- even if it has a type and can be denoted, in python, simply by letting down the (). 
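To ground the analogy in Python terms (a sketch of mine, not from any library; the names `square`/`squares` are illustrative): the function object itself plays the role of the quotation, and map consumes it as ordinary data.

```python
# Python sketch of the concatenative "quotation" idiom described above.
# The quotation [dup *] corresponds to a function object passed as data.

def square(x):
    # ":square dup *" -- duplicate the value on the stack, then multiply
    return x * x

# Writing `square` without parentheses passes the code itself around as
# data, much like pushing [dup *] on the stack before calling map.
print(list(map(square, [1, 2, 3])))           # [1, 4, 9]

# The anonymous form is closer in spirit to an unnamed quotation:
print(list(map(lambda x: x * x, [1, 2, 3])))  # [1, 4, 9]
```

In both cases the "code" is an ordinary value until map applies it, which is the point being made about code-as-data.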
denis ------ la vita e estrany From steve at pearwood.info Wed Mar 11 13:03:58 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 11 Mar 2009 23:03:58 +1100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <913f9f570903101701m344c702cu317460fe42df6950@mail.gmail.com> References: <913f9f570903101701m344c702cu317460fe42df6950@mail.gmail.com> Message-ID: <200903112303.58784.steve@pearwood.info> On Wed, 11 Mar 2009 11:01:41 am average wrote: > These kind of [Anonymous] functions/code blocks explicitly tell the > user that "This is NOT part of my program" I'm a user, and they don't tell me any such thing. By the way, that word, "explicitly" -- I think you are using "explicitly" as if it actually meant "implicitly". If the function were to explicitly tell the user, it would look something like this: caller( function_that_is_not_part_of_the_program x: x+1 ) which of course is ridiculously verbose. What you say might be true if lambda had that extended meaning, but it doesn't. In Python, lambda merely means "create a nameless function object from a single expression". It's also nonsense, because there the function is, inside the program as clear as day. Arguing that a function that appears inside your program is not part of your program is rather like saying that your stomach is not part of your body. Perhaps what you are getting at is that anonymous functions blur the difference between code and data, and that if you consider them as data, then they are outside the program in some sense? That's not entirely unreasonable: it's common to distinguish between code and data, even though fundamentally they're all just bits and, really, the distinction is all in our mind. (That doesn't mean the distinction isn't important.) But even accepting that an anonymous function used as data is outside of the program in some sense, anonymity is strictly irrelevant. 
If you consider the lambda function here: caller(lambda x: x+1) to be data, then so is the named function foo here: def foo(x): return x+1 caller(foo) [...] > gibberish? I'm trying to give you the benefit of the doubt. -- Steven D'Aprano From sturla at molden.no Wed Mar 11 13:26:51 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 11 Mar 2009 13:26:51 +0100 Subject: [Python-ideas] math module and complex numbers Message-ID: <49B7AE0B.5040004@molden.no> >>> import math >>> math.sqrt(-1) Traceback (most recent call last): File "", line 1, in math.sqrt(-1) ValueError: math domain error I'd say math.sqrt(-1) should return 1j. Sturla Molden From veloso at verylowsodium.com Wed Mar 11 13:34:48 2009 From: veloso at verylowsodium.com (Greg Falcon) Date: Wed, 11 Mar 2009 08:34:48 -0400 Subject: [Python-ideas] math module and complex numbers In-Reply-To: <49B7AE0B.5040004@molden.no> References: <49B7AE0B.5040004@molden.no> Message-ID: <3cdcefb80903110534p65d2a024n9af6b6621e78e0fc@mail.gmail.com> On Wed, Mar 11, 2009 at 8:26 AM, Sturla Molden wrote: >>>> import math >>>> math.sqrt(-1) > > Traceback (most recent call last): > ?File "", line 1, in > ? math.sqrt(-1) > ValueError: math domain error > > > I'd say math.sqrt(-1) should return 1j. >>> import cmath >>> cmath.sqrt(-1) 1j Greg F From sturla at molden.no Wed Mar 11 14:12:10 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 11 Mar 2009 14:12:10 +0100 Subject: [Python-ideas] math module and complex numbers In-Reply-To: <3cdcefb80903110534p65d2a024n9af6b6621e78e0fc@mail.gmail.com> References: <49B7AE0B.5040004@molden.no> <3cdcefb80903110534p65d2a024n9af6b6621e78e0fc@mail.gmail.com> Message-ID: <49B7B8AA.9050305@molden.no> Greg Falcon wrote: >>>> import cmath >>>> cmath.sqrt(-1) >>>> > 1j > What is the point of having two math modules? It just adds confusion, like pickle and cPickle. S.M. 
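For concreteness, the split behaviour being questioned looks like this (a small demonstration, not an argument either way):

```python
# The difference between the two modules, side by side.
import math
import cmath

try:
    math.sqrt(-1)
except ValueError as exc:
    # math stays in the reals and signals the domain error early
    print("math.sqrt(-1) raised ValueError:", exc)

# cmath is the opt-in: callers who want complex results ask for them
print("cmath.sqrt(-1) =", cmath.sqrt(-1))  # 1j
```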
From phd at phd.pp.ru Wed Mar 11 14:31:30 2009 From: phd at phd.pp.ru (Oleg Broytmann) Date: Wed, 11 Mar 2009 16:31:30 +0300 Subject: [Python-ideas] math module and complex numbers In-Reply-To: <49B7B8AA.9050305@molden.no> References: <49B7AE0B.5040004@molden.no> <3cdcefb80903110534p65d2a024n9af6b6621e78e0fc@mail.gmail.com> <49B7B8AA.9050305@molden.no> Message-ID: <20090311133130.GA21137@phd.pp.ru> On Wed, Mar 11, 2009 at 02:12:10PM +0100, Sturla Molden wrote: > Greg Falcon wrote: >>>>> import cmath >>>>> cmath.sqrt(-1) >>>>> >> 1j >> > > What is the point of having two math modules? It just adds confusion, > like pickle and cPickle. 'c' in 'cmath' stands for 'complex'. There is a difference between float and complex math. See http://docs.python.org/library/math.html : "These functions cannot be used with complex numbers; use the functions of the same name from the cmath module if you require support for complex numbers. The distinction between functions which support complex numbers and those which don't is made since most users do not want to learn quite as much mathematics as required to understand complex numbers. Receiving an exception instead of a complex result allows earlier detection" Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From guido at python.org Wed Mar 11 15:26:29 2009 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Mar 2009 07:26:29 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <87zlfs4pk7.fsf@xemacs.org> References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> <87zlfs4pk7.fsf@xemacs.org> Message-ID: On Tue, Mar 10, 2009 at 10:05 PM, Stephen J. Turnbull wrote: > Guido van Rossum writes: > > ?> I'm feeling really dense right now -- I still don't see the difference > ?> between the two. Are you saying that you would prefer an expression > ?> that creates a *named* function? 
That seems to be really bizarre -- > ?> like claiming that you don't like expressions that return anonymous > ?> numbers. > > Here's a use-case from Emacs. ?Various modes have callbacks so that > users can customize them. ?A typical case is that a text-mode hook > will turn on auto-fill-mode, which is documented as an example to be > done like this: > > (add-hook 'text-mode-hook (lambda () (auto-fill-mode 1))) > > where *-mode functions called with nil toggle the mode, positive > numbers turn it on, and non-positive numbers turn it off. ?The > add-hook function is supposed to be idempotent: it won't add the same > hook function if it is already in the hook. ?The problem is if you > change the lambda and execute that form, the changed lambda is now not > identical to the lambda on the hook, so the old version won't be > removed. ?This > > (add-hook 'text-mode-hook (defun turn-on-auto-fill () (auto-fill-mode 1))) > > neatly avoids the problem by returning the name of the function, the > symbol `turn-on-auto-fill', which is callable and so suitable for > hanging on the hook. ?If you change the definition and execute the > above form, add-hook *mutates nothing* (the symbol is already > present), but because the hook is indirect through the function's > symbol and the defun *is* executed, the definition changes ... which > is exactly what you want.[1] Got it -- sort of like using assignment in an expression in C, to set a variable and return the valule (but not quite, don't worry :-). > AMK's use-case could be post-processed as something like I assume you're talking about Andrew Koenig's use case -- ANK is Andrew Kuchling, who AFAIK didn't participate in this thread. :-) > (let ((i 0)) > ?(mapcar (lambda () > ? ? ? ? ? ?(let ((name (intern (format "foo-%d-callback" i)))) > ? ? ? ? ? ? ?(define-function name (aref slots i)) > ? ? ? ? ? ? ?(aset slots i name) > ? ? ? ? ? ? ?(setq i (1+ i)) > ? ? ? ? ? ? ?name)) > ? ? ? ? 
?slots)) IIUC (my Lisp is very rusty) this just assigns unique names to the functions right? You're saying this to satisfy the people who insist that __name__ is always useful right? But it seems to be marginally useful here since the names don't occur in the source. (?) > where slots is the vector of anonymous functions. ?Providing names in > this way costs one indirection per callback invocation in Emacs Lisp, > but the benefits in readability of tracebacks are large, especially > for compiled code. > > ?> I don't see the conceptual difference between a "def-expression" (if > ?> it were syntactically possible) and a lambda-expression. What is the > ?> difference in your view? Are you sure that difference exists? (It > ?> wouldn't be the first time that people ascribe powers to lambda that > ?> it doesn't have. :-) > > AIUI, a def-expression binds a callable to an object, while a lambda > expression returns a callable. ?An anonymous def is just lambda by a > different name (and I think the code block proponents agree, based on > their willingness to accept syntax using def instead of lambda). > > I don't see how the kind of thing exemplified above would be useful in > Python, and from a parallel reply I just saw, I gather Jim agrees. > The point is to show how a function-defining expression can be useful > in some contexts. ?This works in Emacs Lisp because in > > (setq foo (lambda ...)) > > tools (including the Lisp interpreter) will not recognize foo as a > function identifier although its value is a function, while > > (defun foo ...) > > marks foo as a function identifier. ?But in Python (like Scheme) > they're basically the same operation, with a little syntactic sugar. > Anything that is based on the separation of variable namespace from > function namespace is DOA, right? Right, and so are separations between value and type namespaces (as other languages use, e.g. C++ and Haskell). 
> The renaming mapper is a different issue, I think; it depends on > computing object names at runtime (ie, the Lisp `intern' operation), > not on separate namespaces. ?I'm not sure offhand how to do that in > Python, or even it it's possible; I've never wanted it. You can assign new values to to f.__name__. > Footnotes: > [1] ?N.B. ?Of course modern Emacsen define turn-on-auto-fill as a > standard function. ?But this is ugly (because of the single flat > namespace of Emacs Lisp), and not all modes have their turn-on, > turn-off variants. ?auto-fill-mode was chosen because (a) the > semantics are easy to imagine and (b) the use of lambda in a hook is > explained by exactly this example in the Emacs Lisp Manual. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Wed Mar 11 15:28:22 2009 From: guido at python.org (Guido van Rossum) Date: Wed, 11 Mar 2009 07:28:22 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: <87ocw8zkym.fsf@benfinney.id.au> References: <87sklkzpf9.fsf@benfinney.id.au> <87ocw8zkym.fsf@benfinney.id.au> Message-ID: On Tue, Mar 10, 2009 at 10:29 PM, Ben Finney wrote: > Ben Finney writes: > >> tav writes: >> > For some bizarre reason, I had expected those on this list to be >> > Masters of Python. >> >> That is rather bizarre. If you find something that led you to expect >> that, please let us know so the fallacy can be corrected. > > My message had rather an imperious tone that was not intended. > > I hasten to note that I'm not claiming any special status for myself > with regard to this list, or Python's community. I'm a mere interested > party, and my response was not intended to speak authoritatively about > How Things Are?. My apologies for any mistaken impressions I might > have given. Don't worry, Tav's wording was rather offensive so strong reactions are understandable. I read your message as a totally fine tit-for-tat reply, and in fact it made me discard my own similar draft. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From dickinsm at gmail.com Wed Mar 11 15:28:45 2009 From: dickinsm at gmail.com (Mark Dickinson) Date: Wed, 11 Mar 2009 14:28:45 +0000 Subject: [Python-ideas] math module and complex numbers In-Reply-To: <20090311133130.GA21137@phd.pp.ru> References: <49B7AE0B.5040004@molden.no> <3cdcefb80903110534p65d2a024n9af6b6621e78e0fc@mail.gmail.com> <49B7B8AA.9050305@molden.no> <20090311133130.GA21137@phd.pp.ru> Message-ID: <5c6f2a5d0903110728j3223dbftb0178b669db2085e@mail.gmail.com> On Wed, Mar 11, 2009 at 1:31 PM, Oleg Broytmann wrote: > numbers. The distinction between functions which support complex numbers > and those which don't is made since most users do not want to learn quite > as much mathematics as required to understand complex numbers. Receiving an > exception instead of a complex result allows earlier detection" Furthermore, even those users who *do* understand complex numbers don't always want sqrt(-1) to return 1j. I find the math/cmath duality useful. Mark >>> from cmath import sqrt >>> sqrt(-complex(1)) -1j >>> sqrt(complex(-1)) 1j From bruce at leapyear.org Wed Mar 11 19:06:55 2009 From: bruce at leapyear.org (Bruce Leban) Date: Wed, 11 Mar 2009 11:06:55 -0700 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <87sklkzpf9.fsf@benfinney.id.au> <87ocw8zkym.fsf@benfinney.id.au> Message-ID: (1) I can easily write an expression that returns a function of arbitrary complexity. (2) What I can't easily write is an expression that is arbitrarily complex and returns a function with that complexity. Sorry if that's unclear. My point is that if I know what the complexity is in advance I can write it down. If I can't, do I really want to embed this in the middle of some other function? For example: >>> def f(name, x): def _(i, x=x): # do complicated stuff return i+x _.__name__ = name return _ Then f('bar', 3) returns a function equivalent to lambda i: i+3, etc.
and of course the definition of f can be arbitrarily complicated although that complexity must be planned in advance. I can stick f('bar', 3) anywhere I want a function. What I have yet to hear is an explanation of the advantages of providing (2) since we already have (1) and lambda. --- Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From wilk at flibuste.net Wed Mar 11 20:25:38 2009 From: wilk at flibuste.net (William Dode) Date: Wed, 11 Mar 2009 19:25:38 +0000 (UTC) Subject: [Python-ideas] float vs decimal Message-ID: Hi, I just read the blog post of gvr : http://python-history.blogspot.com/2009/03/problem-with-integer-division.html And I wonder why >>> .33 0.33000000000000002 is still possible in a "very-high-level language" like python3 ? Why .33 could not be a Decimal directly ? bye -- William Dodé - http://flibuste.net Informaticien Indépendant From pyideas at rebertia.com Wed Mar 11 20:48:17 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Wed, 11 Mar 2009 12:48:17 -0700 Subject: [Python-ideas] float vs decimal In-Reply-To: References: Message-ID: <50697b2c0903111248o1f1c62afi10ecfb08f78f0bef@mail.gmail.com> On Wed, Mar 11, 2009 at 12:25 PM, William Dode wrote: > Hi, > > I just read the blog post of gvr : > http://python-history.blogspot.com/2009/03/problem-with-integer-division.html > > And I wonder why >>>> .33 > 0.33000000000000002 > is still possible in a "very-high-level language" like python3 ? > > Why .33 could not be a Decimal directly ? I proposed something like this earlier, see: http://mail.python.org/pipermail/python-ideas/2008-December/002379.html Obviously, the proposal didn't go anywhere, the reason being that Decimal is currently implemented in Python and is thus much too inefficient to be the default (efficiency/practicality beating correctness/purity here apparently). There are non-Python implementations of the decimal standard in C, but no one could locate one with a Python-compatible license.
The closest was the IBM implementation whose spec the decimal PEP was based off of, but unfortunately it uses the ICU License which has a classic-BSD-like attribution clause. Cheers, Chris -- I have a blog: http://blog.rebertia.com From stefan_ml at behnel.de Wed Mar 11 20:49:18 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Wed, 11 Mar 2009 20:49:18 +0100 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: References: Message-ID: Guido van Rossum wrote: > On Sun, Feb 1, 2009 at 2:29 PM, Christian Heimes wrote: >> I'm +1 for a simple (!) test discovery system. I'm emphasizing on simple >> because there are enough frameworks for elaborate unit testing. >> >> Such a tool should >> >> - find all modules and packages named 'tests' for a given package name > > I predict that this part is where you'll have a hard time getting > consensus. There are lots of different naming conventions. It would be > nice if people could use the new discovery feature without having to > move all their tests around. Still, there should be one way to do it, so that future projects can start to use a common pattern. I actually think the selection of such a pattern can be completely arbitrary, as it will be impossible to get a clear vote on this. Obviously, the OWTDI does not lift the requirement that the test finder must support alternate patterns to make it work smoothly with existing test suites. It's just meant to avoid the configuration overhead if you do it 'the right way'. 
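[A minimal sketch of the walk-and-match discovery being discussed here — the function name, the default naming pattern, and the layout are purely illustrative, not a proposed stdlib API:]

```python
import os

def find_test_modules(root, pattern='test'):
    """Walk *root* and yield dotted module names whose file name
    starts with *pattern* (e.g. test_foo.py -> 'test_foo')."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            base, ext = os.path.splitext(filename)
            if ext == '.py' and base.startswith(pattern):
                relative = os.path.relpath(os.path.join(dirpath, base), root)
                yield relative.replace(os.sep, '.')
```

[As Guido notes above, the contentious part is exactly the `pattern` default — `test_*.py`, `*_test.py`, and `tests` packages are all in the wild — which is why the pattern is a parameter in this sketch.]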
Stefan From jjb5 at cornell.edu Wed Mar 11 21:32:12 2009 From: jjb5 at cornell.edu (Joel Bender) Date: Wed, 11 Mar 2009 16:32:12 -0400 Subject: [Python-ideas] Ruby-style Blocks in Python Idea (alternative) In-Reply-To: <59a221a0903101711o5fb1e32ey4684d8d2f2325902@mail.gmail.com> References: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> <20090310095822.0f957a4d@o> <59a221a0903101711o5fb1e32ey4684d8d2f2325902@mail.gmail.com> Message-ID: <49B81FCC.40207@cornell.edu> Jan Kanis wrote: > @lambda f: do_something(with_our(f)) > def result(param): > print("called back with "+param) > return foobar(param) To keep result from stomping on the name, I would expect result to actually be a result rather than a function :-): @lambda f: do_something(with_our(f)) lambda param: print("called back with "+param) return foobar(param) > But I don't really have a specific actual use case for this... Looks interesting anyway. Joel From wilk at flibuste.net Wed Mar 11 22:32:10 2009 From: wilk at flibuste.net (William Dode) Date: Wed, 11 Mar 2009 21:32:10 +0000 (UTC) Subject: [Python-ideas] float vs decimal References: <50697b2c0903111248o1f1c62afi10ecfb08f78f0bef@mail.gmail.com> Message-ID: On 11-03-2009, Chris Rebert wrote: > On Wed, Mar 11, 2009 at 12:25 PM, William Dode wrote: >> Hi, >> >> I just read the blog post of gvr : >> http://python-history.blogspot.com/2009/03/problem-with-integer-division.html >> >> And i wonder why >>>>> .33 >> 0.33000000000000002 >> is still possible in a "very-high-level langage" like python3 ? >> >> Why .33 could not be a Decimal directly ? > > I proposed something like this earlier, see: > http://mail.python.org/pipermail/python-ideas/2008-December/002379.html > Obviously, the proposal didn't go anywhere, the reason being that > Decimal is currently implemented in Python and is thus much too > inefficient to be the default (efficiency/practicality beating > correctness/purity here apparently). 
There are non-Python > implementations of the decimal standard in C, but no one could locate > one with a Python-compatible license. The closest was the IBM > implementation whose spec the decimal PEP was based off of, but > unfortunately it uses the ICU License which has a classic-BSD-like > attribution clause. Thanks to resume the situation. I think of another question. Why it's so difficult to mix float and decimal ? For example we cannot do Decimal(float) or float * Decimal. And i'm afraid that with python 3 it will more often a pain because operation with two integers can sometimes return integer (and accept an operation with decimal) and sometimes not when the operation will return a float.

>>> a = Decimal('5.3')
>>> i = 4
>>> j = 3
>>> i*a/j
Decimal('7.066666666666666666666666667')
>>> i/j*a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for *: 'float' and 'Decimal'

I mean, why if a Decimal is in the middle of an operation, float didn't silently become a Decimal ? -- William Dodé - http://flibuste.net Informaticien Indépendant From pyideas at rebertia.com Wed Mar 11 22:39:31 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Wed, 11 Mar 2009 14:39:31 -0700 Subject: [Python-ideas] float vs decimal In-Reply-To: References: <50697b2c0903111248o1f1c62afi10ecfb08f78f0bef@mail.gmail.com> Message-ID: <50697b2c0903111439k21171e3t19d18bff1601d0fd@mail.gmail.com> On Wed, Mar 11, 2009 at 2:32 PM, William Dode wrote: > On 11-03-2009, Chris Rebert wrote: >> On Wed, Mar 11, 2009 at 12:25 PM, William Dode wrote: >>> Hi, >>> >>> I just read the blog post of gvr : >>> http://python-history.blogspot.com/2009/03/problem-with-integer-division.html >>> >>> And i wonder why >>>>>> .33 >>> 0.33000000000000002 >>> is still possible in a "very-high-level langage" like python3 ? >>> >>> Why .33 could not be a Decimal directly ?
>> >> I proposed something like this earlier, see: >> http://mail.python.org/pipermail/python-ideas/2008-December/002379.html >> Obviously, the proposal didn't go anywhere, the reason being that >> Decimal is currently implemented in Python and is thus much too >> inefficient to be the default (efficiency/practicality beating >> correctness/purity here apparently). There are non-Python >> implementations of the decimal standard in C, but no one could locate >> one with a Python-compatible license. The closest was the IBM >> implementation whose spec the decimal PEP was based off of, but >> unfortunately it uses the ICU License which has a classic-BSD-like >> attribution clause. > > Thanks to resume the situation. > > I think of another question. Why it's so difficult to mix float and > decimal ? > > For example we cannot do Decimal(float) or float * Decimal. > > And i'm afraid that with python 3 it will more often a pain because > operation with two integers can sometimes return integer (and accept an > operation with decimal) and sometimes not when the operation will return > a float. > >>>> a = Decimal('5.3') >>>> i = 4 >>>> j = 3 >>>> i*a/j > Decimal('7.066666666666666666666666667') >>>> i/j*a > Traceback (most recent call last): >   File "<stdin>", line 1, in <module> > TypeError: unsupported operand type(s) for *: 'float' and 'Decimal' > > I mean, why if a Decimal is in the middle of an operation, float didn't > silently become a Decimal ? It's in the FAQ section of the decimal module - http://docs.python.org/library/decimal.html : 17. Is there a way to convert a regular float to a Decimal? A. Yes, all binary floating point numbers can be exactly expressed as a Decimal. An exact conversion may take more precision than intuition would suggest, so we trap Inexact to signal a need for more precision: def float_to_decimal(f): [definition snipped] 17. Why isn't the float_to_decimal() routine included in the module? A.
There is some question about whether it is advisable to mix binary and decimal floating point. Also, its use requires some care to avoid the representation issues associated with binary floating point: >>> float_to_decimal(1.1) Decimal('1.100000000000000088817841970012523233890533447265625') Cheers, Chris -- I have a blog: http://blog.rebertia.com From greg.ewing at canterbury.ac.nz Wed Mar 11 22:44:47 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 12 Mar 2009 10:44:47 +1300 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> <87iqmi5819.fsf@xemacs.org> Message-ID: <49B830CF.8070800@canterbury.ac.nz> Jim Jewett wrote: > I'm not convinced, because I've seen so many times when a lambda > actually is crucial to the bug. I think this is just a special case of a more general problem, that a line number is not always a sufficiently fine-grained piece of information when you're trying to pinpoint an error. You can get the same thing even when lambdas are not involved. It's particularly bad when an expression spans more than one line, because CPython currently doesn't even tell you the line containing the error, but the one where the whole statement started. Ideally, the traceback would show you not just the exact line, but the exact *token* where the error occurred. The technology exists to do this, it's just a matter of deciding to incorporate it into Python. -- Greg From ggpolo at gmail.com Wed Mar 11 23:24:03 2009 From: ggpolo at gmail.com (Guilherme Polo) Date: Wed, 11 Mar 2009 19:24:03 -0300 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: References: Message-ID: On Wed, Mar 11, 2009 at 4:49 PM, Stefan Behnel wrote: > Guido van Rossum wrote: >> On Sun, Feb 1, 2009 at 2:29 PM, Christian Heimes wrote: >>> I'm +1 for a simple (!) test discovery system. 
I'm emphasizing on simple >>> because there are enough frameworks for elaborate unit testing. >>> >>> Such a tool should >>> >>> - find all modules and packages named 'tests' for a given package name >> >> I predict that this part is where you'll have a hard time getting >> consensus. There are lots of different naming conventions. It would be >> nice if people could use the new discovery feature without having to >> move all their tests around. > > Still, there should be one way to do it, so that future projects can start > to use a common pattern. I actually think the selection of such a pattern > can be completely arbitrary, as it will be impossible to get a clear vote > on this. > > Obviously, the OWTDI does not lift the requirement that the test finder > must support alternate patterns to make it work smoothly with existing test > suites. It's just meant to avoid the configuration overhead if you do it > 'the right way'. > A little unrelated to your reply but thanks for "reviving" the thread. I still have the intention to do the proposed idea, I just happened to have very busy weeks, month, etc.. new house and others. > Stefan > Regards, -- -- Guilherme H. Polo Goncalves From python at rcn.com Wed Mar 11 23:37:46 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 11 Mar 2009 15:37:46 -0700 Subject: [Python-ideas] Adding a test discovery into Python References: Message-ID: <7C7FFEC01D01407CBC9BBD84092D42E6@RaymondLaptop1> [Christian Heimes] >>> I'm +1 for a simple (!) test discovery system. I'm emphasizing on simple >>> because there are enough frameworks for elaborate unit testing. Test discovery is not the interesting part of the problem. I'm strongly for offering tools that make it easier to write the tests in the first place. The syntax used by py.test and nose is vastly superior to the one used by unittest.py, a module that is more Javathonic than Pythonic. 
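[The difference in weight Raymond is pointing at is easy to see side by side; a small sketch, where the `add` function under test is invented for illustration:]

```python
import unittest

def add(a, b):
    return a + b

# unittest style: a subclass, a camelCase assertion method per check
class TestAdd(unittest.TestCase):
    def test_add(self):
        self.assertEqual(add(1, 2), 3)

# py.test / nose style: a bare function and a plain assert statement
def test_add():
    assert add(1, 2) == 3
```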
Even if we never adopt that syntax for our own test suite (because we like to run tests with and without -O), it would still be a good service to our users to offer a tool with a lighter weight syntax for writing tests. Raymond P.S. I'm not a partisan on this one. I've been a *heavy* user of unittest.py, doctest.py, py.test, and some personal tools that I wrote long ago in awk. Extensive use of each makes merits of the py.test and nose approaches self-evident. Axiom: The more work involved in writing tests, the fewer tests that will get written. Factoid of the Day: In Py2.7's test_datetime module, the phrase self.assertEqual occurs 578 times. From steve at pearwood.info Wed Mar 11 23:41:03 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 12 Mar 2009 09:41:03 +1100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <87zlfs4pk7.fsf@xemacs.org> Message-ID: <200903120941.03721.steve@pearwood.info> On Thu, 12 Mar 2009 01:26:29 am Guido van Rossum wrote: > You can assign new values to f.__name__. But it is only used in function repr, not tracebacks. From Python 2.6:

>>> f = lambda x: x+1  # __name__ is '<lambda>'
>>> f.__name__ = 'f'
>>> f
<function f at 0x...>
>>> f(None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in <lambda>
TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

-- Steven D'Aprano From steve at pearwood.info Wed Mar 11 23:48:27 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 12 Mar 2009 09:48:27 +1100 Subject: [Python-ideas] [Python-Dev] Ext4 data loss In-Reply-To: References: <49B7C6BA.8010100@cheimes.de> Message-ID: <200903120948.28250.steve@pearwood.info> On Thu, 12 Mar 2009 01:21:25 am Antoine Pitrou wrote: > Christian Heimes cheimes.de> writes: > > In my initial proposal one and a half hour earlier I suggested 'sync()' as the name of the method and 'synced' as the name of the flag that forces a fsync() call during the close operation.
> > I think your "synced" flag is too vague. Some applications may need > the file to be synced on close(), but some others may need it to be > synced at regular intervals, or after each write(), etc. > > Calling the flag "sync_on_close" would be much more explicit. Also, > given the current API I think it should be an argument to open() > rather than a writable attribute. Perhaps we should have a module containing rich file tools, e.g. classes FileSyncOnWrite, FileSyncOnClose, functions for common file-related operations, etc. This will make it easy for conscientious programmers to do the right thing for their app without needing to re-invent the wheel all the time, but without handcuffing them into a single "one size fits all" solution. File operations are *hard*, because many error conditions are uncommon, and consequently many (possibly even the majority) of programmers never learn that something like this: f = open('myfile', 'w') f.write(data) f.close() (or the equivalent in whatever language they use) may cause data loss. Worse, we train users to accept that data loss as normal instead of reporting it as a bug -- possibly because it is unclear whether it is a bug in the application, the OS, the file system, or all three. (It's impossible to avoid *all* risk of data loss, of course -- what if the computer loses power in the middle of a write? But we can minimize that risk significantly.) Even when programmers try to do the right thing, it is hard to know what the right thing is: there are trade-offs to be made, and having made a trade-off, the programmer then has to re-invent what usually turns out to be a quite complicated wheel. To do the right thing in Python often means delving into the world of os.O_* constants and file descriptors, which is intimidating and unpythonic. They're great for those who want/need them, but perhaps we should expose a Python interface to the more common operations? To my mind, that means classes instead of magic constants. 
Would there be interest in a filetools module? Replies and discussion to python-ideas please. -- Steven D'Aprano From robert.kern at gmail.com Thu Mar 12 00:00:14 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 11 Mar 2009 18:00:14 -0500 Subject: [Python-ideas] Ext4 data loss In-Reply-To: <200903120948.28250.steve@pearwood.info> References: <49B7C6BA.8010100@cheimes.de> <200903120948.28250.steve@pearwood.info> Message-ID: On 2009-03-11 17:48, Steven D'Aprano wrote: > Would there be interest in a filetools module? Replies and discussion to > python-ideas please. Yes, please. I am of the opinion that, wherever possible, these kinds of patterns should be codified in reusable libraries. For something as fundamental as writing files, something aimed towards standard library acceptance seems like a very good idea to me. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ggpolo at gmail.com Thu Mar 12 00:05:46 2009 From: ggpolo at gmail.com (Guilherme Polo) Date: Wed, 11 Mar 2009 20:05:46 -0300 Subject: [Python-ideas] Fwd: Adding a test discovery into Python In-Reply-To: References: <7C7FFEC01D01407CBC9BBD84092D42E6@RaymondLaptop1> Message-ID: ---------- Forwarded message ---------- From: Guilherme Polo Date: Wed, Mar 11, 2009 at 8:04 PM Subject: Re: [Python-ideas] Adding a test discovery into Python To: Raymond Hettinger , python-dev at python.org On Wed, Mar 11, 2009 at 7:37 PM, Raymond Hettinger wrote: > [Christian Heimes] >>>> >>>> I'm +1 for a simple (!) test discovery system. I'm emphasizing on simple >>>> because there are enough frameworks for elaborate unit testing. > > Test discovery is not the interesting part of the problem. Interesting or not, it is a problem that is asking for a solution, this kind of code is being duplicated in several places for no good reason. 
> > Axiom: The more work involved in writing tests, the fewer > tests that will get written. At some point you will have to run them too, I don't think you want to reimplement the discovery part yet another time. -- -- Guilherme H. Polo Goncalves From jan.kanis at phil.uu.nl Thu Mar 12 00:08:38 2009 From: jan.kanis at phil.uu.nl (Jan Kanis) Date: Thu, 12 Mar 2009 00:08:38 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea (alternative) In-Reply-To: <49B81FCC.40207@cornell.edu> References: <59a221a0903091713n498e0a6aw8f883d7c62335d03@mail.gmail.com> <20090310095822.0f957a4d@o> <59a221a0903101711o5fb1e32ey4684d8d2f2325902@mail.gmail.com> <49B81FCC.40207@cornell.edu> Message-ID: <59a221a0903111608i2a69c6f5ke8eac8e73bdc073@mail.gmail.com> On Wed, Mar 11, 2009 at 21:32, Joel Bender wrote: > Jan Kanis wrote: >
>> @lambda f: do_something(with_our(f))
>> def result(param):
>>     print("called back with "+param)
>>     return foobar(param)
> > To keep result from stomping on the name, I would expect result to actually > be a result rather than a function :-): 'result' is the actual result. To try it out in current python:

def do_something(func):
    print("doing something")
    return func(41)**2

def id(x):
    return x

@id(lambda f: do_something(f))
def result(param):
    print("called back with", param)
    return param + 1

print("result is", result, "should be", 42**2)

-->
doing something
called back with 41
result is 1764 should be 1764

Or did I misinterpret what you were saying? From zooko at zooko.com Thu Mar 12 02:26:40 2009 From: zooko at zooko.com (zooko) Date: Wed, 11 Mar 2009 19:26:40 -0600 Subject: [Python-ideas] [Python-Dev] Ext4 data loss In-Reply-To: <200903120948.28250.steve@pearwood.info> References: <49B7C6BA.8010100@cheimes.de> <200903120948.28250.steve@pearwood.info> Message-ID: > Would there be interest in a filetools module? Replies and > discussion to python-ideas please.
I've been using and maintaining a few filesystem hacks for, let's see, almost nine years now: http://allmydata.org/trac/pyutil/browser/pyutil/pyutil/fileutil.py (The first version of that was probably written by Greg Smith in about 1999.) I'm sure there are many other such packages. A couple of quick searches of pypi turned up these two: http://pypi.python.org/pypi/Pythonutils http://pypi.python.org/pypi/fs I wonder if any of them have the sort of functionality you're thinking of. Regards, Zooko From stephen at xemacs.org Thu Mar 12 02:55:52 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 12 Mar 2009 10:55:52 +0900 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <49B534A7.2010001@molden.no> <20090309154505.GA18115@panix.com> <87zlfs4pk7.fsf@xemacs.org> Message-ID: <87ocw74i8n.fsf@xemacs.org> Guido van Rossum writes: > > (add-hook 'text-mode-hook (defun turn-on-auto-fill () (auto-fill-mode 1))) > > > > neatly avoids the problem by returning the name of the function, the > > symbol `turn-on-auto-fill', which is callable and so suitable for > > hanging on the hook. > > Got it -- sort of like using assignment in an expression in C, to set > a variable and return the valule (but not quite, don't worry :-). Yes. And no, I don't worry about you, but I do worry about what else may be lurking in a language whose designer(s) chose to return the function definition (rather than the name) from define-function. ;-) > I assume you're talking about Andrew Koenig's use case -- ANK is > Andrew Kuchling, who AFAIK didn't participate in this thread. :-) Oops, my bad. Very sorry to all concerned. > IIUC (my Lisp is very rusty) this just assigns unique names to the > functions right? Yes. > You're saying this to satisfy the people who insist that __name__ > is always useful right? But it seems to be marginally useful here > since the names don't occur in the source. (?) 
But they are at least cosmetically useful to the runtime system (eg, they will be used in reporting tracebacks -- bytecode in the backtrace is hard to read) and accessible to the user (for redefining a callback on-the-fly). The user can't necessarily access the array of callbacks directly (eg, it might be in C) or conveniently (it may be buried deep in a complex structure). It seems plausible to me that the user is most likely to want to redefine a callback that just blew up, too, and this would give you the necessary "handle" in the backtrace. Also, I haven't thought this through, but use of numbers to differentiate the names was just an easy example. An appropriate naming scheme might make it easy to find a skeleton in the source for the generated callback. Eg, if instead of numbers the identifiers were "foo-abort", "foo-retry", and "foo-fail". I don't know if that would be useful in Andrew's use-case. So, yes, marginal, in the sense that I doubt the use cases are common, but I suspect in a few it could be a great convenience. How useful in Python, I don't know ... Emacs Lisp is full of "seemed like the thing to do at the time" design, so the more handles I have the happier I am. From santagada at gmail.com Thu Mar 12 04:04:40 2009 From: santagada at gmail.com (Leonardo Santagada) Date: Thu, 12 Mar 2009 00:04:40 -0300 Subject: [Python-ideas] Fwd: Adding a test discovery into Python In-Reply-To: References: <7C7FFEC01D01407CBC9BBD84092D42E6@RaymondLaptop1> Message-ID: On Mar 11, 2009, at 8:05 PM, Guilherme Polo wrote: > On Wed, Mar 11, 2009 at 7:37 PM, Raymond Hettinger > wrote: >> [Christian Heimes] >>>>> >>>>> I'm +1 for a simple (!) test discovery system. I'm emphasizing >>>>> on simple >>>>> because there are enough frameworks for elaborate unit testing. >> >> Test discovery is not the interesting part of the problem. 
> > Interesting or not, it is a problem that is asking for a solution, > this kind of code is being duplicated in several places for no good > reason. > >> >> Axiom: The more work involved in writing tests, the fewer >> tests that will get written. > > At some point you will have to run them too, I don't think you want to > reimplement the discovery part yet another time. What I think he was getting at is that 20-30 lines of test discovery have to be written once for each project (or none if using py.test/ nose), but self.assertequals and all of the other quirks of unittest are all over a test suite and you need to write all of it each time you have to make a test. Not that what you are trying to do is pointless, but fixing this other problem is so much more interesting... -- Leonardo Santagada santagada at gmail.com From python at rcn.com Thu Mar 12 04:45:24 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 11 Mar 2009 20:45:24 -0700 Subject: [Python-ideas] [Python-Dev] Formatting mini-language suggestion References: Message-ID: <7A6631E172AC4D12956AEE832E28F52F@RaymondLaptop1> [Guido van Rossum] > I suggest moving this to python-ideas and > writing a proper PEP. Okay, it's moved. Will write up a PEP, do research on what other languages do and collect everyone's ideas on what to put in the shed. (hundreds and ten thousands grouping, various choices of decimal points, mayan number systems and whatnot). Will start with Nick's simple proposal as a starting point. [Nick Coghlan] > [[fill]align][sign][#][0][minimumwidth][,][.precision][type] Other suggestions and comments welcome. 
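[For reference, the `,` grouping option in Nick's sketch is the part of this discussion that later shipped, as PEP 378 in Python 2.7/3.1; its behaviour:]

```python
# The ',' option from the format mini-language sketch above (PEP 378):
print(format(1234567, ',d'))        # 1,234,567
print(format(1234567.891, ',.2f'))  # 1,234,567.89
print('{:,}'.format(1234567))       # 1,234,567
```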
Raymond From ben+python at benfinney.id.au Thu Mar 12 04:57:11 2009 From: ben+python at benfinney.id.au (Ben Finney) Date: Thu, 12 Mar 2009 14:57:11 +1100 Subject: [Python-ideas] Draft PEP: Standard daemon process library References: <87wscj11fl.fsf@benfinney.id.au> Message-ID: <874oxzxujs.fsf@benfinney.id.au> Howdy all, Significant changes in this release: * Name the daemon process context class `DaemonContext`, since it doesn't actually represent a separate daemon. (The reference implementation will also have a `DaemonRunner` class, but that's outside the scope of this PEP.) * Implement the context manager protocol, allowing use as a ‘with’ context manager or via explicit ‘open’ and ‘close’ calls. * Delegate PID file handling to a `pidfile` object handed to the `DaemonContext` instance, and used simply as a context manager. * Simplify the set of options by using a mapping for signal handlers. * Target Python 3.2, since the reference implementation will very likely not be complete in time for anything earlier. :PEP: XXX :Title: Standard daemon process library :Version: 0.5 :Last-Modified: 2009-03-12 14:50 :Author: Ben Finney :Status: Draft :Type: Standards Track :Content-Type: text/x-rst :Created: 2009-01-26 :Python-Version: 3.2 :Post-History: ======== Abstract ======== Writing a program to become a well-behaved Unix daemon is somewhat complex and tricky to get right, yet the steps are largely similar for any daemon regardless of what else the program may need to do. This PEP introduces a package to the Python standard library that provides a simple interface to the task of becoming a daemon process. .. contents:: ..
Table of Contents: Abstract Specification Example usage Interface ``DaemonContext`` objects ``DaemonError`` objects Motivation Rationale Correct daemon behaviour A daemon is not a service Reference Implementation Other daemon implementations References Copyright ============= Specification ============= Example usage ============= Simple example of direct `DaemonContext` usage:: import daemon from spam import do_main_program with daemon.DaemonContext() as daemon_context: do_main_program() More complex example usage:: import os import grp import signal import daemon import lockfile from spam import ( initial_program_setup, do_main_program, program_cleanup, reload_program_config, ) context = daemon.DaemonContext( working_directory='/var/lib/foo', umask=0o002, pidfile=lockfile.FileLock('/var/run/spam.pid'), ) context.signal_map = { signal.SIGTERM: program_cleanup, signal.SIGHUP: 'close', signal.SIGUSR1: reload_program_config, } mail_gid = grp.getgrnam('mail').gr_gid context.gid = mail_gid important_file = open('spam.data', 'w') interesting_file = open('eggs.data', 'w') context.files_preserve = [important_file, interesting_file] initial_program_setup() with context: do_main_program() Interface ========= A new package, `daemon`, is added to the standard library. An exception class, `DaemonError`, is defined for exceptions raised from the package. A class, `DaemonContext`, is defined to represent the settings and process context for the program running as a daemon process. ``DaemonContext`` objects ========================= A `DaemonContext` instance represents the behaviour settings and process context for the program when it becomes a daemon. The behaviour and environment is customised by setting options on the instance, before calling the `open` method. Each option can be passed as a keyword argument to the `DaemonContext` constructor, or subsequently altered by assigning to an attribute on the instance at any time prior to calling `open`. 
That is, for options named `wibble` and `wubble`, the following invocation::

    foo = daemon.DaemonContext(wibble=bar, wubble=baz)
    foo.open()

is equivalent to::

    foo = daemon.DaemonContext()
    foo.wibble = bar
    foo.wubble = baz
    foo.open()

The following options are defined. `files_preserve` :Default: ``None`` List of files that should *not* be closed when starting the daemon. If ``None``, all open file descriptors will be closed. Elements of the list are file descriptors (as returned by a file object's `fileno()` method) or Python `file` objects. Each specifies a file that is not to be closed during daemon start. `chroot_directory` :Default: ``None`` Full path to a directory to set as the effective root directory of the process. If ``None``, specifies that the root directory is not to be changed. `working_directory` :Default: ``'/'`` Full path of the working directory to which the process should change on daemon start. Since a filesystem cannot be unmounted if a process has its current working directory on that filesystem, this should either be left at default or set to a directory that is a sensible “home directory” for the daemon while it is running. `umask` :Default: ``0`` File access creation mask (“umask”) to set for the process on daemon start. Since a process inherits its umask from its parent process, starting the daemon will reset the umask to this value so that files are created by the daemon with access modes as it expects. `pidfile` :Default: ``None`` Context manager for a PID lock file. When the daemon context opens and closes, it enters and exits the `pidfile` context manager. `signal_map` :Default: ``{signal.SIGTTOU: None, signal.SIGTTIN: None, signal.SIGTSTP: None, signal.SIGTERM: 'close'}`` Mapping from operating system signals to callback actions. The mapping is used when the daemon context opens, and determines the action for each signal's signal handler: * A value of ``None`` will ignore the signal (by setting the signal action to ``signal.SIG_IGN``).
* A string value will be used as the name of an attribute on the ``DaemonContext`` instance. The attribute's value will be used as the action for the signal handler. * Any other value will be used as the action for the signal handler. `uid` :Default: ``None`` The user ID (“uid”) value to switch the process to on daemon start. `gid` :Default: ``None`` The group ID (“gid”) value to switch the process to on daemon start. `prevent_core` :Default: ``True`` If true, prevents the generation of core files, in order to avoid leaking sensitive information from daemons run as `root`. `stdin` :Default: ``None`` `stdout` :Default: ``None`` `stderr` :Default: ``None`` Each of `stdin`, `stdout`, and `stderr` is a file-like object which will be used as the new file for the standard I/O stream `sys.stdin`, `sys.stdout`, and `sys.stderr` respectively. The file should therefore be open, with a minimum of mode 'r' in the case of `stdin`, and mode 'w+' in the case of `stdout` and `stderr`. If the object has a `fileno()` method that returns a file descriptor, the corresponding file will be excluded from being closed during daemon start (that is, it will be treated as though it were listed in `files_preserve`). If ``None``, the corresponding system stream is re-bound to the file named by `os.devnull`. The following methods are defined. `open()` :Return: ``None`` Open the daemon context, turning the current program into a daemon process. This performs the following steps: * If the `chroot_directory` attribute is not ``None``, set the effective root directory of the process to that directory (via `os.chroot`). This allows running the daemon process inside a “chroot gaol” as a means of limiting the system's exposure to rogue behaviour by the process. * Close all open file descriptors. This excludes those listed in the `files_preserve` attribute, and those that correspond to the `stdin`, `stdout`, or `stderr` attributes.
* Change current working directory to the path specified by the `working_directory` attribute. * Reset the file access creation mask to the value specified by the `umask` attribute. * Detach the current process into its own process group, and disassociate from any controlling terminal. This step is skipped if it is determined to be redundant: if the process was started by `init`, by `initd`, or by `inetd`. * Set signal handlers as specified by the `signal_map` attribute. * If the `prevent_core` attribute is true, set the resource limits for the process to prevent any core dump from the process. * Set the process uid and gid to the true uid and gid of the process, to relinquish any elevated privilege. * If the `pidfile` attribute is not ``None``, enter its context manager. * If either of the attributes `uid` or `gid` are not ``None``, set the process uid and/or gid to the specified values. * If any of the attributes `stdin`, `stdout`, `stderr` are not ``None``, bind the system streams `sys.stdin`, `sys.stdout`, and/or `sys.stderr` to the files represented by the corresponding attributes. Where the attribute has a file descriptor, the descriptor is duplicated (instead of re-binding the name). When the function returns, the running program is a daemon process. `close()` :Return: ``None`` Terminate the daemon context. This performs the following step: * If the `pidfile` attribute is not ``None``, exit its context manager. The class also implements the context manager protocol via ``__enter__`` and ``__exit__`` methods. `__enter__()` :Return: The ``DaemonContext`` instance Call the instance's `open()` method, then return the instance. `__exit__(exc_type, exc_value, exc_traceback)` :Return: ``True`` or ``False`` as defined by the context manager protocol Call the instance's `close()` method, then return ``True`` if the exception was handled or ``False`` if it was not. ``DaemonError`` objects ======================= The `DaemonError` class inherits from `Exception`. 
The `daemon` package implementation will raise an instance of `DaemonError` when an error occurs in processing daemon behaviour. ========== Motivation ========== The majority of programs written to be Unix daemons either implement behaviour very similar to that in the `specification`_, or are poorly-behaved daemons by the `correct daemon behaviour`_. Since these steps should be much the same in most implementations but are very particular and easy to omit or implement incorrectly, they are a prime target for a standard well-tested implementation in the standard library. ========= Rationale ========= Correct daemon behaviour ======================== According to Stevens in [stevens]_ §2.6, a program should perform the following steps to become a Unix daemon process. * Close all open file descriptors. * Change current working directory. * Reset the file access creation mask. * Run in the background. * Disassociate from process group. * Ignore terminal I/O signals. * Disassociate from control terminal. * Don't reacquire a control terminal. * Correctly handle the following circumstances: * Started by System V `init` process. * Daemon termination by ``SIGTERM`` signal. * Children generate ``SIGCLD`` signal. The `daemon` tool [slack-daemon]_ lists (in its summary of features) behaviour that should be performed when turning a program into a well-behaved Unix daemon process. It differs from this PEP's intent in that it invokes a *separate* program as a daemon process. The following features are appropriate for a daemon that starts itself once the program is already running: * Sets up the correct process context for a daemon. * Behaves sensibly when started by `initd(8)` or `inetd(8)`. * Revokes any suid or sgid privileges to reduce security risks in case daemon is incorrectly installed with special privileges. * Prevents the generation of core files to prevent leaking sensitive information from daemons run as root (optional).
* Names the daemon by creating and locking a PID file to guarantee that only one daemon with the given name can execute at any given time (optional). * Sets the user and group under which to run the daemon (optional, root only). * Creates a chroot gaol (optional, root only). * Captures the daemon's stdout and stderr and directs them to syslog (optional). A daemon is not a service ========================= This PEP addresses only Unix-style daemons, for which the above correct behaviour is relevant, as opposed to comparable behaviours on other operating systems. There is a related concept in many systems, called a "service". A service differs from the model in this PEP, in that rather than having the *current* program continue to run as a daemon process, a service starts an *additional* process to run in the background, and the current process communicates with that additional process via some defined channels. The Unix-style daemon model in this PEP can be used, among other things, to implement the background-process part of a service; but this PEP does not address the other aspects of setting up and managing a service. ======================== Reference Implementation ======================== The `python-daemon` package [python-daemon]_. As of `python-daemon` version 1.3 (2009-03-12), the package is under active development and is not yet a full implementation of this PEP. Other daemon implementations ============================ Prior to this PEP, several existing third-party Python libraries or tools implemented some of this PEP's `correct daemon behaviour`_. The `reference implementation`_ is a fairly direct successor from the following implementations: * Many good ideas were contributed by the community to Python cookbook recipes #66012 [cookbook-66012]_ and #278731 [cookbook-278731]_. * The `bda.daemon` library [bda.daemon]_ is an implementation of [cookbook-66012]_. It is the predecessor of [python-daemon]_.
Other Python daemon implementations that differ from this PEP: * The `zdaemon` tool [zdaemon]_ was written for the Zope project. Like [slack-daemon]_, it differs from this specification because it is used to run another program as a daemon process. * The Python library `daemon` [clapper-daemon]_ is (according to its homepage) no longer maintained. As of version 1.0.1, it implements the basic steps from [stevens]_. * The `daemonize` library [seutter-daemonize]_ also implements the basic steps from [stevens]_. * Ray Burr's `daemon.py` module [burr-daemon]_ provides the [stevens]_ procedure as well as PID file handling and redirection of output to syslog. * Twisted [twisted]_ includes, perhaps unsurprisingly, an implementation of a process daemonisation API that is integrated with the rest of the Twisted framework; it differs significantly from the API in this PEP. * The Python `initd` library [dagitses-initd]_, which uses [clapper-daemon]_, implements an equivalent of Unix `initd(8)` for controlling a daemon process. ========== References ========== .. [stevens] `Unix Network Programming`, W. Richard Stevens, 1994 Prentice Hall. .. [slack-daemon] The (non-Python) "libslack" implementation of a `daemon` tool ``_ by "raf". .. [python-daemon] The `python-daemon` library ``_ by Ben Finney et al. .. [cookbook-66012] Python Cookbook recipe 66012, "Fork a daemon process on Unix" ``_. .. [cookbook-278731] Python Cookbook recipe 278731, "Creating a daemon the Python way" ``_. .. [bda.daemon] The `bda.daemon` library ``_ by Robert Niederreiter et al. .. [zdaemon] The `zdaemon` tool ``_ by Guido van Rossum et al. .. [clapper-daemon] The `daemon` library ``_ by Brian Clapper. .. [seutter-daemonize] The `daemonize` library ``_ by Jerry Seutter. .. [burr-daemon] The `daemon.py` module ``_ by Ray Burr. .. [twisted] The `Twisted` application framework ``_ by Glyph Lefkowitz et al. .. [dagitses-initd] The Python `initd` library ``_ by Michael Andreas Dagitses.
========= Copyright ========= This work is hereby placed in the public domain. To the extent that placing a work in the public domain is not legally possible, the copyright holder hereby grants to all recipients of this work all rights and freedoms that would otherwise be restricted by copyright. .. Local variables: mode: rst coding: utf-8 time-stamp-start: "^:Last-Modified:[ ]+" time-stamp-end: "$" time-stamp-line-limit: 20 time-stamp-format: "%:y-%02m-%02d %02H:%02M" End: vim: filetype=rst fileencoding=utf-8 : From stephen at xemacs.org Thu Mar 12 07:02:32 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 12 Mar 2009 15:02:32 +0900 Subject: [Python-ideas] Formatting mini-language suggestion In-Reply-To: <7A6631E172AC4D12956AEE832E28F52F@RaymondLaptop1> References: <7A6631E172AC4D12956AEE832E28F52F@RaymondLaptop1> Message-ID: <87k56v46tj.fsf@xemacs.org> Raymond Hettinger writes: > Will start with Nick's simple proposal as a starting point. > > [Nick Coghlan] > > [[fill]align][sign][#][0][minimumwidth][,][.precision][type] +1 for making that the stopping point, too. I can't speak for the Chinese, but the Japanese also use the Chinese numbering system where the verbal expression of large numbers is grouped by 10000s. However, in tables of government expenditure and the like, the commas usually occur every three places. 
E.g., the official GDP figures from the Japanese Ministry of Economy and Trade: http://www.mext.go.jp/b_menu/toukei/001/08030520/013.htm From lists at janc.be Thu Mar 12 07:36:26 2009 From: lists at janc.be (Jan Claeys) Date: Thu, 12 Mar 2009 07:36:26 +0100 Subject: [Python-ideas] Ruby-style Blocks in Python Idea In-Reply-To: References: <87sklkzpf9.fsf@benfinney.id.au> Message-ID: <1236839786.16233.40.camel@saeko.local> On Wednesday 11-03-2009 at 05:08 [timezone +0000], tav wrote: > As for Python-ideas, I had taken as granted that people posting would > have more than a passing interest in language design and the nature of > Python. Obviously a false assumption. I apologise for this. But you were right: most of them have a serious interest in such things. That doesn't mean they are all thinking in the same direction though... -- Jan Claeys From stefan_ml at behnel.de Thu Mar 12 07:47:31 2009 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 12 Mar 2009 07:47:31 +0100 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: <7C7FFEC01D01407CBC9BBD84092D42E6@RaymondLaptop1> References: <7C7FFEC01D01407CBC9BBD84092D42E6@RaymondLaptop1> Message-ID: Raymond Hettinger wrote: > I'm strongly for offering tools that make it easier to write > the tests in the first place. The syntax used by py.test > and nose is vastly superior to the one used by unittest.py, > a module that is more Javathonic than Pythonic. > [...] > Factoid of the Day: In Py2.7's test_datetime module, > the phrase self.assertEqual occurs 578 times. Doesn't that just scream for using a doctest instead? The interpreter driven type-think-copy-paste pattern works pretty well for these things.
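For concreteness (this example is not part of the thread), the copy-paste doctest pattern being described could look like the following; the `mean` function is invented purely for illustration:

```python
def mean(values):
    """Return the arithmetic mean of a sequence of numbers.

    The expected outputs below are pasted from an interpreter
    session, which is exactly the workflow being described.

    >>> mean([1, 2, 3, 4])
    2.5
    >>> mean([10])
    10.0
    """
    return sum(values) / float(len(values))

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # silent when every docstring example passes
```

Running the module replays the pasted session and compares outputs, replacing a stack of `self.assertEqual` calls.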
Stefan From denis.spir at free.fr Thu Mar 12 08:13:13 2009 From: denis.spir at free.fr (spir) Date: Thu, 12 Mar 2009 08:13:13 +0100 Subject: [Python-ideas] [Python-Dev] Ext4 data loss In-Reply-To: <200903120948.28250.steve@pearwood.info> References: <49B7C6BA.8010100@cheimes.de> <200903120948.28250.steve@pearwood.info> Message-ID: <20090312081313.40f8b68f@o> On Thu, 12 Mar 2009 09:48:27 +1100, Steven D'Aprano wrote: > Even when programmers try to do the right thing, it is hard to know what > the right thing is: there are trade-offs to be made, and having made a > trade-off, the programmer then has to re-invent what usually turns out > to be a quite complicated wheel. To do the right thing in Python often > means delving into the world of os.O_* constants and file descriptors, > which is intimidating and unpythonic. They're great for those who > want/need them, but perhaps we should expose a Python interface to the > more common operations? To my mind, that means classes instead of magic > constants. > > Would there be interest in a filetools module? Replies and discussion to > python-ideas please. Sure. +1 Also: a programmer is not (always) a filesystem expert. denis ------ la vita e estrany From python at rcn.com Thu Mar 12 08:17:02 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 00:17:02 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) Message-ID: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> Motivation: Provide a simple, non-locale aware way to format a number with a thousands separator. Adding thousands separators is one of the simplest ways to improve the professional appearance and readability of output exposed to end users. In the finance world, output with commas is the norm. Finance users and non-professional programmers find the locale approach to be frustrating, arcane and non-obvious.
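To make the motivating task concrete, a rough pure-Python equivalent of comma grouping, the behaviour the proposals below build into `format()`, might look like this (the `thousands` helper is invented for illustration and is not part of any proposal):

```python
def thousands(n, sep=","):
    """Group the digits of an integer by threes, e.g. 1234567 -> '1,234,567'."""
    sign, digits = ("-", str(abs(n))) if n < 0 else ("", str(n))
    groups = []
    while digits:
        groups.append(digits[-3:])  # peel off the last three digits
        digits = digits[:-3]
    return sign + sep.join(reversed(groups))

print(thousands(1234567))       # 1,234,567
print(thousands(1234567, " "))  # 1 234 567
```

The `sep` parameter mirrors the substitution trick discussed later in the thread, where a comma is swapped for another separator after formatting.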
It is not the goal to replace locale or to accommodate every possible convention. The goal is to make a common task easier for many users. Research so far: Scanning the web, I've found that thousands separators are usually one of COMMA, PERIOD, SPACE, or UNDERSCORE. The COMMA is used when a PERIOD is the decimal separator. James Knight observed that Indian/Pakistani numbering systems group by hundreds. Ben Finney noted that Chinese group by ten-thousands. Visual Basic and its brethren (like MS Excel) use a completely different style and have ultra-flexible custom format specifiers like: "_($* #,##0_)". Proposal I (from Nick Coghlan): A comma will be added to the format() specifier mini-language: [[fill]align][sign][#][0][minimumwidth][,][.precision][type] The ',' option indicates that commas should be included in the output as a thousands separator. As with locales which do not use a period as the decimal point, locales which use a different convention for digit separation will need to use the locale module to obtain appropriate formatting. The proposal works well with floats, ints, and decimals. It also allows easy substitution for other separators. For example: format(n, "6,f").replace(",", "_") This technique is completely general but it is awkward in the one case where the commas and periods need to be swapped. format(n, "6,f").replace(",", "X").replace(".", ",").replace("X", ".") Proposal II (to meet Antoine Pitrou's request): Make both the thousands separator and decimal separator user specifiable but not locale aware. For simplicity, limit the choices to a comma, period, space, or underscore.
[[fill]align][sign][#][0][minimumwidth][T[tsep]][dsep precision][type] Examples: format(1234, "8.1f") --> ' 1234.0' format(1234, "8,1f") --> ' 1234,0' format(1234, "8T.,1f") --> ' 1.234,0' format(1234, "8T .f") --> ' 1 234,0' format(1234, "8d") --> ' 1234' format(1234, "8T,d") --> ' 1,234' This proposal meets most needs (except for people wanting grouping for hundreds or ten-thousands), but iIt comes at the expense of being a little more complicated to learn and remember. Also, it makes it more challenging to write custom __format__ methods that follow the format specification mini-language. For the locale module, just the "T" is necessary in a formatting string since the tool already has procedures for figuring out the actual separators from the local context. Comments and suggestions are welcome but I draw the line at Mayan numbering conventions ;-) Raymond From denis.spir at free.fr Thu Mar 12 08:24:29 2009 From: denis.spir at free.fr (spir) Date: Thu, 12 Mar 2009 08:24:29 +0100 Subject: [Python-ideas] [Python-Dev] Ext4 data loss In-Reply-To: <200903120948.28250.steve@pearwood.info> References: <49B7C6BA.8010100@cheimes.de> <200903120948.28250.steve@pearwood.info> Message-ID: <20090312082429.47dc09c8@o> On Thu, 12 Mar 2009 09:48:27 +1100, Steven D'Aprano wrote: > Even when programmers try to do the right thing, it is hard to know what > the right thing is: there are trade-offs to be made, and having made a > trade-off, the programmer then has to re-invent what usually turns out > to be a quite complicated wheel. To do the right thing in Python often > means delving into the world of os.O_* constants and file descriptors, > which is intimidating and unpythonic. They're great for those who > want/need them, but perhaps we should expose a Python interface to the > more common operations? To my mind, that means classes instead of magic > constants. > > Would there be interest in a filetools module? Replies and discussion to > python-ideas please. Sure.
+1 Also: a programmer is not (always) a filesystem expert. PS: What I meant is: the point of view from the filesystem is very different. A proper interface will have to take the programmer's point of view while exposing the filesystem issues. I think (like always at the interface of two worlds -- cf specification talks between developer and client ;-) *terminology* choices will be very important. denis ------ la vita e estrany From lie.1296 at gmail.com Thu Mar 12 08:47:12 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Thu, 12 Mar 2009 18:47:12 +1100 Subject: [Python-ideas] Formatting mini-language suggestion In-Reply-To: <87k56v46tj.fsf@xemacs.org> References: <7A6631E172AC4D12956AEE832E28F52F@RaymondLaptop1> <87k56v46tj.fsf@xemacs.org> Message-ID: Stephen J. Turnbull wrote: > Raymond Hettinger writes: > > > Will start with Nick's simple proposal as a starting point. > > > > [Nick Coghlan] > > > [[fill]align][sign][#][0][minimumwidth][,][.precision][type] could maximumwidth be possible? It's useful if we rather break the display of the numbers than breaking the display of the table (and possibly add a special sign if width overflow occurs, like <62432) > +1 for making that the stopping point, too. > > I can't speak for the Chinese, but the Japanese also use the Chinese > numbering system where the verbal expression of large numbers is > grouped by 10000s. However, in tables of government expenditure and > the like, the commas usually occur every three places. Eg, the > official GDP figures from the Japanese Ministry of Economy and Trade: > > http://www.mext.go.jp/b_menu/toukei/001/08030520/013.htm Should there be a convenience function that will help construct the format string.
Some kind of: create_format(self, type='i', base=16, seppos=4, sep=':', charset='0123456789abcdef', maxwidth=32, minwidth=32, pad='0') -- (cookies for you if you noticed that it is ipv6 number format) From python at rcn.com Thu Mar 12 08:49:24 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 00:49:24 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> Message-ID: [spir] > Probably you know that already, but it doesn't hurt anyway. > In French and most Romance languages comma is the standard decimal sep; and either space or dot is used, when necessary, to sep > thousands. (It's veeery difficult for me to read even short numbers with commas used as thousand separator.) > > en: 1,234,567.89 > fr: 1.234.567,89 > or: 1 234 567,89 Thanks for the informative comment. It looks like your needs are best met by Proposal II where those would be written as: en_num = format(x, "12T, 2f") fr_num = format(x, "12T.,2f") or_num = format(x, "12T ,2f") Raymond From leif.walsh at gmail.com Thu Mar 12 09:29:40 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Thu, 12 Mar 2009 04:29:40 -0400 Subject: [Python-ideas] [Python-Dev] Ext4 data loss In-Reply-To: <20090312082429.47dc09c8@o> References: <49B7C6BA.8010100@cheimes.de> <200903120948.28250.steve@pearwood.info> <20090312082429.47dc09c8@o> Message-ID: <1236846580.5214.4.camel@swarley> On Thu, 2009-03-12 at 08:24 +0100, spir wrote: > > Would there be interest in a filetools module? Replies and discussion to > > python-ideas please. > > Sure. +1 > Also: a programmer is not (always) a filesystem expert. > > PS: What I meant is: the point of view from the filesystem is very different. A proper interface will have to take the programmer's point of view while exposing the filesystem issues.
> I think (like always at the interface of two worlds -- cf specification talks between developer and client ;-) *terminology* choices will be very important. Dealing with different types of OSes and filesystems in a generic way is difficult. I would urge everyone to err on the side of less generality, because I think it would be better for a programmer to write bad code, and be able to figure out why, than to write code that looks perfectly fine, and have a harder time discovering the problem. -- Cheers, Leif From stephen at xemacs.org Thu Mar 12 10:18:28 2009 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 12 Mar 2009 18:18:28 +0900 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> Message-ID: <87ab7r3xqz.fsf@xemacs.org> Raymond Hettinger writes: > Thanks for the informative comment. It looks like your needs are > best met by Proposal II where those would be written as: > > en_num = format(x, "12T, 2f") > fr_num = format(x, "12T.,2f") > or_num = format(x, "12T ,2f") That is way unreadable to me, especially the difference between en_num and or_num. Also, I wonder if en_num = format(x, "12T,.2f") isn't more explicit. From eric at trueblade.com Thu Mar 12 10:33:54 2009 From: eric at trueblade.com (Eric Smith) Date: Thu, 12 Mar 2009 05:33:54 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> Message-ID: <49B8D702.4040004@trueblade.com> Thanks for doing this, Raymond.
I don't have any comments on the specific proposals, yet. I'm still thinking it over. But here are a few comments. Raymond Hettinger wrote: > Motivation: You might want to mention the existing 'n' format type. I don't think it's widely known. It handles the odd cases of locales that have odd groupings, such as James Knight's example from India (1,00,00,00,000). James: If you know the locale name for that, I'd like to know it. It would be handy for testing. floats are not terribly useful for 'n', however: >>> format(1000000, 'n') '1,000,000' >>> format(1000000.111111, 'n') '1e+06' >>> format(100000.111111, 'n') '100,000' > Proposal I (from Nick Coghlan]: > A comma will be added to the format() specifier mini-language: > > [[fill]align][sign][#][0][minimumwidth][,][.precision][type] Could you add the existing PEP-3101 specifier, just so we know what we're changing (and so that I don't have to look it up constantly!)? [[fill]align][sign][#][0][width][.precision][type] (As an aside, I copied this from http://docs.python.org/library/string.html#formatstrings, I just noticed that PEP 3101 differs in the name of the width/minwidth field.) > for hundreds or ten-thousands), but iIt comes at the expense of Typo (iIt). > Also, it makes it > more challenging to write custom __format__ methods that follow the > format specification mini-language. For this exact reason, I've always wanted to add a method somewhere that parses the mini-language. The code exists in the C implementation, it would just need to be exposed, probably returning a namedtuple with the various fields. > For the locale module, just the "T" is necessary in a formatting string > since the tool already has procedures for figuring out the actual > separators from the local context. Is this needed at all? That is, having just the "T"? How is this different from using type=n? Having asked the question, I guess the answer is it lets you use it with the more useful float type=f. 
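The spec-parsing helper Eric wishes for might be sketched as below; the function name, the namedtuple fields, and the regular expression are invented here, and the pattern covers only the existing PEP 3101 grammar quoted above (without the proposed comma):

```python
import re
from collections import namedtuple

# Field names mirror [[fill]align][sign][#][0][width][.precision][type]
FormatSpec = namedtuple(
    "FormatSpec", "fill align sign alt zero width precision type")

_SPEC = re.compile(r"""
    (?:(?P<fill>.)?(?P<align>[<>=^]))?
    (?P<sign>[-+ ])?
    (?P<alt>\#)?
    (?P<zero>0)?
    (?P<width>\d+)?
    (?:\.(?P<precision>\d+))?
    (?P<type>[bcdeEfFgGnosxX%])?
    $""", re.VERBOSE)

def parse_format_spec(spec):
    """Split a PEP 3101 format spec into its fields (None when absent)."""
    m = _SPEC.match(spec)
    if m is None:
        raise ValueError("invalid format spec: %r" % spec)
    return FormatSpec(**m.groupdict())

print(parse_format_spec("8.2f"))
print(parse_format_spec("0>10d"))
```

A custom `__format__` method could then dispatch on the returned fields instead of re-implementing the parse.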
> Comments and suggestions are welcome but I draw the line at Mayan > numbering conventions ;-) That's only a problem until December 21, 2012 anyway! Eric. From solipsis at pitrou.net Thu Mar 12 12:03:18 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 12 Mar 2009 11:03:18 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> Message-ID: Raymond Hettinger writes: > or_num = format(x, "12T ,2f") In many cases, the space would have to be a non-breaking space, but it's probably too complicated for your PEP. Regards Antoine. From ggpolo at gmail.com Thu Mar 12 12:33:18 2009 From: ggpolo at gmail.com (Guilherme Polo) Date: Thu, 12 Mar 2009 08:33:18 -0300 Subject: [Python-ideas] Fwd: Adding a test discovery into Python In-Reply-To: References: <7C7FFEC01D01407CBC9BBD84092D42E6@RaymondLaptop1> Message-ID: On Thu, Mar 12, 2009 at 12:04 AM, Leonardo Santagada wrote: > > On Mar 11, 2009, at 8:05 PM, Guilherme Polo wrote: >> >> On Wed, Mar 11, 2009 at 7:37 PM, Raymond Hettinger wrote: >>> >>> [Christian Heimes] >>>>>> >>>>>> I'm +1 for a simple (!) test discovery system. I'm emphasizing on >>>>>> simple >>>>>> because there are enough frameworks for elaborate unit testing. >>> >>> Test discovery is not the interesting part of the problem. >> >> Interesting or not, it is a problem that is asking for a solution, >> this kind of code is being duplicated in several places for no good >> reason. >> >>> >>> Axiom: The more work involved in writing tests, the fewer >>> tests that will get written. >> >> At some point you will have to run them too, I don't think you want to >> reimplement the discovery part yet another time.
> > What I think he was getting at is that 20-30 lines of test discovery have to > be written once for each project (or none if using py.test/nose), but > self.assertequals and all of the other quirks of unittest are all over a > test suite and you need to write all of it each time you have to make a > test. > > Not that what you are trying to do is pointless, but fixing this other > problem is so much more interesting... > This is incredibly pointless if you think it this way, "only 20-30 lines". I really don't believe you will come up with something decent in 20-30 lines if you intend this to be reusable for nose and maybe py.test (although I haven't looked much into py.test), it is not just about finding files, have you read the previous emails in the discussion? > > -- > Leonardo Santagada > santagada at gmail.com -- -- Guilherme H. Polo Goncalves From wilk at flibuste.net Thu Mar 12 13:25:36 2009 From: wilk at flibuste.net (William Dode) Date: Thu, 12 Mar 2009 12:25:36 +0000 (UTC) Subject: [Python-ideas] float vs decimal References: <50697b2c0903111248o1f1c62afi10ecfb08f78f0bef@mail.gmail.com> <50697b2c0903111439k21171e3t19d18bff1601d0fd@mail.gmail.com> Message-ID: On 11-03-2009, Chris Rebert wrote: [...] > It's in the FAQ section of the decimal module - > http://docs.python.org/library/decimal.html : > > 17. Is there a way to convert a regular float to a Decimal? > A. Yes, all binary floating point numbers can be exactly expressed as > a Decimal. An exact conversion may take more precision than intuition > would suggest, so we trap Inexact to signal a need for more precision: > def float_to_decimal(f): > [definition snipped] > > 17. Why isn't the float_to_decimal() routine included in the module? > A. There is some question about whether it is advisable to mix binary > and decimal floating point.
> Also, its use requires some care to avoid > the representation issues associated with binary floating point: >>>> float_to_decimal(1.1) > Decimal('1.100000000000000088817841970012523233890533447265625') I understand. It means that explicit is better and that we should know what we do with decimal numbers... thanks -- William Dodé - http://flibuste.net Informaticien Indépendant From solipsis at pitrou.net Thu Mar 12 13:28:59 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 12 Mar 2009 12:28:59 +0000 (UTC) Subject: [Python-ideas] Fwd: Adding a test discovery into Python References: <7C7FFEC01D01407CBC9BBD84092D42E6@RaymondLaptop1> Message-ID: Guilherme Polo writes: > > This is incredible pointless if you think it this way, "only 20-30 > lines". I really don't believe you will come up with something decent > in 20-30 lines if you intend this to be reusable for nose and maybe > py.test (although I haven't looked much into py.test), it is not just > about finding files, have you read the previous emails in the > discussion ? +1. self.assertEqual is just a typing annoyance compared to the burden of getting the aforementioned "20-30 lines" right. (you can probably even rip off the nose.tools module and use its convenience functions within unittest-driven tests: ok_, eq_, etc.) From suraj at barkale.com Thu Mar 12 14:59:23 2009 From: suraj at barkale.com (Suraj Barkale) Date: Thu, 12 Mar 2009 13:59:23 +0000 (UTC) Subject: [Python-ideas] cd statement? References: <49B67E6C.6020206@molden.no> <9bfc700a0903100917w863e358l7bccf5513283d451@mail.gmail.com> <49B69558.3090000@molden.no> Message-ID: Sturla Molden writes: > > Arnaud Delobelle wrote: > > Have you tried IPython? > Yes, it has all that I miss, but it's ugly (at least on Windows, where > it runs in a DOS shell). It is getting there. The 0.9 release had wx interface with minimal functionality. I have crossed my fingers for 0.10 release.
Regards, Suraj From dangyogi at gmail.com Thu Mar 12 16:40:53 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Thu, 12 Mar 2009 11:40:53 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> Message-ID: <49B92D05.4010807@gmail.com> Raymond Hettinger wrote: > James Knight observed that Indian/Pakistani numbering systems > group by hundreds. I'm not 100% sure here, but I believe that in India, they insert a separator after the first 3 digits, then another after 2 more digits, then every 3 digits after that (not sure if they use commas or periods, I think commas): 1,000,000,00,000 -bruce From bruce at leapyear.org Thu Mar 12 19:17:06 2009 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 12 Mar 2009 11:17:06 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49B92D05.4010807@gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> Message-ID: > Make both the thousands separator and decimal separator user specifiable > but not locale aware. -1.0 as it stands (or -1,0 if you prefer) When you say 'user' you mean 'developer'. Having the developer choose the separators means it *won't* be what the user wants. Why would you stick in separators if not to display to a user? If I'm French then all decimal points should be ',' not '.' regardless of what language the developer speaks, right? A format specifier that says "please use the locale-specific separators when formatting this number" would be fine. We already have 'n' for this but suppose we choose ';' as the character for this (chosen because it looks like a '.' or a ',' which are two of the three most common choices).
For example format(x, '6;d') == format(x, '6n') and you can use ';' with any number type: format(x, '6;.3f') or format(x, '10;g'). I'd be inclined to always group in units of four digits if someone writes format(x, '6;x'). --- Bruce From guido at python.org Thu Mar 12 19:21:05 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Mar 2009 11:21:05 -0700 Subject: [Python-ideas] Adding a test discovery into Python In-Reply-To: References: <7C7FFEC01D01407CBC9BBD84092D42E6@RaymondLaptop1> Message-ID: On Wed, Mar 11, 2009 at 11:47 PM, Stefan Behnel wrote: > Raymond Hettinger wrote: >> I'm strongly for offering tools that make it easier to write >> the tests in the first place. The syntax used by py.test >> and nose is vastly superior to the one used by unittest.py, >> a module that is more Javathonic than Pythonic. >> [...] >> Factoid of the Day: In Py2.7's test_datetime module, >> the phrase self.assertEqual occurs 578 times. > > Doesn't that just scream for using a doctest instead? > > The interpreter driven type-think-copy-paste pattern works pretty well for > these things. That depends on how well all the other tests (those that *don't* use assertEquals) fit in doctest's mold. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Thu Mar 12 19:28:38 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 11:28:38 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> Message-ID: <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> [Bruce Leban] > If I'm French then all decimal points should be ',' not '.' regardless > of what language the developer speaks, right? We already have a locale aware solution and that should be used for internationalized apps.
The locale module is not going away. This proposal is for everyday programs for local consumption (most scripts never get internationalized). I would even venture that most Python scripts are not written by professional programmers. If an accountant needs to knock out a quick report, he/she should have a simple means of basic formatting without invoking all of the locale machinery. Raymond From guido at python.org Thu Mar 12 19:42:21 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Mar 2009 11:42:21 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> Message-ID: Raymond, aren't you equating "local" with the US? The locale module lets you take the locale as a separate parameter. I agree we should not try to duplicate it (though it's a bad API since it relies on global state -- that doesn't work very well in multi-threaded or web apps). But it does make sense for an accountant in France or Holland to hardcode her desire for a decimal comma and thousand-separating periods, as otherwise their boss won't be able to interpret the output. On Thu, Mar 12, 2009 at 11:28 AM, Raymond Hettinger wrote: > > [Bruce Leban] >> >> If I'm French then all decimal points should be ',' not '.' regardless >> of what language the developer speaks, right? > > We already have a locale aware solution and that should > be used for internationalized apps. The locale module > is not going away. > > This proposal is for everyday programs for local consumption > (most scripts never get internationalized). I would even > venture that most Python scripts are not written by > professional programmers.
If an accountant needs to knock out > a quick report, he/she should have a simple means of basic formatting > without invoking all of the locale machinery. > > > Raymond > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From pyideas at rebertia.com Thu Mar 12 19:44:13 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Thu, 12 Mar 2009 11:44:13 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49B92D05.4010807@gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> Message-ID: <50697b2c0903121144l48a89a08p35bb0fb902896808@mail.gmail.com> On Thu, Mar 12, 2009 at 8:40 AM, Bruce Frederiksen wrote: > Raymond Hettinger wrote: >> >> James Knight observed that Indian/Pakistani numbering systems >> group by hundreds. > > I'm not 100% sure here, but I believe that in India, they insert a separator > after the first 3 digits, then another after 2 more digits, then every 3 > digits after that (not sure if they use commas or periods, I think commas): > > 1,000,000,00,000 Not quite.
I'm not Indian, but based off Wikipedia (http://en.wikipedia.org/wiki/Lakh): "after the first three digits, a comma divides every two rather than every three digits, thus: Indian system: 12,12,12,123 5,05,000 7,00,00,00,000" Cheers, Chris -- I have a blog: http://blog.rebertia.com From python at rcn.com Thu Mar 12 20:02:07 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 12:02:07 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> Message-ID: <7120107372EF49CC98939762F1C6D318@RaymondLaptop1> [GvR] > Raymond, aren't you equating "local" with the US? Not at all. "For local consumption" meant anything that isn't distributed as a fully internationalized app. Right now, all our reprs and string interpolations are not locale-aware (i.e. float reprs are hardwired to use periods for the decimal separator). Those tools are pretty useful to us in day-to-day work. I'm just proposing to extend those non-locale-aware capabilities to include a thousands separator. For a fully internationalized app, I would use something like Babel which addresses the challenge in a comprehensive and uniform manner. > The local module lets you take the locale as a separate parameter. I > agree we should not try to duplicate it (though it's a bad API since > it relies on global state -- that doesn't work very well in > multi-threaded or web apps). > > But it does make sense for an accountant in France or Holland to > hardcode her desire for a decimal comma and thousand-separating > periods, as otherwise their boss won't be able to interpret the > output. Well said. 
Raymond From jimjjewett at gmail.com Thu Mar 12 20:29:15 2009 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 12 Mar 2009 15:29:15 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> Message-ID: On 3/12/09, Raymond Hettinger wrote: > If an accountant needs to knock out > a quick report, he/she should have a simple means of basic > formatting without invoking all of the locale machinery. Fair enough. But what does a thousands separator provide that the "n" type doesn't already provide? (Well, except that n isn't as well known -- but initially this won't be either.) Do you want to avoid using locale even in the background? Do you want to avoid having to set a locale in the program startup? Do you want a better default for locale? Do you really want a different type, such as "m" for money? (That sounds sensible to me, except that there are so many different standard ways to format money, even within the US, so I'm not sure a single format would do it.) -jJ From python at rcn.com Thu Mar 12 20:51:07 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 12:51:07 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> Message-ID: [Jim Jewett] > Fair enough. But what does a thousands separator provide that the "n" > type doesn't already provide? (Well, except that n isn't as well > known -- but initially this won't be either.) It's nice to have a non-locale aware alternative so you can say explicitly what you want.
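The explicit alternative is the spelling PEP 378 ultimately standardized: the ',' option, which landed in Python 2.7 and 3.1. A quick sketch of the two routes on a current interpreter:

```python
# Explicit, locale-independent grouping: no global state involved.
print(format(1234567.891, ',.2f'))   # -> 1,234,567.89

# The locale-aware alternative via the 'n' type needs process-wide
# state set up first (and the locale must be installed on the machine):
# import locale
# locale.setlocale(locale.LC_ALL, '')
# print(format(1234567, 'n'))
```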
This is especially helpful in Guido's example where you need to format for a different locale than the one that is currently on your machine (i.e. the global state doesn't match the target). FWIW, C-Sharp provides both ways, a locale aware "n" format and a hard-wired explicit thousands separator. See the updated PEP for examples and a link.

> Do you want to avoid using locale even in the background?

I thought locale was always there.

> Do you want to avoid having to set a locale in the program startup?

Yes. I don't think most casual users should have to figure that out. It's a little too magical and arcane:

>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'English_United States.1252')

> Do you want a better default for locale?

The default does suck:

>>> format(1237, "n")
'1237'

> Do you really want a different type, such as "m" for money?

I don't but I'm sure someone does. I did write a money formatter sample recipe for the decimal docs so people would have something to model from. FWIW, I've always thought it weird that the currency symbol could shift with a locale setting. ISTM that if you change the symbol, you also have to change the amount that goes with it :-)

Raymond

From mrs at mythic-beasts.com Thu Mar 12 21:24:10 2009 From: mrs at mythic-beasts.com (Mark Seaborn) Date: Thu, 12 Mar 2009 20:24:10 +0000 (GMT) Subject: [Python-ideas] CapPython's use of unbound methods Message-ID: <20090312.202410.846948621.mrs@localhost.localdomain> Guido asked me to explain why the removal of unbound methods in Python 3.0 causes a problem for enforcing encapsulation in CapPython (an object-capability subset of Python), which I talked about in a blog post [1]. It also came up on python-dev [2]. Let me try a slightly different example to answer Guido's immediate question. Suppose we have an object x with a private attribute, "_field", defined by a class Foo:

class Foo(object):

    def __init__(self):
        self._field = "secret"

x = Foo()

Suppose CapPython code is handed x.
It should not be able to read x._field, and the expression x._field will be rejected by CapPython's static verifier. However, in Python 3.0, the CapPython code can do this:

class C(object):

    def f(self):
        return self._field

C.f(x) # returns "secret"

Whereas in Python 2.x, C.f(x) would raise a TypeError, because C.f is not being called on an instance of C. Guido said, "I don't understand where the function object f gets its magic powers". The answer is that function definitions directly inside class statements are treated specially by the verifier. If you wrote the same function definition at the top level:

def f(var):
    return var._field # rejected

the attribute access would be rejected by the verifier, because "var" is not a self variable, and private attributes may only be accessed through self variables. I renamed the variable in the example, but the name of the variable makes no difference to whether it is considered to be a self variable. Self variables are defined as follows: If a function definition "def f(v1, ...)" appears immediately within a "class" statement, the function's first argument, v1, is a self variable, provided that:

 * the "def" is not preceded by any decorators, and
 * "f" is not read anywhere in class scope and is not declared as global.

The reason for these two restrictions is to prevent the function object from escaping and being used directly.
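Mark's example runs as described on Python 3, where C.f is an ordinary function and the Python 2 instance check is gone (a self-contained demonstration of the behaviour, Python 3 only):

```python
class Foo(object):
    def __init__(self):
        self._field = "secret"

x = Foo()

class C(object):
    def f(self):
        return self._field

# Python 3: C.f is a plain function, so nothing checks that its first
# argument is actually a C instance.
print(C.f(x))  # prints: secret

# Python 2.x raised instead:
#   TypeError: unbound method f() must be called with C instance
#   as first argument (got Foo instance instead)
```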
Mark [1] http://lackingrhoticity.blogspot.com/2008/09/cappython-unbound-methods-and-python-30.html [2] http://mail.python.org/pipermail/python-dev/2008-September/082499.html From solipsis at pitrou.net Thu Mar 12 21:33:03 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 12 Mar 2009 20:33:03 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> Message-ID: Jim Jewett writes: > > Do you want to avoid using locale even in the background? > Do you want to avoid having to set a locale in the program startup? > Do you want a better default for locale? As Guido said, a problem is that locale relies on shared state. It makes it very painful to use (any module setting the locale to a value which suits its semantics can negatively impact other modules or libraries in your application). But even worse is that the desired locale is not necessarily installed. For example if I develop an app for French users but it is hosted on a US server, perhaps the 'fr_FR' locale won't be available at all. From eric at trueblade.com Thu Mar 12 22:24:01 2009 From: eric at trueblade.com (Eric Smith) Date: Thu, 12 Mar 2009 17:24:01 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> Message-ID: <49B97D71.5070107@trueblade.com> Raymond Hettinger wrote: > > [Jim Jewett] >> Fair enough. But what does a thousands separator provide that the "n" >> type doesn't already provide? (Well, except that n isn't as well >> known -- but initially this won't be either.)
> > It's nice to have a non-locale aware alternative so you can say > explicitly what you want. This is especially helpful in Guido's example > where you need to format for a different locale than the one that is > currently on your machine (i.e. the global state doesn't match the target).

I've always thought that we should have a utility function which formats a number based on the same settings that are in the locale, but not actually use the locale. Something like:

format_number(123456787654321.123, decimal_point=',', thousands_sep=' ', grouping=[4, 3, 2])
>>> '12 34 56 78 765 4321,123'

That would get rid of threading issues, and you wouldn't have to worry about what locales were installed. I basically have this function in the various formatting routines; it just needs to be pulled out and exposed.

>> Do you really want a different type, such as "m" for money? > > I don't but I'm sure someone does. I did write a money formatter > sample recipe for the decimal docs so people would have something > to model from.

This becomes easier with the hypothetical "format_number" routine. But this is all orthogonal to the str.format() discussion.

Eric.

From guido at python.org Thu Mar 12 22:33:23 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 12 Mar 2009 14:33:23 -0700 Subject: [Python-ideas] CapPython's use of unbound methods In-Reply-To: <20090312.202410.846948621.mrs@localhost.localdomain> References: <20090312.202410.846948621.mrs@localhost.localdomain> Message-ID: On Thu, Mar 12, 2009 at 1:24 PM, Mark Seaborn wrote: > Guido asked me to explain why the removal of unbound methods in Python > 3.0 causes a problem for enforcing encapsulation in CapPython (an > object-capability subset of Python), which I talked about in a blog > post [1]. It also came up on python-dev [2]. > > Let me try a slightly different example to answer Guido's immediate > question.
> Suppose we have an object x with a private attribute, "_field", defined by a class Foo:
>
> class Foo(object):
>
>     def __init__(self):
>         self._field = "secret"
>
> x = Foo()

Can you add some principals to this example? Who wrote the Foo class definition? Does CapPython have access to the source code for Foo? To the class object? > Suppose CapPython code is handed x. What does it mean to "hand x to CapPython"? Who "is" CapPython? > It should not be able to read > x._field, and the expression x._field will be rejected by CapPython's > static verifier.
>
> However, in Python 3.0, the CapPython code can do this:
>
> class C(object):
>
>     def f(self):
>         return self._field
>
> C.f(x) # returns "secret"
>
> Whereas in Python 2.x, C.f(x) would raise a TypeError, because C.f is > not being called on an instance of C.

In Python 2.x I could write

class C(Foo):
    def f(self):
        return self._field

or alternatively

class C(x.__class__):

> Guido said, "I don't understand where the function object f gets its > magic powers". > > The answer is that function definitions directly inside class > statements are treated specially by the verifier.

Hm, this sounds like a major change in language semantics, and if I were Sun I'd sue you for using the name "Python" in your product. :-)

> If you wrote the same function definition at the top level:
>
> def f(var):
>     return var._field # rejected
>
> the attribute access would be rejected by the verifier, because "var" > is not a self variable, and private attributes may only be accessed > through self variables. > > I renamed the variable in the example,

What do you mean by this?

> but the name of the variable > makes no difference to whether it is considered to be a self variable.
> > Self variables are defined as follows: > > If a function definition "def f(v1, ...)" appears immediately within a > "class" statement, the function's first argument, v1, is a self > variable, provided that:
> * the "def" is not preceded by any decorators, and
> * "f" is not read anywhere in class scope and is not declared as global.
> > The reason for these two restrictions is to prevent the function > object from escaping and being used directly.

Do you also catch things like

g = getattr
s = 'field'.replace('f', '_f')
print g(x, s)

?

> Mark > > [1] http://lackingrhoticity.blogspot.com/2008/09/cappython-unbound-methods-and-python-30.html > [2] http://mail.python.org/pipermail/python-dev/2008-September/082499.html -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Thu Mar 12 22:36:14 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 13 Mar 2009 10:36:14 +1300 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> Message-ID: <49B9804E.3010501@canterbury.ac.nz> Bruce Leban wrote: > When you say 'user' you mean 'developer'. Having the developer choose > the separators means it *won't* be what the user wants. Why would you > stick in separators if not to display to a user? I agree. I don't see a use case for hard-coding non-standard separators into every format string. So I'm +1 on proposal I and -1 on proposal II. Also +1 on providing a "use the locale" option that's orthogonal to the type specifier.
-- Greg From solipsis at pitrou.net Thu Mar 12 23:25:03 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 12 Mar 2009 22:25:03 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <49B9804E.3010501@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > I agree. I don't see a use case for hard-coding non-standard > separators into every format string. Sorry, but what do you call "non-standard" exactly? From eric at trueblade.com Fri Mar 13 00:31:39 2009 From: eric at trueblade.com (Eric Smith) Date: Thu, 12 Mar 2009 19:31:39 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49B97D71.5070107@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> Message-ID: <49B99B5B.7060808@trueblade.com> Eric Smith wrote: > I've always thought that we should have a utility function which formats > a number based on the same settings that are in the locale, but not > actually use the locale. 
Something like: > > format_number(123456787654321.123, decimal_point=',', thousands_sep=' ', > grouping=[4, 3, 2]) > >>> '12 34 56 78 765 4321,123'

To be maximally useful (for example, so it could be used in Decimal to implement locale formatting), maybe it should work on strings:

>>> format_number(whole_part='123456787654321',
                  fractional_part='123',
                  decimal_point=',',
                  thousands_sep=' ',
                  grouping=[4, 3, 2])
'12 34 56 78 765 4321,123'

>>> format_number(whole_part='123456787654321',
                  decimal_point=',',
                  thousands_sep='.',
                  grouping=[4, 3, 2])
'12.34.56.78.765.4321'

I think such a method, along with locale.localeconv(), would be the workhorse for much of formatting we've been talking about. It could be fleshed out with the sign and other remaining fields from localeconv(). The key point is that it takes everything as parameters and doesn't use any global state. In particular, it by itself would not reference the locale. I'll probably add such a routine anyway, even if it doesn't get documented as a public API.

Eric.

From python at rcn.com Fri Mar 13 00:37:30 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 16:37:30 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> Message-ID: <2385B41B46D1491388AED0C1CE7E3060@RaymondLaptop1> Today's updates to http://www.python.org/dev/peps/pep-0378/

* Specify what width means when thousands separators are present.
* Clarify that the locale module is not being proposed to change.
* Add research on what is done in C-Sharp, MS-Excel, COBOL, and CommonLisp.
* Add more examples.
Raymond From dangyogi at gmail.com Fri Mar 13 00:41:59 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Thu, 12 Mar 2009 19:41:59 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49B99B5B.7060808@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> Message-ID: <49B99DC7.2020405@gmail.com> Eric Smith wrote: > >>> format_number(whole_part='123456787654321', > decimal_point=',', > thousands_sep='.', > grouping=[4, 3, 2]) > >>> '12.34.56.78.765.4321' > Maybe the 'thousands_sep' parameter should be called 'grouping_sep' (since it doesn't always group by thousands)? -bruce frederiksen From python at rcn.com Fri Mar 13 00:57:19 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 16:57:19 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> Message-ID: <9F3FF5E648CF4248A133D40E7B35C93D@RaymondLaptop1> >> I've always thought that we should have a utility function which formats >> a number based on the same settings that are in the locale, but not >> actually use the locale. 
Something like: >> >> format_number(123456787654321.123, decimal_point=',', thousands_sep=' ', >> grouping=[4, 3, 2]) >> >>> '12 34 56 78 765 4321,123' > > To be maximally useful (for example, so it could be used in Decimal to > implement locale formatting), maybe it should work on strings: > > >>> format_number(whole_part='123456787654321', > fractional_part='123', > decimal_point=',', > thousands_sep=' ', > grouping=[4, 3, 2]) > >>> '12 34 56 78 765 4321,123'

Whoa guys! I think you're treading very far away from and rejecting the whole idea of PEP 3101 which was to be the one ring to bind them all with format(obj, fmt) having just two arguments and doing nothing but passing them on to obj.__format__() which would be responsible for parsing a format string.

Also, even if you wanted a flexible clear separate tool just for number formatting, I don't think keyword arguments are the way to go. That is a somewhat heavy approach with limited flexibility. The research in PEP 378 shows that for languages needing fine control and extreme versatility in formatting, some kind of picture string is the way to go. MS Excel is a champ at number/date formatting strings: #,##0 and whatnot. They allow negatives to have placeholders, trailing minus signs, parentheses, etc. Columns can be aligned neatly, any type of padding can be used, any type of separator may be specified. The COBOL picture statements also offer flexibility and clarity. Mini-languages of some sort beat the heck out of functions with a zillion optional arguments.

Raymond

"Working with creative thinkers can be like herding cats."
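Raymond's Excel comparison can be made concrete. The everyday Excel picture '#,##0.00' maps roughly onto the mini-language extension under discussion, shown here with the ',' spelling the PEP's later drafts adopted; the correspondence is illustrative, not exact:

```python
# Excel picture      format() spec (with the proposed ',' option)
# #,##0          ->  format(1234567, ',')       == '1,234,567'
# #,##0.00       ->  format(1234567.5, ',.2f')  == '1,234,567.50'
# (#,##0)        ->  no direct equivalent; parenthesized negatives
#                    would still need custom handling
print(format(1234567.5, ',.2f'))
```

Pictures cover more ground (digit placeholders, trailing minus, parentheses), which is the point of the comparison: a mini-language scales where a pile of keyword arguments does not.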
From eric at trueblade.com Fri Mar 13 01:02:20 2009 From: eric at trueblade.com (Eric Smith) Date: Thu, 12 Mar 2009 20:02:20 -0400 Subject: [Python-ideas] locale-independent number formatting (was: Rough draft: Proposed format specifier for a thousands separator) In-Reply-To: <49B99DC7.2020405@gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <49B99DC7.2020405@gmail.com> Message-ID: <49B9A28C.9020708@trueblade.com> Bruce Frederiksen wrote: > Eric Smith wrote: >> >>> format_number(whole_part='123456787654321', >> decimal_point=',', >> thousands_sep='.', >> grouping=[4, 3, 2]) >> >>> '12.34.56.78.765.4321' >> > Maybe the 'thousands_sep' parameter should be called 'grouping_sep' > (since it doesn't always group by thousands)? > > -bruce frederiksen > thousands_sep is the locale.localeconv() name, which I suggest we use. I suggest that this particular API only support the LC_NUMERIC fields (decimal_point, grouping, thousands_sep), and that maybe we have a separate format_money which supports the LC_MONETARY fields. Eric. From eric at trueblade.com Fri Mar 13 01:08:23 2009 From: eric at trueblade.com (Eric Smith) Date: Thu, 12 Mar 2009 20:08:23 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <9F3FF5E648CF4248A133D40E7B35C93D@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <9F3FF5E648CF4248A133D40E7B35C93D@RaymondLaptop1> Message-ID: <49B9A3F7.5010407@trueblade.com> Raymond Hettinger wrote: > Whoa guys! 
I think you're treading very far away from and rejecting the > whole idea of PEP 3101 which was to be the one ring to bind them all with > format(obj, fmt) having just two arguments and doing nothing but > passing them on to obj.__format__() which would be responsible for > parsing a format string.

I completely agree. That's why I said "But this is all orthogonal to the str.format() discussion." I meant "orthogonal" in the "unrelated" sense. I'm completely on board with your PEP 378 as a simple way just to get some simple formatting into numbers.

> Also, even if you wanted a flexible clear separate tool just for number > formatting, I don't think keyword arguments are the way to go. > That is a somewhat heavy approach with limited flexibility. > The research in PEP 378 shows that for languages needing > fine control and extreme versatility in formatting, some kind > of picture string is the way to go. MS Excel is a champ > at number/date formatting strings: #,##0 and whatnot. > They allow negatives to have placeholders, trailing minus signs, > parentheses, etc. Columns can be aligned neatly, any type > of padding can be used, any type of separator may be specified. > The COBOL picture statements also offer flexibility and clarity. > Mini-languages of some sort beat the heck out of functions > with a zillion optional arguments.

I think picture-based formatting is okay and has its place, but a routine like my proposed format_number (which I know is a bad name) is really the heavy lifter for all locale-based number formatting. Decimal shouldn't really have to completely reimplement locale-based formatting, especially when it already exists in the core. I just want to expose it.

Eric.
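Eric's proposed helper is small enough to sketch here. This hypothetical implementation follows the locale.localeconv() convention that the first grouping size applies to the rightmost digits and the final size repeats; it reproduces the outputs from his examples:

```python
def format_number(whole_part, fractional_part='', decimal_point='.',
                  thousands_sep=',', grouping=(3,)):
    # Group digits from the right; the last grouping size repeats,
    # mirroring locale.localeconv()['grouping'].
    sizes = iter(grouping)
    size = next(sizes)
    digits = whole_part
    groups = []
    while digits:
        groups.append(digits[-size:])
        digits = digits[:-size]
        size = next(sizes, size)  # keep repeating the final size
    result = thousands_sep.join(reversed(groups))
    if fractional_part:
        result += decimal_point + fractional_part
    return result
```

With grouping=[3, 2] the same helper produces the Indian-style output discussed earlier in the thread: format_number('121212123', grouping=[3, 2]) gives '12,12,12,123'.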
From python at rcn.com Fri Mar 13 01:18:05 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 17:18:05 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <9F3FF5E648CF4248A133D40E7B35C93D@RaymondLaptop1> <49B9A3F7.5010407@trueblade.com> Message-ID: <5BA897EBA01C43DBB1532FA665680CEC@RaymondLaptop1> [Eric Smith] > Decimal shouldn't really > have to completely reimplement locale-based formatting, especially when > it already exists in the core. I just want to expose it. I see. Sounds like you're looking for the parser to have some hooks so that people writing new __format__ methods don't have to start from scratch. > I completely agree. That's why I said "But this is all orthogonal to the > str.format() discussion." I meant "orthogonal" in the "unrelated" sense. Makes sense. Hopefully, we can get this thread back on track for evaluating the proposal for a minor buildout to the existing mini-language. 
Raymond From eric at trueblade.com Fri Mar 13 01:22:01 2009 From: eric at trueblade.com (Eric Smith) Date: Thu, 12 Mar 2009 20:22:01 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <5BA897EBA01C43DBB1532FA665680CEC@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <9F3FF5E648CF4248A133D40E7B35C93D@RaymondLaptop1> <49B9A3F7.5010407@trueblade.com> <5BA897EBA01C43DBB1532FA665680CEC@RaymondLaptop1> Message-ID: <49B9A729.5020207@trueblade.com> Raymond Hettinger wrote: > > [Eric Smith] >> Decimal shouldn't really have to completely reimplement locale-based >> formatting, especially when it already exists in the core. I just want >> to expose it. > > I see. Sounds like you're looking for the parser to have some hooks > so that people writing new __format__ methods don't have to start > from scratch. Not necessarily hooks, but some support routines. I think the standard format specifier parser should be exposed, and also the locale-based formatter should be exposed. These are both unrelated to PEP 378, but they could be used to implement it. They'd be especially useful for non-builtin types like Decimal. > Makes sense. Hopefully, we can get this thread back on track for > evaluating the proposal for a minor buildout to the existing mini-language. Right. Apologies for hijacking it, and especially for not making it clear that I was veering off subject. Eric. 
From greg.ewing at canterbury.ac.nz Fri Mar 13 02:15:55 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 13 Mar 2009 14:15:55 +1300 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <49B9804E.3010501@canterbury.ac.nz> Message-ID: <49B9B3CB.2090906@canterbury.ac.nz> Antoine Pitrou wrote: > Greg Ewing writes: >> I agree. I don't see a use case for hard-coding non-standard >> separators into every format string. > > Sorry, but what do you call "non-standard" exactly? I mean something other than "," and ".". My point is that while it's perfectly reasonable for, e.g. a French programmer to want to format his numbers with dots and commas the other way around, it's *not* reasonable to force him to tediously specify it in each and every format specifier he writes. There needs to be some way of setting it once for the whole program, otherwise it just won't be practical. -- Greg From steve at pearwood.info Fri Mar 13 02:18:55 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 13 Mar 2009 12:18:55 +1100 Subject: [Python-ideas] [Python-Dev] Ext4 data loss In-Reply-To: References: <200903120948.28250.steve@pearwood.info> Message-ID: <200903131218.55647.steve@pearwood.info> On Thu, 12 Mar 2009 12:26:40 pm zooko wrote: > > Would there be interest in a filetools module? Replies and > > discussion to python-ideas please. > > I've been using and maintaining a few filesystem hacks for, let's > see, almost nine years now: > > http://allmydata.org/trac/pyutil/browser/pyutil/pyutil/fileutil.py > > (The first version of that was probably written by Greg Smith in > about 1999.) > > I'm sure there are many other such packages. 
A couple of quick searches of pypi turned up these two: > > http://pypi.python.org/pypi/Pythonutils > http://pypi.python.org/pypi/fs > > I wonder if any of them have the sort of functionality you're > thinking of.

Close, but not quite. I'm suggesting a module with a collection of subclasses of file that exhibit modified behaviour. For example:

class FlushOnWrite(file):
    def write(self, data):
        super(FlushOnWrite, self).write(data)
        self.flush()
    # similarly for writelines

class SyncOnWrite(FlushOnWrite):
    # ...

class SyncOnClose(file):
    # ...

plus functions which implement common idioms for safely writing data, making backups on a save, etc. A common idiom for safely over-writing a file while minimising the window of opportunity for file loss is:

* write to a temporary file and close it
* move the original to a backup location
* move the temporary file to where the original was
* if no errors, delete the backup

although when I say "common" what I really mean is that it should be common, but probably isn't :-/ The sort of file handling that is complicated and tedious to get right, and so most developers don't bother, and those that do are re-inventing the wheel. There's a couple of recipes in the Python Cookbook which might be candidates. E.g. the first edition has recipes "Versioning Filenames" by Robin Parmar and "Module: Versioned Backups" by Mitch Chapman. What I DON'T mean is pathname utilities. Nor do I mean mini-applications that operate on files, like renaming file extensions, deleting files that meet some criterion, etc. I don't think they belong in the standard library, and even if they do, they don't belong in this proposed module. My intention is to offer a standard set of tools so people can choose the behaviour that suits their application best, rather than trying to make file() a one-size-fits-all solution.
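The backup idiom Steven describes can be packaged in a few lines. A sketch, not a proposal for the module's actual API: the names are hypothetical, os.replace (Python 3.3+) provides the atomic moves, and error handling is kept minimal:

```python
import os
import tempfile

def safe_overwrite(path, data, backup_suffix='.bak'):
    """Write data to path, keeping the old contents as a backup
    until the new contents are safely on disk."""
    dirname = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=dirname)   # same filesystem as target
    try:
        with os.fdopen(fd, 'w') as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())              # force the data to disk
        backup = path + backup_suffix
        if os.path.exists(path):
            os.replace(path, backup)          # move the original aside
        os.replace(tmp, path)                 # move the new file into place
        if os.path.exists(backup):
            os.remove(backup)                 # no errors: drop the backup
    except BaseException:
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
```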
-- Steven D'Aprano From python at rcn.com Fri Mar 13 03:24:22 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 12 Mar 2009 19:24:22 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <9F3FF5E648CF4248A133D40E7B35C93D@RaymondLaptop1> <49B9A3F7.5010407@trueblade.com> <5BA897EBA01C43DBB1532FA665680CEC@RaymondLaptop1> <49B9A729.5020207@trueblade.com> Message-ID: <51AA03163D3942C9A74D633B1CD8BC0E@RaymondLaptop1> [Eric Smith] > Right. Apologies for hijacking it, and especially for not making it > clear that I was veering off subject. No problem. It was an interesting side discussion. I've updated the PEP to include your variant that doesn't use T. The examples show that it is much cleaner looking and self-evident. Raymond From jimjjewett at gmail.com Fri Mar 13 04:50:17 2009 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 12 Mar 2009 23:50:17 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49B99B5B.7060808@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> Message-ID: On 3/12/09, Eric Smith wrote: > Eric Smith wrote: >> ... formats >> a number based on the same settings that are in the locale, but not >> actually use the locale. ... > The key point is that it takes everything as parameters and doesn't use > any global state. In particular, it by itself would not reference the > locale. Why not? 
You'll need *some* default for decimal_point, and the one from localeconv makes at least as much sense as a hard-coded default. I agree that it shouldn't *change* anything in the locale, and any keywords explicitly passed in should override locale, but if it never looks at locale, you'll get patterns like import locale kw=dict(locale.localeconv()) kw['thousands_sep']=' ' new_util_func(number, **kw) -jJ From solipsis at pitrou.net Fri Mar 13 11:02:22 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 13 Mar 2009 10:02:22 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <49B9804E.3010501@canterbury.ac.nz> <49B9B3CB.2090906@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > My point is that while it's perfectly reasonable > for, e.g. a French programmer to want to format his > numbers with dots and commas the other way around, > it's *not* reasonable to force him to tediously specify > it in each and every format specifier he writes. A program that often formats numbers the same way can factor that into dedicated helpers: def format_float(f): return "{0:T.,2f}".format(f) or even: format_float = "{0:T.,2f}".format Regards Antoine.
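For comparison, the bare ',' flag from the PEP's main proposal composes the same way as the 'T' form shown here. A sketch, runnable on interpreters where PEP 378's comma option is implemented (the helper name is illustrative):

```python
# factor a project-wide float format into a single helper,
# using the ',' thousands flag from the PEP's main proposal
format_float = "{0:,.2f}".format

print(format_float(1234567.891))  # -> 1,234,567.89
```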
From greg.ewing at canterbury.ac.nz Fri Mar 13 11:26:05 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 13 Mar 2009 23:26:05 +1300 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <49B9804E.3010501@canterbury.ac.nz> <49B9B3CB.2090906@canterbury.ac.nz> Message-ID: <49BA34BD.6070005@canterbury.ac.nz> Antoine Pitrou wrote: > A program often formatting numbers the same way can factor that into dedicated > helpers: If that's an acceptable thing to do on a daily basis, then we don't need format strings at all. -- Greg From denis.spir at free.fr Fri Mar 13 11:56:02 2009 From: denis.spir at free.fr (spir) Date: Fri, 13 Mar 2009 11:56:02 +0100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49B99B5B.7060808@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> Message-ID: <20090313115602.3f9a19b9@o> Le Thu, 12 Mar 2009 19:31:39 -0400, Eric Smith s'exprima ainsi: > > I've always thought that we should have a utility function which formats > > a number based on the same settings that are in the locale, but not > > actually use the locale. 
Something like: > > > > format_number(123456787654321.123, decimal_point=',', thousands_sep=' ', > > grouping=[4, 3, 2]) > > >>> '12 34 56 78 765 4321,123' > > To be maximally useful (for example, so it could be used in Decimal to > implement locale formatting), maybe it should work on strings: > > >>> format_number(whole_part='123456787654321', > fractional_part='123', > decimal_point=',', > thousands_sep=' ', > grouping=[4, 3, 2]) > >>> '12 34 56 78 765 4321,123' > > >>> format_number(whole_part='123456787654321', > decimal_point=',', > thousands_sep='.', > grouping=[4, 3, 2]) > >>> '12.34.56.78.765.4321' I find the overall problem of providing an interface to specify a number format rather challenging. The issue I see is to design a formatting pattern that is simple, clear, _and_ practical. A practical pattern is easy to specify, but then it becomes rather illegible and/or hard to remember, while a legible one ends up excessively verbose. I have the impression, but I may well be wrong, that contrary to a format, a *formatted number* instead seems easy to scan -- with human eyes. So, as a crazy idea, I wonder whether we shouldn't let the user provide an example formatted number instead. This may address most use cases, but probably not all. To make things easier, why not specify a canonical number, such as '-123456.789', of which the user should define the formatted version? Then a smart parser could deduce the format to be applied to further numbers. Below is a purely artificial example. -123456.789 --> kg 00_123_456,79- format: unit: 'kg' unit_pos: LEFT unit_sep: ' ' thousand_sep: '_' fract_sep : ',' sign_pos: RIGHT sign_sep: None padding_char: '0' There are obvious issues: * Does rounding apply to the whole precision (number of significant digits), or to the fractional part only? Then, should the format be interpreted as the most common case (probably fract. rounding), provide a disambiguation flag, provide a flag for the non-default case only?
What if rounding applies after a big number of digits? Should we instead allow the user to provide a longer number? * Similar for padding: does it apply to the length of the whole number or to the integral part (common in financial apps to align decimal signs)? What if the padding applies to a smaller number of digits than that of the canonical number? Should we instead allow the user to provide a shorter number? * probably more... The space of valid formats can be specified using a parsing grammar, so that a parse failure indicates an invalid format, and a "tagged" parse tree provides all the information needed to construct a format object. Really do not know whether this idea is stupid or worth being explored ;-) [But I would well try it for personal use. At least as an everyday-fast-and-easy feature.] Denis ------ la vita e estrany From solipsis at pitrou.net Fri Mar 13 11:58:16 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 13 Mar 2009 10:58:16 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <49B9804E.3010501@canterbury.ac.nz> <49B9B3CB.2090906@canterbury.ac.nz> <49BA34BD.6070005@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > > A program often formatting numbers the same way can factor that into dedicated > > helpers: > > If that's an acceptable thing to do on a daily basis, > then we don't need format strings at all. Why exactly?
From denis.spir at free.fr Fri Mar 13 12:05:25 2009 From: denis.spir at free.fr (spir) Date: Fri, 13 Mar 2009 12:05:25 +0100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> Message-ID: <20090313120525.4e3d9aeb@o> Le Thu, 12 Mar 2009 23:50:17 -0400, Jim Jewett s'exprima ainsi: > Why not? You'll need *some* default for decimal_point, and the one > from localeconv makes at least as much sense as a hard-coded default. > > I agree that it shouldn't *change* anything in the locale, and any > keywords explicitly passed in should override locale, but if it never > looks at locale, you'll get patterns like I think this makes much sense. Actually, there may be a principle similar to 'cascade overriding' in CSS sheets: the last one who speaks wins. In the case of number formatting, this could be eg a cascade of: locale format --> coded format --> end-user config format denis ------ la vita e estrany From eric at trueblade.com Fri Mar 13 12:27:14 2009 From: eric at trueblade.com (Eric Smith) Date: Fri, 13 Mar 2009 07:27:14 -0400 Subject: [Python-ideas] String formatting utility functions Message-ID: <49BA4312.3000804@trueblade.com> Jim Jewett wrote: > On 3/12/09, Eric Smith wrote: >> Eric Smith wrote: >>> ... formats >>> a number based on the same settings that are in the locale, but not >>> actually use the locale. > ... > >> The key point is that it takes everything as parameters and doesn't use >> any global state. In particular, it by itself would not reference the >> locale. > > Why not? You'll need *some* default for decimal_point, and the one > from localeconv makes at least as much sense as a hard-coded default. 
I guess you could do this, but I can't see it ever actually being used that way. Do you really want to only specify that you're using commas for thousands, then find that someone has switched the locale to one where a comma is the decimal character? new_util_func(1234.56, thousands_sep=',') '1,234,56' Best to be explicit on what you're expecting. My use case for this function is one where all of the arguments are known and specified every time. Specifically it's for implementing 'n' formatting for Decimal or other numeric types. You will either know the arguments, or want to use every one of them from the locale. If you're using the locale, just call localeconv() and use every value you get back. I don't have a mix-and-match use case. Eric. From eric at trueblade.com Fri Mar 13 12:35:44 2009 From: eric at trueblade.com (Eric Smith) Date: Fri, 13 Mar 2009 07:35:44 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <49B9804E.3010501@canterbury.ac.nz> <49B9B3CB.2090906@canterbury.ac.nz> Message-ID: <49BA4510.4000706@trueblade.com> Antoine Pitrou wrote: > Greg Ewing writes: >> My point is that while it's perfectly reasonable >> for, e.g. a French programmer to want to format his >> numbers with dots and commas the other way around, >> it's *not* reasonable to force him to tediously specify >> it in each and every format specifier he writes. > > A program often formatting numbers the same way can factor that into dedicated > helpers: > > def format_float(f): > return "{0:T.,2f}".format(f) > > or even: > > format_float = "{0:T.,2f}".format Or: float_fmt = 'T.,2f' then you can re-use it everywhere, and multiple times in a single .format() expression: '{0:{fmt}} {1:{fmt}}'.format(3.14, 2.72, fmt=float_fmt) (Try that with %-formatting!
:-) Or with a slight modification to the work I'm doing to implement auto-numbering: '{:{fmt}} {:{fmt}}'.format(3.14, 2.78, fmt=float_fmt) (but this is a different issue!) Eric. From lie.1296 at gmail.com Fri Mar 13 12:46:32 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Fri, 13 Mar 2009 22:46:32 +1100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <20090313115602.3f9a19b9@o> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> Message-ID: spir wrote: > Le Thu, 12 Mar 2009 19:31:39 -0400, > Eric Smith s'exprima ainsi: > >>> I've always thought that we should have a utility function which formats >>> a number based on the same settings that are in the locale, but not >>> actually use the locale. Something like: >>> >>> format_number(123456787654321.123, decimal_point=',', thousands_sep=' ', >>> grouping=[4, 3, 2]) >>> >>> '12 34 56 78 765 4321,123' >> To be maximally useful (for example, so it could be used in Decimal to >> implement locale formatting), maybe it should work on strings: >> >> >>> format_number(whole_part='123456787654321', >> fractional_part='123', >> decimal_point=',', >> thousands_sep=' ', >> grouping=[4, 3, 2]) >> >>> '12 34 56 78 765 4321,123' >> >> >>> format_number(whole_part='123456787654321', >> decimal_point=',', >> thousands_sep='.', >> grouping=[4, 3, 2]) >> >>> '12.34.56.78.765.4321' > > > I find the overall problem of providing an interface to specify a number format rather challenging. The issue I see is to design a formatting pattern that is simple, clear, _and_ practicle. A practicle pattern is easy to specify, but then it becomes rather illegible and/or hard to remember, while a legible one ends up excessively verbose. 
> > I have the impression, but I may well be wrong, that contrarily to a format, a *formatted number* instead seems easy to scan -- with human eyes. So, as a crazy idea, I wonder whether we shouldn't let the user provide a example formatted number instead. This may address most of use cases, but probably not all. > > To makes things easier, why not specify a canonical number, such as '-123456.789', of which the user should define the formatted version? Then a smart parser could deduce the format to be applied to further numbers. Below a purely artificial example. > > -123456.789 --> kg 00_123_456,79- > > format: > unit: 'kg' > unit_pos: LEFT > unit_sep: ' ' > thousand_sep: '_' > fract_sep : ',' > sign_pos: RIGHT > sign_sep: None > padding_char: '0' > > There are obvious issues: > * Does rouding apply to whole precision (number of significative digits), or to the fractional part only? Then, should the format be interpreted as the most common case (probably fract. rounding), provide a disambiguation flag, provide a flag for non-default case only? What if rounding applies after a big number of digits? Should we instead allow the user providing a longer number? > * Similar for padding: does it apply to the length of the whole number or to the integral part (common in financial apps to align decimal signs). What if the padding applies to a smaller number of digits than the one of the canonical number. Should we instead allow the user providing a shorter number? > * probably more... > > The space of valid formats can be specified using a parsing grammar, so that a parse failure indicates invalid format, and a "tagged" parse tree provides all the information needed to construct a format object. > > Really do not know whether this idea is stupid or worth beeing explored ;-) [But I would well try it for personal use. At least as everyday-fast-and-easy feature.] 
Your proposal (other than being harder to implement) is similar to the way Excel handled formatting, but instead of a sample number, they use # as a placeholder. If you really want to test-implement it, better try using that. And I think it is impossible for the parser to be that smart to recognize that sign pos should be put in the rear (the smartest parser might only treat it as literal negative). Also it is highly inflexible: what about a custom positive sign? What if I want to use literal -? What about literal number? What about non-latin number? From denis.spir at free.fr Fri Mar 13 13:20:23 2009 From: denis.spir at free.fr (spir) Date: Fri, 13 Mar 2009 13:20:23 +0100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> Message-ID: <20090313132023.1becb505@o> Le Fri, 13 Mar 2009 22:46:32 +1100, Lie Ryan s'exprima ainsi: > spir wrote: > > Le Thu, 12 Mar 2009 19:31:39 -0400, > > Eric Smith s'exprima ainsi: > > > >>> I've always thought that we should have a utility function which > >>> formats a number based on the same settings that are in the locale, but > >>> not actually use the locale.
Something like: > >>> > >>> format_number(123456787654321.123, decimal_point=',', thousands_sep=' ', > >>> grouping=[4, 3, 2]) > >>> >>> '12 34 56 78 765 4321,123' > >> To be maximally useful (for example, so it could be used in Decimal to > >> implement locale formatting), maybe it should work on strings: > >> > >> >>> format_number(whole_part='123456787654321', > >> fractional_part='123', > >> decimal_point=',', > >> thousands_sep=' ', > >> grouping=[4, 3, 2]) > >> >>> '12 34 56 78 765 4321,123' > >> > >> >>> format_number(whole_part='123456787654321', > >> decimal_point=',', > >> thousands_sep='.', > >> grouping=[4, 3, 2]) > >> >>> '12.34.56.78.765.4321' > > > > > > I find the overall problem of providing an interface to specify a number > > format rather challenging. The issue I see is to design a formatting > > pattern that is simple, clear, _and_ practicle. A practicle pattern is > > easy to specify, but then it becomes rather illegible and/or hard to > > remember, while a legible one ends up excessively verbose. > > > > I have the impression, but I may well be wrong, that contrarily to a > > format, a *formatted number* instead seems easy to scan -- with human > > eyes. So, as a crazy idea, I wonder whether we shouldn't let the user > > provide a example formatted number instead. This may address most of use > > cases, but probably not all. > > > > To makes things easier, why not specify a canonical number, such as > > '-123456.789', of which the user should define the formatted version? > > Then a smart parser could deduce the format to be applied to further > > numbers. Below a purely artificial example. 
> > > > -123456.789 --> kg 00_123_456,79- > > > > format: > > unit: 'kg' > > unit_pos: LEFT > > unit_sep: ' ' > > thousand_sep: '_' > > fract_sep : ',' > > sign_pos: RIGHT > > sign_sep: None > > padding_char: '0' > > > > There are obvious issues: > > * Does rouding apply to whole precision (number of significative digits), > > or to the fractional part only? Then, should the format be interpreted as > > the most common case (probably fract. rounding), provide a disambiguation > > flag, provide a flag for non-default case only? What if rounding applies > > after a big number of digits? Should we instead allow the user providing > > a longer number? > > * Similar for padding: does it apply to the length of the whole number or > > to the integral part (common in financial apps to align decimal signs). > > What if the padding applies to a smaller number of digits than the one > > of the canonical number. Should we instead allow the user providing a > > shorter number? > > * probably more... > > > > The space of valid formats can be specified using a parsing grammar, so > > that a parse failure indicates invalid format, and a "tagged" parse tree > > provides all the information needed to construct a format object. > > > > Really do not know whether this idea is stupid or worth beeing > > explored ;-) [But I would well try it for personal use. At least as > > everyday-fast-and-easy feature.] > > Your proposal (other than being harder to implement), is similar to the > way Excel handled formatting, but instead of sample number, they uses # > for placeholder. If you really want to test-implement it, better try > using that. Right. I also think now that the "picture strings" pointed to in the PEP are a better option for such needs, though they probably cannot handle issues such as ambiguity of precision or padding without additional parameters either. The only advantage of my proposal is that the user provides an example, instead of an abstract representation.
> And I think it is impossible for the parser to be that smart to > recognize that sign pos should be put in the rear (the smartest parser > might only treat it as literal negative). Either I do not understand, or it is wrong. You can well have a parse expression allowing either a front or a rear sign, as long as there is a non-ambiguous sign-pattern. What does 'literal negative' mean? > Also it is highly inflexible, > what about custom positive sign? What if I want to use literal -? What > about literal number? What about non-latin number? ~ true. But this applies to any formatting rule, no? You have to specify eg which code point areas are allowed for valid digits -- and that must not overlap with code points allowed as sign, separators, or whatever. Custom signs are not a problem, as long as they do not conflict with digits or seps. Idem for non-latin. These points are not specific to my proposal, they apply to any kind of formatting instead. > What if I want to use literal -? What about literal number? I do not understand your point.
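As a rough prototype of the "deduce the format from an example" idea under discussion: the sketch below infers only the sign position and the two separators from a rendering of the canonical number -123456.789. It assumes Latin digits and single-run separators; the function name and result keys are made up for illustration:

```python
import re

def infer_format(formatted):
    # Deduce a few attributes from how the user renders the canonical
    # number -123456.789.  Naive sketch: Latin digits only, separators
    # are the non-digit runs between digit groups.
    fmt = {}
    if formatted.lstrip().startswith('-'):
        fmt['sign_pos'] = 'LEFT'
    elif formatted.rstrip().endswith('-'):
        fmt['sign_pos'] = 'RIGHT'
    else:
        fmt['sign_pos'] = None
    # alternate runs of digits and non-digits, sign stripped off the ends
    runs = re.findall(r'\d+|\D+', formatted.strip('- '))
    seps = [r for r in runs if not r.isdigit()]
    fmt['fract_sep'] = seps[-1] if seps else None
    fmt['thousand_sep'] = seps[-2] if len(seps) > 1 else None
    return fmt

print(infer_format('123.456,79-'))
# -> {'sign_pos': 'RIGHT', 'fract_sep': ',', 'thousand_sep': '.'}
```

Even this toy version hits the ambiguities raised above (rounding, padding, literal characters); it is only meant to show that the separator/sign part of the inference is mechanical.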
Denis ------ la vita e estrany From lie.1296 at gmail.com Fri Mar 13 16:09:28 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Sat, 14 Mar 2009 02:09:28 +1100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <20090313132023.1becb505@o> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> <20090313132023.1becb505@o> Message-ID: spir wrote: > Le Fri, 13 Mar 2009 22:46:32 +1100, > Lie Ryan s'exprima ainsi: > >> spir wrote: >>> Le Thu, 12 Mar 2009 19:31:39 -0400, >>> Eric Smith s'exprima ainsi: >>> >>>>> I've always thought that we should have a utility function which >>>>> formats a number based on the same settings that are in the locale, but >>>>> not actually use the locale. Something like: >>>>> >>>>> format_number(123456787654321.123, decimal_point=',', thousands_sep=' ', >>>>> grouping=[4, 3, 2]) >>>>> >>> '12 34 56 78 765 4321,123' >>>> To be maximally useful (for example, so it could be used in Decimal to >>>> implement locale formatting), maybe it should work on strings: >>>> >>>> >>> format_number(whole_part='123456787654321', >>>> fractional_part='123', >>>> decimal_point=',', >>>> thousands_sep=' ', >>>> grouping=[4, 3, 2]) >>>> >>> '12 34 56 78 765 4321,123' >>>> >>>> >>> format_number(whole_part='123456787654321', >>>> decimal_point=',', >>>> thousands_sep='.', >>>> grouping=[4, 3, 2]) >>>> >>> '12.34.56.78.765.4321' >>> >>> >>> I find the overall problem of providing an interface to specify a number >>> format rather challenging. The issue I see is to design a formatting >>> pattern that is simple, clear, _and_ practicle. A practicle pattern is >>> easy to specify, but then it becomes rather illegible and/or hard to >>> remember, while a legible one ends up excessively verbose. 
>>> >>> I have the impression, but I may well be wrong, that contrarily to a >>> format, a *formatted number* instead seems easy to scan -- with human >>> eyes. So, as a crazy idea, I wonder whether we shouldn't let the user >>> provide a example formatted number instead. This may address most of use >>> cases, but probably not all. >>> >>> To makes things easier, why not specify a canonical number, such as >>> '-123456.789', of which the user should define the formatted version? >>> Then a smart parser could deduce the format to be applied to further >>> numbers. Below a purely artificial example. >>> >>> -123456.789 --> kg 00_123_456,79- >>> >>> format: >>> unit: 'kg' >>> unit_pos: LEFT >>> unit_sep: ' ' >>> thousand_sep: '_' >>> fract_sep : ',' >>> sign_pos: RIGHT >>> sign_sep: None >>> padding_char: '0' >>> >>> There are obvious issues: >>> * Does rouding apply to whole precision (number of significative digits), >>> or to the fractional part only? Then, should the format be interpreted as >>> the most common case (probably fract. rounding), provide a disambiguation >>> flag, provide a flag for non-default case only? What if rounding applies >>> after a big number of digits? Should we instead allow the user providing >>> a longer number? >>> * Similar for padding: does it apply to the length of the whole number or >>> to the integral part (common in financial apps to align decimal signs). >>> What if the padding applies to a smaller number of digits than the one >>> of the canonical number. Should we instead allow the user providing a >>> shorter number? >>> * probably more... >>> >>> The space of valid formats can be specified using a parsing grammar, so >>> that a parse failure indicates invalid format, and a "tagged" parse tree >>> provides all the information needed to construct a format object. >>> >>> Really do not know whether this idea is stupid or worth beeing >>> explored ;-) [But I would well try it for personal use. 
At least as >>> everyday-fast-and-easy feature.] >> Your proposal (other than being harder to implement), is similar to the >> way Excel handled formatting, but instead of sample number, they uses # >> for placeholder. If you really want to test-implement it, better try >> using that. > > Right. I also think now that "picture strings" pointed in the PEP are a better option for such needs. While they probably cannot handle issues such as ambiguity of precision or padding without additional parameters, neither. The only advantage of my proposal is that the user provides an example, instead of an abstract representation. > >> And I think it is impossible for the parser to be that smart to >> recognize that sign pos should be put in the rear (the smartest parser >> might only treat it as literal negative). > > ? Either I do not understand, or it is wrong. Partially wrong, when I said "literal negative" I really meant "literal -". > You can well have a parse expression allowing either a front or a rear sign, as long as there is a non-ambiguous sign-pattern. > What does 'literal negative' mean? But what if I want ~ to denote negative number? >> Also it is highly inflexible, >> what about custom positive sign? What if I want to use literal -? What >> about literal number? What about non-latin number? > > ~ true. But this applies to any formatting rule, no? Yes, but using number example introduces lots of ambiguities. You must use parameters to avoid these ambiguities. > You have to specify eg which code point areas are allowed for valid digits -- and that must not overlap with code points allowed as sign, separators, or whatever. > Custom signs are not a problem, as long as they do not conflict with digits or seps. Idem for non-latin. These points are not specific to my proposal, they apply to any kind of formatting instead. How would the example format interpret this: 123 456~ When I want ~ to be the negative sign? What if I want < for negative and > for positive? 
Those are quite hypothetical, but if we're talking about languages that don't use Latin numerals, that sort of thing is very likely to happen. >> What if I want to use literal -? What about literal number? > > I do not understand your point. What if I want my number to look like this: 123-4567 An example format would have a hard time guessing whether the "-" should be a negative sign or a literal "-". Maybe you can use escape characters, but that would turn the example format's strongest point against itself. From python at rcn.com Sat Mar 14 00:40:30 2009 From: python at rcn.com (Raymond Hettinger) Date: Fri, 13 Mar 2009 16:40:30 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com><49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> Message-ID: <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> Today's updates to: http://www.python.org/dev/peps/pep-0378/ * Summarize commentary to date. * Add APOSTROPHE and non-breaking SPACE to the list of separators. * Add more links to external references. * Detail issues with the locale module. * Clarify how proposal II is parsed. From eric at trueblade.com Sat Mar 14 02:44:40 2009 From: eric at trueblade.com (Eric Smith) Date: Fri, 13 Mar 2009 21:44:40 -0400 Subject: [Python-ideas] String formatting and namedtuple In-Reply-To: <49B509A3.1080404@trueblade.com> References: <20090212141040.0c89e0fc@o> <70e6cf130902121029l17ec06e0k9d3ea91e23ecd74e@mail.gmail.com> <4999D184.3080105@trueblade.com> <499AAF4E.3020506@trueblade.com> <49B509A3.1080404@trueblade.com> Message-ID: <49BB0C08.8050309@trueblade.com> Eric Smith wrote: > I've added a patch to http://bugs.python.org/issue5237 that implements > the basic '{}' functionality in str.format.
I've added another patch to issue 5237 which I believe is production quality. I'll work on tests. >>> '{} {}'.format(1, 2) '1 2' From and-dev at doxdesk.com Sat Mar 14 02:56:41 2009 From: and-dev at doxdesk.com (And Clover) Date: Sat, 14 Mar 2009 02:56:41 +0100 Subject: [Python-ideas] str.split with padding Message-ID: <49BB0ED9.4000003@doxdesk.com> Here's a simple one I've reinvented in my own apps often enough that it might be worth adding to the built-in split() method: s.split(sep[, maxsplit[, pad]]) pad, if set True, would pad out the returned list with empty strings (strs/unicodes depending on returned datatype) so that the list was always (maxsplit+1) elements long. This allows one to do things like unpacking assignments: user, hostname= address.split('@', 1, True) without having to worry about exceptions when the number of 'sep's in the string is unexpectedly fewer than 'maxsplit'. -- And Clover mailto:and at doxdesk.com http://www.doxdesk.com/ From lie.1296 at gmail.com Sat Mar 14 03:43:28 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Sat, 14 Mar 2009 13:43:28 +1100 Subject: [Python-ideas] str.split with padding In-Reply-To: <49BB0ED9.4000003@doxdesk.com> References: <49BB0ED9.4000003@doxdesk.com> Message-ID: And Clover wrote: > Here's a simple one I've reinvented in my own apps often enough that it > might be worth adding to the built-in split() method: > > s.split(sep[, maxsplit[, pad]]) > > pad, if set True, would pad out the returned list with empty strings > (strs/unicodes depending on returned datatype) so that the list was > always (maxsplit+1) elements long. This allows one to do things like > unpacking assignments: > > user, hostname= address.split('@', 1, True) > > without having to worry about exceptions when the number of 'sep's in > the string is unexpectedly fewer than 'maxsplit'. > Can you find a better use case? For splitting email address, I think I would want to know if the address turned out to be invalid (e.g.
it does not contain exactly 1 @s) From and-dev at doxdesk.com Sat Mar 14 04:46:03 2009 From: and-dev at doxdesk.com (And Clover) Date: Sat, 14 Mar 2009 04:46:03 +0100 Subject: [Python-ideas] str.split with padding In-Reply-To: References: <49BB0ED9.4000003@doxdesk.com> Message-ID: <49BB287B.5020507@doxdesk.com> Lie Ryan wrote: > Can you find a better use case? Well here are some random uses from projects that a search on splitpad (one of the names I used for it) is turning up: command, parameters= splitpad(line, ' ', 1) # get SMTP command y, m, d= splitpad(t, '-', 2) # split date, month and day optional headers, body= splitpad(request, '\n\n', 1) # there might be no body table, column= rsplitpad(colname, '.', 1) # extract SQL [table.]column name id, cat, name, price= splitpad(line, ',', 3) # should be four columns, but editor might have lost trailing commas user, pwd= splitpad(base64.decodestring(authtoken), ':', 1) # will always contain ':' unless malformed pars= dict(splitpad(p, '=', 1) for p in input.split(';')) # no '=value' part is allowable server, version= splitpad(environ.get('SERVER_SOFTWARE', ''), '/', 1) # might not have a version And so on. (Obviously these have an internetty bias, where "be liberal in what you accept" is desirable.) > For splitting email address, I think I would want to know if the address turned > out to be invalid (e.g. it does not contain exactly 1 @s) Maybe, maybe not. In this case I wanted to accept the case of a bare username, with or without '@', as a local user. An empty string instead of an exception for a missing part is something I find very common; it kind of fits with Python's "string processing does what you usually want" behaviour (as compared to other languages that are still tediously throwing exceptions when you try to slice outside the string length range). For example with an HTTP command (eg. "GET / HTTP/1.0"): method, path, version= splitpad(command, ' ', 2) 'version' might be missing, on ancient HTTP/0.9 clients.
'path' could be missing, on malformed requests. In either of those cases I don't want an exception, and I don't particularly want to burden my split code with extra checking; I'll probably have to do further checking on 'path' anyway so setting it to an empty string is the best I can do here. The alternative I use if I can't be bothered to define splitpad() again is something like:

    parts= command.split(' ', 2)
    method= parts[0]
    path= parts[1] if len(parts)>=2 else ''
    ...

which is pretty ugly. -- And Clover mailto:and at doxdesk.com http://www.doxdesk.com/ From ncoghlan at gmail.com Sat Mar 14 04:50:25 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 14 Mar 2009 13:50:25 +1000 Subject: [Python-ideas] [Python-Dev] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BA4A12.4040402@gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B8D702.4040004@trueblade.com> <1F032CDC26874991B78E47DBDB333917@RaymondLaptop1> <49B9505C.1050803@trueblade.com> <49BA4A12.4040402@gmail.com> Message-ID: <49BB2981.5070008@gmail.com> Joining the discussion over here to add a couple of points that I haven't seen in Raymond's PEP updates on the checkin list:

1. The Single Unix Specification apparently uses an apostrophe as a flag in printf() %-formatting to request inclusion of a thousands separator in a locale aware way [1]. Since the apostrophe is much harder to mistake for a period than a comma is, I would modify my "just a flag" suggestion to use an apostrophe as the flag instead of a comma:

    [[fill]align][sign][#][0][width]['][.precision][type]

The output would still use commas though:

    format(1234, "8.1f")  --> '  1234.0'
    format(1234, "8'.1f") --> ' 1,234.0'
    format(1234, "8d")    --> '    1234'
    format(1234, "8'd")   --> '   1,234'

2.
PEP 3101 *already included* a way to modify the handling of format strings in a consistent way: use a custom string.Formatter subclass instead of relying on the basic str.format method. When the mini language parser is exposed (which I consider a separate issue from this PEP), a locale aware custom formatter is going to find a "include digit separators" flag far more useful than the overly explicit "use this thousands separator and this decimal separator". Cheers, Nick. [1] http://linux.die.net/man/3/printf -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From steve at pearwood.info Sat Mar 14 05:22:42 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 14 Mar 2009 15:22:42 +1100 Subject: [Python-ideas] str.split with padding In-Reply-To: References: <49BB0ED9.4000003@doxdesk.com> Message-ID: <200903141522.42307.steve@pearwood.info> On Sat, 14 Mar 2009 01:43:28 pm Lie Ryan wrote: > And Clover wrote: > > Here's a simple one I've reinvented in my own apps often enough > > that it might be worth adding to the built-in split() method: > > > > s.split(sep[, maxsplit[, pad]]) > > > > pad, if set True, would pad out the returned list with empty > > strings (strs/unicodes depending on returned datatype) so that the > > list was always (maxsplit+1) elements long. This allows one to do > > things like unpacking assignments: > > > > user, hostname= address.split('@', 1, True) > > > > without having to worry about exceptions when the number of 'sep's > > in the string is unexpectedly fewer than 'maxsplit'. > > Can you find a better use case? For splitting email address, I think > I would want to know if the address turned out to be invalid (e.g. it > does not contain exactly 1 @s) What makes you think that email address must contain exactly one @ sign? Email being sent locally may contain zero @ signs, and email being sent externally can contain one or more @ signs.
Andy's code:

    user, hostname= address.split('@', 1, True)

will fail on syntactically valid email addresses like this:

    fred(away @ the pub)@example.com

-- Steven D'Aprano From lie.1296 at gmail.com Sat Mar 14 05:59:18 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Sat, 14 Mar 2009 15:59:18 +1100 Subject: [Python-ideas] str.split with padding In-Reply-To: <200903141522.42307.steve@pearwood.info> References: <49BB0ED9.4000003@doxdesk.com> <200903141522.42307.steve@pearwood.info> Message-ID: Steven D'Aprano wrote: > On Sat, 14 Mar 2009 01:43:28 pm Lie Ryan wrote: >> And Clover wrote: >>> Here's a simple one I've reinvented in my own apps often enough >>> that it might be worth adding to the built-in split() method: >>> >>> s.split(sep[, maxsplit[, pad]]) >>> >>> pad, if set True, would pad out the returned list with empty >>> strings (strs/unicodes depending on returned datatype) so that the >>> list was always (maxsplit+1) elements long. This allows one to do >>> things like unpacking assignments: >>> >>> user, hostname= address.split('@', 1, True) >>> >>> without having to worry about exceptions when the number of 'sep's >>> in the string is unexpectedly fewer than 'maxsplit'. >> Can you find a better use case? For splitting email address, I think >> I would want to know if the address turned out to be invalid (e.g. it >> does not contain exactly 1 @s) > > What makes you think that email address must contain exactly one @ sign? > > Email being sent locally may contain zero @ signs, and email being sent > externally can contain one or more @ signs. Andy's code: > > user, hostname= address.split('@', 1, True) > > will fail on syntactically valid email addresses like this: > > fred(away @ the pub)@example.com

From Wikipedia: RFC invalid e-mail addresses
* Abc.example.com (character @ is missing)
* Abc.@example.com (character dot(.) is last in local part)
* Abc..123@example.com (character dot(.) is double)
* A@b@c@example.com (only one @ is allowed outside quotation marks)
* ()[]\;:,<>@example.com (none of the characters before the @ in this example are allowed outside quotation marks)

Your example is a valid email address if and only if it is enclosed in quotation marks: "fred(away @ the pub)"@example.com From lie.1296 at gmail.com Sat Mar 14 06:04:18 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Sat, 14 Mar 2009 16:04:18 +1100 Subject: [Python-ideas] str.split with padding In-Reply-To: <49BB287B.5020507@doxdesk.com> References: <49BB0ED9.4000003@doxdesk.com> <49BB287B.5020507@doxdesk.com> Message-ID: And Clover wrote: > Lie Ryan wrote: > >> Can you find a better use case? > > Well here are some random uses from projects that a search on splitpad > (one of the names I used for it) is turning up: > > command, parameters= splitpad(line, ' ', 1) # get SMTP command > y, m, d= splitpad(t, '-', 2) # split date; month and day optional > headers, body= splitpad(request, '\n\n', 1) # there might be no body > table, column= rsplitpad(colname, '.', 1) # extract SQL > [table.]column name > id, cat, name, price= splitpad(line, ',', 3) # should be four > columns, but editor might have lost trailing commas > user, pwd= splitpad(base64.decodestring(authtoken), ':', 1) # will > always contain ':' unless malformed > pars= dict(splitpad(p, '=', 1) for p in input.split(';')) # no > '=value' part is allowable > server, version= splitpad(environ.get('SERVER_SOFTWARE', ''), '/', > 1) # might not have a version > > And so on. (Obviously these have an internetty bias, where "be liberal > in what you accept" is desirable.) > >> For splitting email address, I think I would want to know if the >> address turned >> out to be invalid (e.g. it does not contain exactly 1 @s) > > Maybe, maybe not. In this case I wanted to accept the case of a bare > username, with or without '@', as a local user.
An empty string instead > of an exception for a missing part is something I find very common; it > kind of fits with Python's "string processing does what you usually > want" behaviour (as compared to other languages that are still tediously > throwing exceptions when you try to slice outside the string length range). > > For example with an HTTP command (eg. "GET / HTTP/1.0"): > > method, path, version= splitpad(command, ' ', 2) > > 'version' might be missing, on ancient HTTP/0.9 clients. 'path' could be > missing, on malformed requests. In either of those cases I don't want an > exception, and I don't particularly want to burden my split code with > extra checking; I'll probably have to do further checking on 'path' > anyway so setting it to an empty string is the best I can do here. > > The alternative I use if I can't be bothered to define splitpad() again > is something like: > > parts= command.split(' ', 2) > method= parts[0] > path= parts[1] if len(parts)>=2 else '' > .... > > which is pretty ugly. > I am honestly quite surprised: http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html From bruce at leapyear.org Sat Mar 14 07:04:26 2009 From: bruce at leapyear.org (Bruce Leban) Date: Fri, 13 Mar 2009 23:04:26 -0700 Subject: [Python-ideas] str.split with padding In-Reply-To: References: <49BB0ED9.4000003@doxdesk.com> <200903141522.42307.steve@pearwood.info> Message-ID: On Fri, Mar 13, 2009 at 9:59 PM, Lie Ryan wrote: > Steven D'Aprano wrote: > >> >> Email being sent locally may contain zero @ signs, and email being sent >> externally can contain one or more @ signs. Andy's code: >> >> user, hostname= address.split('@', 1, True) >> >> will fail on syntactically valid email addresses like this: >> >> fred(away @ the pub)@example.com >> > > From Wikipedia: > RFC invalid e-mail addresses > * Abc.example.com (character @ is missing) > * Abc.@example.com (character dot(.) is last in local part) > * Abc..123@example.com (character dot(.) is double) > * A@b@c@example.com (only one @ is allowed outside quotation marks) > * ()[]\;:,<>@example.com (none of the characters before the @ in this > example are allowed outside quotation marks) > > Your example is a valid email address if and only if it is enclosed in > quotation marks: "fred(away @ the pub)"@example.com > That is valid but not because you can have nested email addresses like that.** The (...) part is a comment. I wouldn't bet that very many mail clients handle that according to the rfc. Many don't handle quoted strings either. And there are those that have a narrow view of which characters are allowed (Hint: if you don't want to get mail from hotmail users, just make sure your email address has '/' in it.) http://www.ietf.org/rfc/rfc0822.txt **Way back people wrote nested email addresses with % replacing the @ in the nested address (sna%foo@bar). I haven't seen that for a while. On topic: Making split more complicated seems like overspecialization. Wouldn't a generic padding function be more useful? FWIW, this has been discussed before. http://bugs.python.org/issue5034 --- Bruce (sorry for the digression) -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at rcn.com Sat Mar 14 07:08:18 2009 From: python at rcn.com (Raymond Hettinger) Date: Fri, 13 Mar 2009 23:08:18 -0700 Subject: [Python-ideas] [Python-Dev] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B8D702.4040004@trueblade.com> <1F032CDC26874991B78E47DBDB333917@RaymondLaptop1><49B9505C.1050803@trueblade.com> <49BA4A12.4040402@gmail.com> <49BB2981.5070008@gmail.com> Message-ID: <103DCFB40D144238B8562441FBC0A65C@RaymondLaptop1> [Nick Coghlan] > 1. The Single Unix Specification apparently uses an apostrophe as a flag > in printf() %-formatting to request inclusion of a thousands separator > in a locale aware way [1].
We already use C-sharp's "n" flag for a locale aware thousands separator. > Since the apostrophe is much harder to > mistake for a period than a comma is, I would modify my "just a flag" > suggestion to use an apostrophe as the flag instead of a comma: . . . > The output would still use commas though: That doesn't make sense for two reasons:

1. Why mark a non-locale aware form with a flag that indicates locale awareness in another language?
2. It seems to be basic bad design to require an apostrophe to emit commas.

FWIW, the comma-only version of the proposal is probably going to die anyway. The more flexible alternative evolved to something simple and direct. Also, the newsgroup discussion makes it abundantly clear that half the world will rebel if commas are the only supported option. > 2. PEP 3101 *already included* a way to modify the handling of format > strings in a consistent way: use a custom string.Formatter subclass > instead of relying on the basic str.format method. > > When the mini language parser is exposed (which I consider a separate > issue from this PEP), a locale aware custom formatter is going to find a > "include digit separators" flag far more useful than the overly explicit > "use this thousands separator and this decimal separator". Thanks. Will note that in the PEP when I get a chance.
Raymond From denis.spir at free.fr Sat Mar 14 08:49:46 2009 From: denis.spir at free.fr (spir) Date: Sat, 14 Mar 2009 08:49:46 +0100 Subject: [Python-ideas] [Python-Dev] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <103DCFB40D144238B8562441FBC0A65C@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B8D702.4040004@trueblade.com> <1F032CDC26874991B78E47DBDB333917@RaymondLaptop1> <49B9505C.1050803@trueblade.com> <49BA4A12.4040402@gmail.com> <49BB2981.5070008@gmail.com> <103DCFB40D144238B8562441FBC0A65C@RaymondLaptop1> Message-ID: <20090314084946.50f64627@o> On Fri, 13 Mar 2009 23:08:18 -0700, "Raymond Hettinger" wrote: > Since the apostrophe is much harder to > > mistake for a period than a comma is, I would modify my "just a flag" > > suggestion to use an apostrophe as the flag instead of a comma: > . . . > > The output would still use commas though: > > That doesn't make sense for two reasons: > 1. Why mark a non-locale aware form with a flag that indicates > locale awareness in another language? > 2. It seems to be basic bad design to require an apostrophe > to emit commas. If I properly understand the PEP (by the way, congratulations for the reformulation -- the motivation section esp. is clearer and more motivat-ing) there are 2 differences between the proposals:

* choose char for thousand-sep
* choose decimal sep

> FWIW, the comma-only version of the proposal is probably going to > die anyway. The more flexible alternative evolved to something simple > and direct. Also, the newsgroup discussion makes it abundantly clear > that half the world will rebel if commas are the only supported option. If the first proposal let the user choose the thousand-sep char it would be more appealing, indeed. As is, it has no chance. Anyway, the second proposal is now rather clear and simple.
In my mind, both separators work together even when there is no possible conflict between the actual chars. +1 for version #2 (more or less as is now) I would just add: The SPACE can be either U+0020 (standard space) or U+00A0 (non-breakable space). Denis PS - OT: As the width param is the width of the whole number, how to cope with decimal point alignment, meaning that there should be integral part width/padding instead?

       123.45
         1.2
    123456.789

Maybe this need is mainly in the financial field, so that this will be implicitly addressed because of the 2-digit rounding?

       123.45
         1.20
    123456.79

------ la vita e estrany From g.brandl at gmx.net Sat Mar 14 09:55:11 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 14 Mar 2009 09:55:11 +0100 Subject: [Python-ideas] str.split with padding In-Reply-To: <49BB0ED9.4000003@doxdesk.com> References: <49BB0ED9.4000003@doxdesk.com> Message-ID: And Clover schrieb: > Here's a simple one I've reinvented in my own apps often enough that it > might be worth adding to the built-in split() method: > > s.split(sep[, maxsplit[, pad]]) > > pad, if set True, would pad out the returned list with empty strings > (strs/unicodes depending on returned datatype) so that the list was > always (maxsplit+1) elements long. This allows one to do things like > unpacking assignments: > > user, hostname= address.split('@', 1, True) > > without having to worry about exceptions when the number of 'sep's in > the string is unexpectedly fewer than 'maxsplit'. Note that for maxsplit=1, you can use str.partition().
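[Editor's note] Georg's str.partition() suggestion, sketched on the address example from this thread — the separator itself comes back as the middle element, so the unpacking needs a throwaway name but never raises:

```python
address = 'user@example.com'
user, _, hostname = address.partition('@')

# partition always returns a 3-tuple; when the separator is absent,
# the last two elements are empty strings, much like the proposed padding
local, _, host = 'postmaster'.partition('@')
print((user, hostname), (local, host))
```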
Georg From g.brandl at gmx.net Sat Mar 14 10:24:29 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 14 Mar 2009 10:24:29 +0100 Subject: [Python-ideas] Draft PEP: Standard daemon process library In-Reply-To: <874oxzxujs.fsf@benfinney.id.au> References: <87wscj11fl.fsf@benfinney.id.au> <874oxzxujs.fsf@benfinney.id.au> Message-ID: Ben Finney schrieb: > Howdy all, > > Significant changes in this release: > > * Name the daemon process context class `DaemonContext`, since it > doesn't actually represent a separate daemon. (The reference > implementation will also have a `DaemonRunner` class, but that's > outside the scope of this PEP.) > > * Implement the context manager protocol, allowing use as a 'with' > context manager or via explicit 'open' and 'close' calls. > > * Delegate PID file handling to a `pidfile` object handed to the > `DaemonContext` instance, and used simply as a context manager. > > * Simplify the set of options by using a mapping for signal handlers. > > * Target Python 3.2, since the reference implementation will very > likely not be complete in time for anything earlier. This looks like it should be submitted as a formal PEP now; that should also ensure more interest in it, and an eventual resolution. Georg From solipsis at pitrou.net Sat Mar 14 14:45:12 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 14 Mar 2009 13:45:12 +0000 (UTC) Subject: [Python-ideas] non-breaking space References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B8D702.4040004@trueblade.com> <1F032CDC26874991B78E47DBDB333917@RaymondLaptop1> <49B9505C.1050803@trueblade.com> <49BA4A12.4040402@gmail.com> <49BB2981.5070008@gmail.com> <103DCFB40D144238B8562441FBC0A65C@RaymondLaptop1> <20090314084946.50f64627@o> Message-ID: Hello, spir writes: > > I would just add: > The SPACE can be either U+0020 (standard space) or U+00A0 (non-breakable space).
Then the proposal should allow for any kind of space characters (that is, any character for which isspace() is True). There are several non-breaking space characters in the unicode character set, with varying character widths, which is important for typography rules. See http://en.wikipedia.org/wiki/Non-breaking_space for some examples. Regards Antoine (playing devil's advocate a bit - but only a bit). From jervisau at gmail.com Sat Mar 14 14:47:01 2009 From: jervisau at gmail.com (Jervis Whitley) Date: Sun, 15 Mar 2009 00:47:01 +1100 Subject: [Python-ideas] Inline assignment expression Message-ID: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> http://bugs.python.org/issue1714448 had an interesting proposal that I thought might be worthwhile discussing here.

    if something as x:

however, of greater use would be assignment expressions that allow:

    if (something as x) == other:
        # can now use x.

I propose that we implement assignment expressions that would allow assignments to be made any place that expressions are currently valid. The proposal uses the (nominal) right arrow (RARROW) '->' to indicate the assignment. The form would look like this:

    EXPR -> VAR

which translates to

    VAR = EXPR
    (EXPR)

Expression (EXPR) is evaluated and assigned to target VAR. The value of EXPR is left on the top of stack. another toy example to think about:

    while len(expensive() -> res) == 4:
        dosomething(res)

A patch has been uploaded to the named issue in the bug tracker. I encourage you to try it out (py3k at the moment). As I mentioned earlier the exact syntax is only nominal. We needn't use the RARROW if consensus is against that; it is a simple operation to change this to any of ('becomes', 'into', 'assigns' ... I look forward to your comments.
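[Editor's note, with hindsight — not part of the 2009 exchange] Python did eventually grow an assignment expression: PEP 572's ':=' operator in Python 3.8, with the target on the left rather than a trailing '-> VAR'. The toy while-loop above then reads like this (the `expensive` stand-in here is illustrative only):

```python
def expensive(_state=[5]):
    # Stand-in for the costly call in the example above: each call
    # returns a list one element shorter than the previous one.
    _state[0] -= 1
    return list(range(_state[0]))

results = []
while len(res := expensive()) == 4:  # bind and test in one expression
    results.append(res)

print(results)  # only the length-4 result reached the loop body
```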
Cheers, Jervis From guido at python.org Sat Mar 14 15:57:25 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 14 Mar 2009 07:57:25 -0700 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> Message-ID: If we're going to allow unconstrained assignments inside expressions why don't we use the same syntax as C, C++, Java, JavaScript etc.? But I left this out intentionally for a reason. We would need to have a great deal of evidence that it was a mistake for making a U-turn. Have a happy discussion, --Guido On Sat, Mar 14, 2009 at 6:47 AM, Jervis Whitley wrote: > http://bugs.python.org/issue1714448 had an interesting proposal that I thought > might be worthwhile discussing here. > > if something as x: > > however, of greater use would be assignment expressions that allow: > if (something as x) == other: >     # can now use x. > > I propose that we implement assignment expressions that would allow assignments > to be made any place that expressions are currently valid. The > proposal uses the > (nominal) right arrow (RARROW) '->' to indicate the assignment. The > form would look > like this: > >     EXPR -> VAR > > which translates to > >     VAR = EXPR >     (EXPR) > > Expression (EXPR) is evaluated and assigned to target VAR. The value of EXPR is > left on the top of stack. > > another toy example to think about: > >     while len(expensive() -> res) == 4: >         dosomething(res) > > A patch has been uploaded to the named issue in the bug tracker. I encourage > you to try it out (py3k at the moment). As I mentioned earlier the > exact syntax is only nominal. We needn't use the RARROW if consensus > is against that, it is a simple operation to change this to any of > ('becomes', 'into', 'assigns' ... > > > I look forward to your comments.
> > Cheers, > > Jervis > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Sat Mar 14 17:52:09 2009 From: python at rcn.com (Raymond Hettinger) Date: Sat, 14 Mar 2009 09:52:09 -0700 Subject: [Python-ideas] non-breaking space References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1><49B8D702.4040004@trueblade.com><1F032CDC26874991B78E47DBDB333917@RaymondLaptop1><49B9505C.1050803@trueblade.com> <49BA4A12.4040402@gmail.com><49BB2981.5070008@gmail.com><103DCFB40D144238B8562441FBC0A65C@RaymondLaptop1><20090314084946.50f64627@o> Message-ID: <161AE36774DA4AAEA22CC88AC07285A5@RaymondLaptop1> denis.spir at ...> writes: >> >> I would just add: >> The SPACE can be either U+0020 (standard space) or U+00A0 (non-breakable > space). > > Then the proposal should allow for any kind of space characters (that is, any > character for which isspace() is True). There are several non-breaking space > characters in the unicode character set, with varying character widths, which is > important for typography rules. See > http://en.wikipedia.org/wiki/Non-breaking_space for some examples. > > Regards > > Antoine (playing devil's advocate a bit - but only a bit). Keeping in mind the needs of people writing parsers, I don't think it's a good idea to expand this set. Already, we're not supporting all possible separators whether they be spaces or not. Given just U+0020 and U+00A0, a person can easily do a str.replace() to get to anything else. 
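[Editor's note, with hindsight] Raymond's str.replace() point can be sketched concretely with the ',' specifier that PEP 378 ultimately added to the format mini-language (Python 3.1) — in this 2009 thread the specifier was still only a proposal:

```python
s = format(1234567.89, ',.2f')                    # '1,234,567.89'
nbsp = s.replace(',', '\u00a0')                   # thousands separated by U+00A0
european = s.replace(',', ' ').replace('.', ',')  # '1 234 567,89'
print(nbsp)
print(european)
```

The replacement order matters: swapping the thousands separator first leaves the '.' free to become the European decimal comma.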
Raymond From and-dev at doxdesk.com Sat Mar 14 18:09:58 2009 From: and-dev at doxdesk.com (And Clover) Date: Sat, 14 Mar 2009 18:09:58 +0100 Subject: [Python-ideas] str.split with padding In-Reply-To: References: <49BB0ED9.4000003@doxdesk.com> Message-ID: <49BBE4E6.1020606@doxdesk.com> Georg Brandl wrote: > Note that for maxsplit=1, you can use str.partition(). Indeed, though it does slightly spoil the cleanness of the unpacking assignment to include a dummy lvalue for the middle element. [Thanks for the on-topic reply! I'm surprised more people haven't felt the need to write unpacking splits like this to be honest, but I guess engaging in SMTP syntax law is much more fun. Yes guys, I'm well aware of the capabilities of the RFC2822 addr-spec format, thanks, and no, it's not relevant to the particular program that example came from. Cheers for the concern though.] -- And Clover mailto:and at doxdesk.com http://www.doxdesk.com/ From and-dev at doxdesk.com Sat Mar 14 18:15:10 2009 From: and-dev at doxdesk.com (And Clover) Date: Sat, 14 Mar 2009 18:15:10 +0100 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> Message-ID: <49BBE61E.6000304@doxdesk.com> Jervis Whitley wrote: > if (something as x) == other: > # can now use x. Interesting. I'd definitely prefer that to the C-style inline assignment syntax: I think it reads better, and there's less chance of the Accidental Assignment Instead Of Comparison trap that has plagued other languages. I remain to be convinced that inline assignment is enough of a win in general, but if implemented, that's the syntax I'd want. 
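[Editor's note] For readers who want to try the thread's idea directly, here is a minimal sketch of the splitpad/rsplitpad pair matching the call signatures used earlier in the thread (hypothetical helpers, not stdlib functions):

```python
def splitpad(s, sep, maxsplit):
    """Like s.split(sep, maxsplit), but padded on the right with empty
    strings so the result always has maxsplit + 1 elements."""
    parts = s.split(sep, maxsplit)
    return parts + [''] * (maxsplit + 1 - len(parts))

def rsplitpad(s, sep, maxsplit):
    """Right-hand variant: missing parts are padded on the *left*,
    so the final fields keep their positions (e.g. [table.]column)."""
    parts = s.rsplit(sep, maxsplit)
    return [''] * (maxsplit + 1 - len(parts)) + parts

# the HTTP example from the thread: version missing on HTTP/0.9 requests
method, path, version = splitpad('GET /', ' ', 2)
print(method, repr(path), repr(version))
```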
-- And Clover mailto:and at doxdesk.com http://www.doxdesk.com/ From and-dev at doxdesk.com Sat Mar 14 18:15:10 2009 From: and-dev at doxdesk.com (And Clover) Date: Sat, 14 Mar 2009 18:15:10 +0100 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> Message-ID: <49BBE61E.6000304@doxdesk.com> Jervis Whitley wrote: > if (something as x) == other: >     # can now use x. Interesting. I'd definitely prefer that to the C-style inline assignment syntax: I think it reads better, and there's less chance of the Accidental Assignment Instead Of Comparison trap that has plagued other languages. I remain to be convinced that inline assignment is enough of a win in general, but if implemented, that's the syntax I'd want.
From tjreedy at udel.edu Sat Mar 14 20:03:19 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 14 Mar 2009 15:03:19 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <20090313115602.3f9a19b9@o> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> Message-ID: spir wrote: > I have the impression, but I may well be wrong, that contrarily to a > format, a *formatted number* instead seems easy to scan -- with human > eyes. So, as a crazy idea, I wonder whether we shouldn't let the user > provide a example formatted number instead. This may address most of > use cases, but probably not all. > > To makes things easier, why not specify a canonical number, such as > '-123456.789', of which the user should define the formatted version? > Then a smart parser could deduce the format to be applied to further > numbers. Below a purely artificial example. > > -123456.789 --> kg 00_123_456,79- > > format: unit: 'kg' unit_pos: LEFT unit_sep: ' ' thousand_sep: '_' > fract_sep : ',' sign_pos: RIGHT sign_sep: None padding_char: '0' Once the .format language is expanded to be able to define grouping separators, one will be able to define functions to turn such templates in field specs. Now many options are allowed would depend on the function. 
From tjreedy at udel.edu Sat Mar 14 20:03:19 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 14 Mar 2009 15:03:19 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <20090313115602.3f9a19b9@o> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> Message-ID: spir wrote: > I have the impression, but I may well be wrong, that contrarily to a > format, a *formatted number* instead seems easy to scan -- with human > eyes. So, as a crazy idea, I wonder whether we shouldn't let the user > provide an example formatted number instead. This may address most of > use cases, but probably not all. > > To make things easier, why not specify a canonical number, such as > '-123456.789', of which the user should define the formatted version? > Then a smart parser could deduce the format to be applied to further > numbers. Below a purely artificial example. > > -123456.789 --> kg 00_123_456,79- > > format: unit: 'kg'  unit_pos: LEFT  unit_sep: ' '  thousand_sep: '_' > fract_sep: ','  sign_pos: RIGHT  sign_sep: None  padding_char: '0' Once the .format language is expanded to be able to define grouping separators, one will be able to define functions to turn such templates into field specs. How many options are allowed would depend on the function.
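[Editor's note] spir's example-driven idea could be prototyped along these lines — a rough, hypothetical sketch that deduces just the two separators from a formatted rendering of a canonical number; everything here is illustrative, not a proposed API:

```python
def deduce_separators(example):
    """Guess (thousands_sep, decimal_sep) from a formatted sample such
    as '00_123_456,79'. Hypothetical sketch, not a full template parser:
    without a fraction part a lone separator is ambiguous, which is why
    spir suggests the canonical number -123456.789 as the sample."""
    core = example.strip().strip('-')           # drop sign and edge padding
    seps = [c for c in core if not c.isdigit()]
    if not seps:
        return None, None
    decimal = seps[-1]                          # last separator starts the fraction
    thousands = seps[0] if seps[0] != decimal else None
    return thousands, decimal

print(deduce_separators('00_123_456,79-'))      # from spir's artificial example
```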
Terry Jan Reedy From tjreedy at udel.edu Sat Mar 14 21:23:39 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 14 Mar 2009 16:23:39 -0400 Subject: [Python-ideas] str.split with padding In-Reply-To: <49BB0ED9.4000003@doxdesk.com> References: <49BB0ED9.4000003@doxdesk.com> Message-ID: And Clover wrote: > Here's a simple one I've reinvented in my own apps often enough that it > might be worth adding to the built-in split() method: > > s.split(sep[, maxsplit[, pad]]) > > pad, if set True, would pad out the returned list with empty strings > (strs/unicodes depending on returned datatype) so that the list was > always (maxsplit+1) elements long. This allows one to do things like > unpacking assignments: > > user, hostname= address.split('@', 1, True) > > without having to worry about exceptions when the number of ?sep?s in > the string is unexpectedly fewer than ?maxsplit?. I would make pad = . Example use case: major,minor,micro = pyversion.split('.', 2, '0') # 3.0 = 3.0.0, etc. # or major,minor,micro = (int(s) for s in pyversion.split('.', 2, '0') ) I suppose a counter argument is than one could write (pyversion+'.0').split('.',2) Terry Jan Reedy From jervisau at gmail.com Sat Mar 14 23:52:06 2009 From: jervisau at gmail.com (Jervis Whitley) Date: Sun, 15 Mar 2009 09:52:06 +1100 Subject: [Python-ideas] Inline assignment expression In-Reply-To: References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> Message-ID: <8e63a5ce0903141552j65cd5341hfee9e2a39a8c048e@mail.gmail.com> > why don't we use the same syntax as C, C++, Java, JavaScript etc.? I have deliberately chosen to use a different syntax (right assignment for one, and a different, albeit nominal, operator) than C, C++ to address the concern that a user may unintentionally assign when they wanted to compare. 
http://bugs.python.org/issue1714448 the issue that I was responding to, also recognised the need to move away from a C-style assignment for a similar situation (I have also written a patch, not posted yet, to address their situation.) > But I left this out intentionally for a reason. We would need to have > a great deal of evidence that it was a mistake for making a U-turn. I realise that this is a trivial (to implement) patch and that it must have been left out of Python for a reason, however I am sure that with an explicit and elegant enough syntax this can shake the feeling that it is un-pythonic. I have drafted a PEP with some of the basic discussion included and some example situations. It does, however, fail to discuss issues of precedence and implementations in other languages at this stage. As implemented, the precedence for this operation is just below a BoolOp and above a BinOp so things like test() as x == answer should work and (for example) 4 * 4 as x == 16 # True I read your answer as a -0.5; if it is dead in the water, let me know we can close the Issue as a 'Wont Fix'. Cheers, Jervis From jervisau at gmail.com Sat Mar 14 23:59:38 2009 From: jervisau at gmail.com (Jervis Whitley) Date: Sun, 15 Mar 2009 09:59:38 +1100 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <49BBE61E.6000304@doxdesk.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <49BBE61E.6000304@doxdesk.com> Message-ID: <8e63a5ce0903141559y2aa6b2c0g67850c327ff3f607@mail.gmail.com> > >> if (something as x) == other: >>    # can now use x. > > Interesting. I'd definitely prefer that to the C-style inline assignment > syntax: I think it reads better, and there's less chance of the Accidental > Assignment Instead Of Comparison trap that has plagued other languages. However, allowing this "something as x" syntax for assignment would cause confusion with the "with contextmanager as x" scenario. 
"as" was chosen in their case because the expr contextmanager is not assigned to x. While I do like the "as" syntax too, I have not endorsed it for the above reason. However, this does show that using the "as" makes this somehow pythonic looking, and there must be an alternative that keeps this far enough away from the assignment expressions in other languages so as to avoid the trap you mention. > I remain to be convinced that inline assignment is enough of a win in > general, but if implemented, that's the syntax I'd want. Cheers, Jervis From tjreedy at udel.edu Sun Mar 15 02:52:47 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 14 Mar 2009 21:52:47 -0400 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <8e63a5ce0903141552j65cd5341hfee9e2a39a8c048e@mail.gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <8e63a5ce0903141552j65cd5341hfee9e2a39a8c048e@mail.gmail.com> Message-ID: Jervis Whitley wrote: > [Guido] >> But I left this out intentionally for a reason. We would need to have >> a great deal of evidence that it was a mistake for making a U-turn. > I read your answer as a -0.5, if it is dead in the water, let me know > we can close the Issue as a 'Wont Fix'. Having read similar discussions over the last decade, I read it as about -.995 ;-) In other words, not quite as dead as adding braces, but close. tjr From cmjohnson.mailinglist at gmail.com Sun Mar 15 02:57:42 2009 From: cmjohnson.mailinglist at gmail.com (Carl Johnson) Date: Sat, 14 Mar 2009 15:57:42 -1000 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <8e63a5ce0903141552j65cd5341hfee9e2a39a8c048e@mail.gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <8e63a5ce0903141552j65cd5341hfee9e2a39a8c048e@mail.gmail.com> Message-ID: <3bdda690903141857n34f517b0r537f0cca668006b7@mail.gmail.com> Jervis Whitley wrote: > so things like > > ? ?test() as x == answer > > should work and (for example) > > ? 
4 * 4 as x == 16 # True Would "4 * 4 == 16 as x" be equivalent to "(4 * 4 == 16) as x" or "4 * 4 == (16 as x)"? Either way, I suspect this is dead in the water, but I guess that issue should be clarified. -- Carl From jervisau at gmail.com Sun Mar 15 03:13:54 2009 From: jervisau at gmail.com (Jervis Whitley) Date: Sun, 15 Mar 2009 13:13:54 +1100 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <3bdda690903141857n34f517b0r537f0cca668006b7@mail.gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <8e63a5ce0903141552j65cd5341hfee9e2a39a8c048e@mail.gmail.com> <3bdda690903141857n34f517b0r537f0cca668006b7@mail.gmail.com> Message-ID: <8e63a5ce0903141913o6374acf3h6fa37c67537c225a@mail.gmail.com> > > Would "4 * 4 == 16 as x" be equivalent to "(4 * 4 == 16) as x" or "4 * > 4 == (16 as x)"? > > Either way, I suspect this is dead in the water, but I guess that > issue should be clarified. > This is one of the matters for discussion here. I much prefer the latter; that is, the assignment expression has precedence below that of BoolOp but above BinOp. Cheers. Try running the patch; if nothing else, kicking the tires a bit is fun (note that I nominally use '->' instead of 'as' in the patch.) I know that since writing it, it has been, if nothing else, fun doing assignments in expressions. From ben+python at benfinney.id.au Sun Mar 15 03:31:21 2009 From: ben+python at benfinney.id.au (Ben Finney) Date: Sun, 15 Mar 2009 13:31:21 +1100 Subject: [Python-ideas] Draft PEP (version 0.5): Standard daemon process library References: <87wscj11fl.fsf@benfinney.id.au> <874oxzxujs.fsf@benfinney.id.au> Message-ID: <87eiwzv7nq.fsf_-_@benfinney.id.au> Georg Brandl writes: > Ben Finney schrieb: > > Howdy all, > > > > Significant changes in this release [of the draft PEP]: > > This looks like it should be submitted as a formal PEP now; that > should also ensure more interest in it, and an eventual resolution. Thanks for the support. 
Unless anyone has strong objections within the next day or so, I'll submit this as a PEP. -- \ "With Lisp or Forth, a master programmer has unlimited power | `\ and expressiveness. With Python, even a regular guy can reach | _o__) for the stars." --Raymond Hettinger | Ben Finney From guido at python.org Sun Mar 15 04:33:54 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 14 Mar 2009 20:33:54 -0700 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <8e63a5ce0903141552j65cd5341hfee9e2a39a8c048e@mail.gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <8e63a5ce0903141552j65cd5341hfee9e2a39a8c048e@mail.gmail.com> Message-ID: On Sat, Mar 14, 2009 at 3:52 PM, Jervis Whitley wrote: > I read your answer as a -0.5, if it is dead in the water, let me know > we can close the Issue as a 'Wont Fix'. You're asking the wrong guy. :-) If it were up to me this would never go through. So, yes, a solid -1 from me. If anything, I'm stronger against your "as" version than against C-style "=", (a) because the latter draws more attention to the assignment (I hate side effects buried deeply) and (b) because it's more familiar. The existing "as" syntaxes have a different purpose; they are top-level only so they cannot be deeply buried. But numerically I'd still be -1 on the "=" syntax too. I'm just throwing that preference for "=" out in case a future BDFL or someone forking the language wants to do it differently. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Sun Mar 15 05:39:05 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Mar 2009 14:39:05 +1000 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <8e63a5ce0903141559y2aa6b2c0g67850c327ff3f607@mail.gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <49BBE61E.6000304@doxdesk.com> <8e63a5ce0903141559y2aa6b2c0g67850c327ff3f607@mail.gmail.com> Message-ID: <49BC8669.6000004@gmail.com> Jervis Whitley wrote: >>> if (something as x) == other: >>> # can now use x. >> Interesting. I'd definitely prefer that to the C-style inline assignment >> syntax: I think it reads better, and there's less chance of the Accidental >> Assignment Instead Of Comparison trap that has plagued other languages. > > However, allowing this "something as x" syntax for assignment would cause > confusion with the "with contextmanager as x" scenario. "as" was chosen in their > case because the expr contextmanager is not assigned to x. > While I do like the "as" syntax too, I have not endorsed it for the > above reason. If you look at the current uses for 'as' it is never for direct assignment: import x as y: 'x' is not a normal expression here, it's a reference into the module namespace. The value assigned to 'y' has nothing to do with what you would get if you evaluated the expression 'x' in the current namespace. with x as y: 'y' is assigned the value of x.__enter__(), not x except x as y: 'y' is assigned the value of a raised exception that meets the criteria "isinstance(y, x)". In all three cases, while the value eventually assigned to 'y' is *related* to the value of 'x' in some way, it isn't necessarily 'x' itself that is assigned (although the with statement version can sometimes give that impression, since many __enter__() methods finish with "return self"). Proposals for implicit assignment tend to founder on one of two objections: A. 
The proposal uses existing assignment syntax ('x = y') and runs afoul of the C embedded assignment ambiguity problem (i.e. did the programmer intentionally write "if x = y:" or did they actually mean to write "if x == y:"?) B. The proposal uses different assignment syntax ('x -> y', 'y <- x', 'x as y') and runs afoul of the question of why are there two forms of assignment statement? (Since any expression can be a statement, the new embedded assignment syntax would either work as a statement as well, or else a special rule would have to be added to the compiler to say "cannot use embedded assignment expression as statement - use an assignment statement instead"). There are also a couple of more general points of confusion related to nested namespaces as far as embedded assignments go: 1. Assignments inside lambda expressions, list/dict/set comprehensions and generator expressions (all of which create their own local namespace) won't affect the current scope, but assignment in any other expression *will* affect the current scope. Just to make things even more confusing, assignments in the outermost iterator of a comprehension or genexp actually *will* affect the current scope. 2. Since global and nonlocal declarations only affect the current namespace, they're subject to the same kind of confusion as happens with local assignments: they won't affect assignments embedded inside lambda expressions, comprehensions and genexps (except for the outermost iterator for the latter two expression groups). With the way nested namespaces are set up, allowing embedded assignments would just be a recipe for long-term confusion even if it did occasionally make some algorithms fractionally easier to write down. Cheers, Nick. 
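Nick's first point, that comprehensions and genexps get a namespace of their own except for the outermost iterator, can be illustrated with ordinary loop variables on a modern Python 3 (an editorial example, not from the thread):

```python
x = 10
# The comprehension's loop variable lives in the comprehension's own
# scope, so the outer x is untouched afterwards:
squares = [x * x for x in range(5)]
assert x == 10

# By contrast, the *outermost* iterator expression is evaluated
# immediately, in the current scope, when a genexp is created:
def make_gen():
    data = [1, 2, 3]
    g = (n for n in data)
    del data  # too late: iter(data) was already captured by the genexp
    return g

assert list(make_gen()) == [1, 2, 3]
```

This is exactly the scope boundary that embedded assignments would have to cross in some cases but not others, which is the source of the confusion Nick describes.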
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Sun Mar 15 05:58:38 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Mar 2009 14:58:38 +1000 Subject: [Python-ideas] A real limitation of contextlib.nested() Message-ID: <49BC8AFE.8050209@gmail.com> I missed the discussion about potentially adding syntactic support for multiple context managers in the with statement, but figured I should mention a real limitation of contextlib.nested that *would* be fixed by adding dedicated syntactic support. There's a genuine semantic difference between this: with cmA(): with cmB(): # whatever and this: with nested(cmA(), cmB()): # whatever The latter is actually more accurately translated as: mgr1, mgr2 = cmA(), cmB() with mgr1: with mgr2: # whatever That is, when using nested() the later context managers are created outside the scope of the earlier context managers. So, to use Christian's example from the previous discussion: with lock: with open(infile) as fin: with open(outfile, 'w') as fout: fout.write(fin.read()) Using contextlib.nested for that would be outright broken: with nested(lock, open(infile), open(outfile, 'w')) as (_, fin, fout): fout.write(fin.read()) 1. The files are opened without acquiring the lock first 2. If an IOError is raised while opening "outfile", then "infile" doesn't get closed immediately I created issue 5491 [1] to point out that the contextlib.nested docs could do with being tweaked to make this limitation clearer. 
Dedicated syntax (such as the form that Christian proposed) would fix this problem: with lock, (open(infile) as fin), (open(outfile, 'w') as fout): fout.write(fin.read()) Of course, a custom context manager doesn't suffer any problems either: @contextmanager def synced_io(lock, infile, outfile): with lock: with open(infile) as fin: with open(outfile, 'w') as fout: yield fin, fout with synced_io(lock, infile, outfile) as (fin, fout): fout.write(fin.read()) Cheers, Nick. [1] http://bugs.python.org/issue5491 -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From jervisau at gmail.com Sun Mar 15 07:04:00 2009 From: jervisau at gmail.com (Jervis Whitley) Date: Sun, 15 Mar 2009 17:04:00 +1100 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <49BC8669.6000004@gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <49BBE61E.6000304@doxdesk.com> <8e63a5ce0903141559y2aa6b2c0g67850c327ff3f607@mail.gmail.com> <49BC8669.6000004@gmail.com> Message-ID: <8e63a5ce0903142304o70607d3ck51f51aa3e0411e8a@mail.gmail.com> > With the way nested namespaces are set up, allowing embedded assignments > would just be a recipe for long term confusion even if it did > occasionally make some algorithms fractionally easier to write down. > > Cheers, > Nick. Agreed. I won't be taking this argument any further. Could we close issue http://bugs.python.org/issue1714448? Cheers, Jervis From greg at krypto.org Sun Mar 15 08:45:23 2009 From: greg at krypto.org (Gregory P. 
Smith) Date: Sun, 15 Mar 2009 00:45:23 -0700 Subject: [Python-ideas] A real limitation of contextlib.nested() In-Reply-To: <49BC8AFE.8050209@gmail.com> References: <49BC8AFE.8050209@gmail.com> Message-ID: <52dc1c820903150045u792715cbhbc18a0c18f93bfa8@mail.gmail.com> On Sat, Mar 14, 2009 at 9:58 PM, Nick Coghlan wrote: > I missed the discussion about potentially adding syntactic support for > multiple context managers in the with statement, but figured I should > mention a real limitation of contextlib.nested that *would* be fixed by > adding dedicated syntactic support. > > There's a genuine semantic difference between this: > > with cmA(): > with cmB(): > # whatever > > and this: > > with nested(cmA(), cmB()): > # whatever > > The latter is actually more accurately translated as: > > mgr1, mgr2 = cmA(), cmB(): > with mgr1: > with mgr2: > # whatever > > That is, when using nested() the later context managers are created > outside the scope of the earlier context managers. > > So, to use Christian's example from the previous discussion: > > with lock: > with open(infile) as fin: > with open(outfile, 'w') as fout: > fout.write(fin.read()) > > Using contextlib.nested for that would be outright broken: > > with nested(lock, open(infile), open(outfile) as (_, fin, fout): > fout.write(fin.read()) > > 1. The files are opened without acquiring the lock first > 2. If an IOError is raised while opening "outfile", then "infile" > doesn't get closed immediately > > I created issue 5491 [1] to point out that the contextlib.nested docs > could do with being tweaked to make this limitation clearer. 
> > Dedicated syntax (such as the form that Christian proposed) would fix > this problem: > > with lock, (open(infile) as fin), (open(outfile, 'w') as fout): > fout.write(fin.read()) > > Of course, a custom context manager doesn't suffer any problems either: > > @contextmanager > def synced_io(lock, infile, outfile): > with lock: > with open(infile) as fin: > with open(outfile) as fout: > yield fin, fout > > with synced_io(lock, infile, outfile) as (fin, fout): > fout.write(fin.read()) fwiw, I believe you could write a version of nested that generates the above code based on the parameters but I believe it'd be disgustingly slow... @contextmanager def slow_nested(*args): code_lines = [] vars = [] code_lines.append('@contextmanager') code_lines.append('def _nested(*args):') for idx in xrange(len(args)): vars.append('c%d' % idx) code_lines.append('%swith args[%d] as %s:' % (' '*(idx+1), idx, vars[-1])) code_lines.append('%syield %s' % (' '*(len(args)+1), ','.join(vars))) code = '\n'.join(code_lines) print 'CODE:\n', code exec(code) yield _nested(*args) > > > Cheers, > Nick. > > [1] http://bugs.python.org/issue5491 > > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > --------------------------------------------------------------- > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Sun Mar 15 11:24:53 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 15 Mar 2009 20:24:53 +1000 Subject: [Python-ideas] A real limitation of contextlib.nested() In-Reply-To: <52dc1c820903150045u792715cbhbc18a0c18f93bfa8@mail.gmail.com> References: <49BC8AFE.8050209@gmail.com> <52dc1c820903150045u792715cbhbc18a0c18f93bfa8@mail.gmail.com> Message-ID: <49BCD775.1010602@gmail.com> > fwiw, I believe you could write a version of nested that generates the > above code based on the parameters but I believe it'd be disgustingly > slow... > > @contextmanager > def slow_nested(*args): > code_lines = [] > vars = [] > code_lines.append('@contextmanager') > code_lines.append('def _nested(*args):') > for idx in xrange(len(args)): > vars.append('c%d' % idx) > code_lines.append('%swith args[%d] as %s:' % (' '*(idx+1), idx, > vars[-1])) > code_lines.append('%syield %s' % (' '*(len(args)+1), ','.join(vars))) > code = '\n'.join(code_lines) > print 'CODE:\n', code > exec(code) > yield _nested(*args) Unfortunately, that doesn't help: the problem is the fact that the arguments to nested() are evaluated *before* the call to nested() itself. A version with lazily evaluated arguments (i.e. accepting zero-argument callables that create the context managers instead of accepting the context managers themselves) could do the trick, but that then becomes enough of a pain to use that it wouldn't offer much benefit over writing a dedicated context manager. Cheers, Nick. 
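The lazily-evaluated variant Nick describes, accepting zero-argument callables instead of already-constructed context managers, might be sketched like this (a hypothetical helper, not part of contextlib):

```python
from contextlib import contextmanager

@contextmanager
def lazy_nested(*factories):
    """Like contextlib.nested, but each argument is a zero-argument
    callable, so later managers are only *created* inside the scope of
    the earlier ones (fixing the ordering problem Nick points out)."""
    if not factories:
        yield ()
    else:
        first, rest = factories[0], factories[1:]
        with first() as value:  # constructed and entered lazily
            with lazy_nested(*rest) as values:
                yield (value,) + values
```

The call site then reads lazy_nested(lambda: lock, lambda: open(infile), lambda: open(outfile, 'w')), and the lambda noise is exactly the usability cost Nick alludes to. For what it's worth, the stdlib eventually addressed this family of problems with contextlib.ExitStack in Python 3.3.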
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From grosser.meister.morti at gmx.net Sun Mar 15 13:21:02 2009 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Sun, 15 Mar 2009 13:21:02 +0100 Subject: [Python-ideas] A real limitation of contextlib.nested() In-Reply-To: <49BC8AFE.8050209@gmail.com> References: <49BC8AFE.8050209@gmail.com> Message-ID: <49BCF2AE.5030701@gmx.net> Oh, you're right! +1 for adding the new syntax. Nick Coghlan wrote: > I missed the discussion about potentially adding syntactic support for > multiple context managers in the with statement, but figured I should > mention a real limitation of contextlib.nested that *would* be fixed by > adding dedicated syntactic support. > > There's a genuine semantic difference between this: > > with cmA(): > with cmB(): > # whatever > > and this: > > with nested(cmA(), cmB()): > # whatever > ... From rrr at ronadam.com Sun Mar 15 13:58:57 2009 From: rrr at ronadam.com (Ron Adam) Date: Sun, 15 Mar 2009 07:58:57 -0500 Subject: [Python-ideas] A real limitation of contextlib.nested() In-Reply-To: <49BC8AFE.8050209@gmail.com> References: <49BC8AFE.8050209@gmail.com> Message-ID: <49BCFB91.9040006@ronadam.com> Nick Coghlan wrote: > Dedicated syntax (such as the form that Christian proposed) would fix > this problem: > > with lock, (open(infile) as fin), (open(outfile, 'w') as fout): > fout.write(fin.read()) Could 'and' possibly be used since it is a flow control operator in Python. 
with lock and open(infile) as fin and open(outfile, 'w') as fout: fout.write(fin.read()) Ron From aahz at pythoncraft.com Sun Mar 15 15:04:13 2009 From: aahz at pythoncraft.com (Aahz) Date: Sun, 15 Mar 2009 07:04:13 -0700 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <49BC8669.6000004@gmail.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <49BBE61E.6000304@doxdesk.com> <8e63a5ce0903141559y2aa6b2c0g67850c327ff3f607@mail.gmail.com> <49BC8669.6000004@gmail.com> Message-ID: <20090315140413.GA26355@panix.com> On Sun, Mar 15, 2009, Nick Coghlan wrote: > > B. The proposal uses different assignment syntax ('x -> y', 'y <- x', 'x > as y') and runs afoul of the question of why are there two forms of > assignment statement? (Since any expression can be a statement, the new > embedded assignment syntax would either work as a statement as well, or > else a special rule would have to added to the compiler to say "cannot > use embedded assignment expression as statement - use an assignment > statement instead"). Just for the record, the most common different syntax suggested has historically been Pascal's ``:=`` -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Adopt A Process -- stop killing all your children! 
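The := spelling Aahz mentions is, as it happens, the one Python eventually adopted for inline assignment (the "walrus" operator of PEP 572, Python 3.8):

```python
# PEP 572 named expressions: assignment usable inside an expression.
values = [2, 8, 91, 3]
if (n := len(values)) > 3:
    result = f"{n} values"
assert result == "4 values"

# And, on the comprehension-scoping question raised in this thread,
# PEP 572 chose to bind named expressions in the *enclosing* scope:
total = [y := v * 2 for v in values][-1]
assert y == 6  # y deliberately "leaks" out of the comprehension
```

Notably, the adopted syntax is an expression-only form: `x := y` is not valid as a bare statement, which is one answer to Nick's "two forms of assignment" objection.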
From Scott.Daniels at Acm.Org Sun Mar 15 17:58:53 2009 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sun, 15 Mar 2009 09:58:53 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com><49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> Message-ID: Raymond Hettinger wrote: > Todays updates to: http://www.python.org/dev/peps/pep-0378/ > > * Summarize commentary to date. > * Add APOSTROPHE and non-breaking SPACE to the list of separators. > * Add more links to external references. > * Detail issues with the locale module. > * Clarify how proposal II is parsed. Still doesn't specify how to deal with digits beyond the decimal point. I don't really care what the choice is, but I do care that the choice is specified. Is the precision in digits, or is it the width of the post-decimal point field? If the latter, does a precision of 4 end with a comma or not? In particular, what should (format(9876.54321, "13,.5f"), format(9876.54321, "12,.4f")) produce? Possible "reasonable" answers: A ' 9,876.54321', ' 9,876.5432' B ' 9,876.543,21', ' 9,876.543,2' C ' 9,876.543,2', ' 9,876.543,' D ' 9,876.543,2', ' 9,876.543' I prefer B, but I can see an argument for any of the four above. 
--Scott David Daniels Scott.Daniels at Acm.Org From eric at trueblade.com Sun Mar 15 18:28:56 2009 From: eric at trueblade.com (Eric Smith) Date: Sun, 15 Mar 2009 13:28:56 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com><49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> Message-ID: <49BD3AD8.80905@trueblade.com> Scott David Daniels wrote: > Still doesn't specify to digits beyond the decimal point. I don't > really care what the choice is, but I do care that the choice is > specified. Is the precision in digits, or is it width of the post- > decimal point field? If the latter, does a precision of 4 end with > a comma or not? > > In particular, what should (format(9876.54321, "13,.5f"), > format(9876.54321, "12,.4f")) produce? > Possible "reasonable" answers: > A ' 9,876.54321', ' 9,876.5432' > B ' 9,876.543,21', ' 9,876.543,2' > C ' 9,876.543,2', ' 9,876.543,' > D ' 9,876.543,2', ' 9,876.543' > I prefer B, but I can see an argument for any of the four above. The C locale functions don't support grouping to the right of the decimal. I don't think I've ever seen a system that supports it. Do you have any examples? I'd say A. Eric. 
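For reference, the grouping behavior that PEP 378 ultimately shipped (Python 2.7 and 3.1) matches Eric's answer A: the ',' option groups digits only to the left of the decimal point, and the precision counts plain digits:

```python
# PEP 378 as implemented: grouping never extends past the decimal point.
assert format(9876.54321, "13,.5f") == "  9,876.54321"   # Scott's option A
assert format(9876.54321, "12,.4f") == "  9,876.5432"
assert format(1234567, ",d") == "1,234,567"
```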
From grosser.meister.morti at gmx.net Sun Mar 15 18:35:05 2009 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Sun, 15 Mar 2009 18:35:05 +0100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com><49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> Message-ID: <49BD3C49.1020904@gmx.net> Scott David Daniels wrote: > Raymond Hettinger wrote: >> Todays updates to: http://www.python.org/dev/peps/pep-0378/ >> >> * Summarize commentary to date. >> * Add APOSTROPHE and non-breaking SPACE to the list of separators. >> * Add more links to external references. >> * Detail issues with the locale module. >> * Clarify how proposal II is parsed. > Still doesn't specify to digits beyond the decimal point. I don't > really care what the choice is, but I do care that the choice is > specified. Is the precision in digits, or is it width of the post- > decimal point field? If the latter, does a precision of 4 end with > a comma or not? > > In particular, what should (format(9876.54321, "13,.5f"), > format(9876.54321, "12,.4f")) produce? > Possible "reasonable" answers: > A ' 9,876.54321', ' 9,876.5432' > B ' 9,876.543,21', ' 9,876.543,2' > C ' 9,876.543,2', ' 9,876.543,' > D ' 9,876.543,2', ' 9,876.543' > I prefer B, but I can see an argument for any of the four above. > > > --Scott David Daniels > Scott.Daniels at Acm.Org > Has anyone mentioned yet that in German you write the following? 10.000.000,000.001 (In German, ',' and '.' are swapped.) Is this aspect taken into account? How is i18n/l10n managed? 
-panzi From g.brandl at gmx.net Sun Mar 15 20:18:51 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 15 Mar 2009 20:18:51 +0100 Subject: [Python-ideas] A real limitation of contextlib.nested() In-Reply-To: <49BCFB91.9040006@ronadam.com> References: <49BC8AFE.8050209@gmail.com> <49BCFB91.9040006@ronadam.com> Message-ID: Ron Adam schrieb: > > Nick Coghlan wrote: > >> Dedicated syntax (such as the form that Christian proposed) would fix >> this problem: >> >> with lock, (open(infile) as fin), (open(outfile, 'w') as fout): >> fout.write(fin.read()) > > Could 'and' possibly be used sense it is a flow control operator in python. > > with lock > and open(infile) as fin > and open(outfile, 'w' as fout: > fout.write(fin.read()) But it isn't a control flow operator. It is a boolean operator, and since "with" expressions are expressions, it's perfectly valid there. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From Scott.Daniels at Acm.Org Sun Mar 15 21:41:38 2009 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sun, 15 Mar 2009 13:41:38 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BD3AD8.80905@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com><49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> <49BD3AD8.80905@trueblade.com> Message-ID: Eric Smith wrote: > Scott David Daniels wrote: >> Still doesn't specify [how to deal with] digits beyond the decimal point.... 
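On the l10n question: PEP 378's ',' is deliberately locale-independent (it always emits COMMA), while locale-aware grouping remains the job of the pre-existing 'n' format type and the locale module. A quick contrast under the portable C locale, which defines no grouping at all:

```python
import locale

# The C locale is the portable baseline; real l10n would set e.g. "de_DE",
# but which locale names exist is platform-dependent.
locale.setlocale(locale.LC_ALL, "C")

assert format(1234567, ",d") == "1,234,567"  # "," ignores the locale
assert format(1234567, "n") == "1234567"     # "n" consults it (no grouping in C)
```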
>> what should (format(9876.54321, "13,.5f"), format(9876.54321, "12,.4f")) produce? >> A ' 9,876.54321', ' 9,876.5432' >> B ' 9,876.543,21', ' 9,876.543,2' >> C ' 9,876.543,2', ' 9,876.543,' >> D ' 9,876.543,2', ' 9,876.543' >> I prefer B, but I can see an argument for any of the four above. > > The C locale functions don't support grouping to the right of the > decimal. I don't think I've ever seen a system that supports it. Do you > have any examples? I've only used separators to check digits below the decimal point. Most high-precision tables of constants that I've seen use 5-digit grouping (e.g. wikipedia for pi): 3.14159 26535 89793 23846 26433 83279 50288 41971 69399 37510 But 3 on the left and 5 on the right really seems to be too much. > I'd say A. For me, A and B are the "preferable" solutions; I just think the PEP needs to say what it chooses. --Scott David Daniels Scott.Daniels at Acm.Org From greg.ewing at canterbury.ac.nz Sun Mar 15 22:33:44 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 16 Mar 2009 09:33:44 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <49B9804E.3010501@canterbury.ac.nz> <49B9B3CB.2090906@canterbury.ac.nz> <49BA34BD.6070005@canterbury.ac.nz> Message-ID: <49BD7438.6090006@canterbury.ac.nz> Antoine Pitrou wrote: > Greg Ewing writes: > >>If that's an acceptable thing to do on a daily basis, >>then we don't need format strings at all. Because you can do all your formatting by calling a function to format each number and then concatenating the results with whatever other text you want. You can do that now, but someone invented format strings, so they must have wanted a more convenient way of going about it. 
-- Greg From rasky at develer.com Sun Mar 15 23:20:56 2009 From: rasky at develer.com (Giovanni Bajo) Date: Sun, 15 Mar 2009 22:20:56 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> Message-ID: On Thu, 12 Mar 2009 00:49:24 -0700, Raymond Hettinger wrote: > [spir] >> Probably you know that already, but it doesn't hurt anyway. In french >> and most rroman languages comma is the standard decimal sep; and either >> space or dot is used, when necessary, to sep thousands. (It's veeery >> difficult for me to read even short numbers with commas used as >> thousand separator.) >> >> en: 1,234,567.89 >> fr: 1.234.567,89 >> or: 1 234 567,89 I'll note that the international standard is to use just space: http://www.bipm.org/jsp/en/ViewCGPMResolution.jsp?CGPM=22&RES=10 reaffirms that "Numbers may be divided in groups of three in order to facilitate reading; neither dots nor commas are ever inserted in the spaces between groups", as stated in Resolution 7 of the 9th CGPM, 1948. In Italian, we use a character which is not available on the keyboard, nor can I find it in a Unicode map, so let's ignore it :) On computers, we usually simply put a period between thousands. -- Giovanni Bajo Develer S.r.l. 
http://www.develer.com From pyideas at rebertia.com Sun Mar 15 23:34:54 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Sun, 15 Mar 2009 15:34:54 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> Message-ID: <50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com> On Sun, Mar 15, 2009 at 3:20 PM, Giovanni Bajo wrote: > On Thu, 12 Mar 2009 00:49:24 -0700, Raymond Hettinger wrote: > >> [spir] >>> Probably you know that already, but it doesn't hurt anyway. In french >>> and most rroman languages comma is the standard decimal sep; and either >>> space or dot is used, when necessary, to sep thousands. (It's veeery >>> difficult for me to read even short numbers with commas used as >>> thousand separator.) >>> >>> en: 1,234,567.89 >>> fr: 1.234.567,89 >>> or: 1 234 567,89 > > I'll notice that the international standard is to use just space: > > http://www.bipm.org/jsp/en/ViewCGPMResolution.jsp?CGPM=22&RES=10 Of course, that's primarily a /scientific/ standard; others have explained that commas are apparently the international /financial/ standard. "Aren't standards great? There's so many to choose from!" This thread continues to get more complicated by the day... 
(Localization doth be *hard*) Cheers, Chris -- I have a blog: http://blog.rebertia.com From greg.ewing at canterbury.ac.nz Sun Mar 15 23:37:25 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 16 Mar 2009 10:37:25 +1200 Subject: [Python-ideas] [Python-Dev] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <103DCFB40D144238B8562441FBC0A65C@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B8D702.4040004@trueblade.com> <1F032CDC26874991B78E47DBDB333917@RaymondLaptop1> <49B9505C.1050803@trueblade.com> <49BA4A12.4040402@gmail.com> <49BB2981.5070008@gmail.com> <103DCFB40D144238B8562441FBC0A65C@RaymondLaptop1> Message-ID: <49BD8325.7060404@canterbury.ac.nz> Raymond Hettinger wrote: > 1. Why mark a non-locale aware form with a flag that indicates > locale awareness in another language. > 2. It seems to be basic bad design to require an apostrophe > to emit commas. Okay, so how about: comma - always use a comma apostrophe - use the locale And for the decimal point: dot - always use a dot semicolon - use the locale -- Greg From greg.ewing at canterbury.ac.nz Sun Mar 15 23:58:28 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 16 Mar 2009 10:58:28 +1200 Subject: [Python-ideas] Inline assignment expression In-Reply-To: <20090315140413.GA26355@panix.com> References: <8e63a5ce0903140647i1584630fic864230d42bcf58c@mail.gmail.com> <49BBE61E.6000304@doxdesk.com> <8e63a5ce0903141559y2aa6b2c0g67850c327ff3f607@mail.gmail.com> <49BC8669.6000004@gmail.com> <20090315140413.GA26355@panix.com> Message-ID: <49BD8814.7050406@canterbury.ac.nz> Aahz wrote: > Just for the record, the most common different syntax suggested has > historically been Pascal's ``:=`` I'd rather reserve that for a possible future "in-place assignment" operator, though. 
-- Greg From greg.ewing at canterbury.ac.nz Mon Mar 16 00:01:50 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 16 Mar 2009 11:01:50 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com> <49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> Message-ID: <49BD88DE.20807@canterbury.ac.nz> Scott David Daniels wrote: > B ' 9,876.543,21', ' 9,876.543,2' > C ' 9,876.543,2', ' 9,876.543,' > D ' 9,876.543,2', ' 9,876.543' What??? On the planet I come from, nobody uses separators for digits *after* the decimal point, unless perhaps if they're spaces. Certainly never commas. -- Greg From solipsis at pitrou.net Mon Mar 16 01:10:51 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 16 Mar 2009 00:10:51 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <49B9804E.3010501@canterbury.ac.nz> <49B9B3CB.2090906@canterbury.ac.nz> <49BA34BD.6070005@canterbury.ac.nz> <49BD7438.6090006@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > You can do that now, but someone invented format > strings, so they must have wanted a more convenient way > of going about it. I don't see how that contradicts what I said and you don't seem eager to produce understandable explanations, so I'll leave it there. 
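The behavior Greg expects is option A, and it is what PEP 378 ultimately specified: grouping applies only to the left of the decimal point. Assuming a Python recent enough to support the ',' specifier (2.7/3.1 or later), Scott's two test cases come out as:

```python
# Option A: the ',' groups only the integer part; digits after the
# decimal point are never separated.
print(repr(format(9876.54321, "13,.5f")))  # '  9,876.54321'
print(repr(format(9876.54321, "12,.4f")))  # '  9,876.5432'
```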
From python at rcn.com Mon Mar 16 07:05:42 2009 From: python at rcn.com (Raymond Hettinger) Date: Sun, 15 Mar 2009 23:05:42 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com><49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o><09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> Message-ID: > Raymond Hettinger wrote: >> * Summarize commentary to date. >> * Add APOSTROPHE and non-breaking SPACE to the list of separators. >> * Add more links to external references. >> * Detail issues with the locale module. >> * Clarify how proposal II is parsed. [Scott David Daniels] > Still doesn't specify to digits beyond the decimal point. Will clarify that the intent is to put thousands separators only to the left of the decimal point. > In particular, what should (format(9876.54321, "13,.5f"), > format(9876.54321, "12,.4f")) produce? > Possible "reasonable" answers: > A ' 9,876.54321', ' 9,876.5432' > B ' 9,876.543,21', ' 9,876.543,2' > C ' 9,876.543,2', ' 9,876.543,' > D ' 9,876.543,2', ' 9,876.543' > I prefer B, but I can see an argument for any of the four above. Am proposing A. That matches the existing precedent in the locale module: >>> locale.setlocale(locale.LC_ALL, 'English_United States.1252') 'English_United States.1252' >>> locale.format("%15.8f", pi*1000, grouping=True) ' 3,141.59265359' It also matches what my adding machines have done, what my HP calculator does, how Excel handles thousands grouping, and the other examples cited in the PEP. Am thinking that anything else would be a new, made-up requirement. The closest I've seen to this is grouping of digits in long sequences of pi and in logarithm tables. It may be useful to someone somewhere, but am not going to propose it for the PEP.
Raymond From python at rcn.com Mon Mar 16 08:01:32 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 00:01:32 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com><49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> <49BD3C49.1020904@gmx.net> Message-ID: [Mathias Panzenböck] > Has anyone mentioned yet that in German you write the following? > 10.000.000,000.001 These are all red herrings. The proposal is not about internationalization and it says as much. There is no doubt that everyone and his brother can think up a different convention for writing down numbers. The PEP proposes a non-localized way to specify one of several separators to group thousands to the left of the decimal point. At least one way (spaces or underscores) should be readable, understandable, and useful to folks from many diverse backgrounds. It is not the intention to be able to reproduce everything that a person can think up. That would be a fool's errand. Raymond From python at rcn.com Mon Mar 16 08:27:03 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 00:27:03 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49B92D05.4010807@gmail.com> <34E3F59B404D467BB2865203C94C1287@RaymondLaptop1> <49B97D71.5070107@trueblade.com><49B99B5B.7060808@trueblade.com> <20090313115602.3f9a19b9@o> <09C871E443EE4FDE9F85DEA22FFEFE33@RaymondLaptop1> <49BD3AD8.80905@trueblade.com> Message-ID: [Scott David Daniels] >> I'd say A.
> For me, A and B are the "preferable" solutions; I just think > the PEP needs to say what it chooses. Thanks. I've updated the PEP to say A explicitly. Raymond From python at rcn.com Mon Mar 16 08:31:19 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 00:31:19 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for athousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1><20090312083401.33cc525b@o> <50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com> Message-ID: [Chris Rebert] > This thread continues to get more complicated by the day... > (Localization doth be *hard*) Good thing the PEP is *not* about localization :-) It does not attempt to cater to every possible way to write numbers. Instead, it offers a handful of choices for thousands groupings. At least one of those choices (perhaps spaces or underscores) should be readable and useful in many (though not all) contexts. Raymond From ncoghlan at gmail.com Mon Mar 16 13:03:41 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 16 Mar 2009 22:03:41 +1000 Subject: [Python-ideas] Rough draft: Proposed format specifier for athousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1><20090312083401.33cc525b@o> <50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com> Message-ID: <49BE401D.5050908@gmail.com> Raymond Hettinger wrote: > > [Chris Rebert] >> This thread continues to get more complicated by the day... >> (Localization doth be *hard*) > > Good thing the PEP is *not* about localization :-) > It does not attempt to cater to every possible way to write numbers. > Instead, it offers a handful of choices for thousands groupings. > At least one of those choices (perhaps spaces or underscores) > should be readable and useful in many (though not all) contexts. 
Emphatically agreed that this PEP shouldn't be targeted at end-user output for a commercial product. There are plenty of good solutions for that already in the l10n/i18n space. What is currently missing (and what the PEP will provide) is the ability to easily output more readable comparatively large integers for debugging output or quick and dirty "internal" scripts that are not intended for wide distribution. Having had my eyes glaze over attempting to decipher overly long integers in debugging output, I look forward to the day when I no longer have to write my own formatting functions to deal with that (even if the time when I can use 2.7 or 3.1 day to day is still somewhere in the dim distant future...) On a completely different topic, I noticed that the PEP doesn't currently state what the thousands separator means for bases other than 10 (i.e. octal, hex, binary). Is it ignored? Always delineates groups of 3 digits as for decimal numbers? Delineates an "appropriate" group size (e.g. 3 for octal, 4 for hex and binary)? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From denis.spir at free.fr Mon Mar 16 15:10:36 2009 From: denis.spir at free.fr (spir) Date: Mon, 16 Mar 2009 15:10:36 +0100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> <50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com> Message-ID: <20090316151036.67755239@o> Le Sun, 15 Mar 2009 15:34:54 -0700, Chris Rebert s'exprima ainsi: > commas are apparently the international /financial/ > standard Certainly not! I am very surprised to read that. Do you mean the standard in english-speaking countries? Or in countries which currency is $ or ?? 
see http://en.wikipedia.org/wiki/Decimal_point#Examples_of_use Denis ------ la vita e estrany From eric at trueblade.com Mon Mar 16 15:23:25 2009 From: eric at trueblade.com (Eric Smith) Date: Mon, 16 Mar 2009 10:23:25 -0400 Subject: [Python-ideas] Added a function to parse str.format() mini-language specifiers Message-ID: <49BE60DD.7010603@trueblade.com> I'd like to add a function (or method) to parse str.format()'s standard mini-language format specifiers. It's hard to get right, and if PEP 378 is accepted, it gets more complex. The primary use case is for non-builtin numeric types that want to add __format__, and want it to support the same mini-language that the built in types support. For example see issue 2110, where Mark Dickinson implements his own version for Decimal, and suggests it be moved elsewhere. This function exists in Objects/stringlib/formatter.h, and will just need to be exposed to Python code. I propose a function that takes a single str (or unicode) and returns a named tuple with the appropriate values filled in. So, is such a function desirable, and if so, where would it go? I could expose it through the string module, which is where the sort-of-related Formatter class lives. It could be a method on str and unicode, but I'm not sure that's most appropriate. Eric. 
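To make Eric's proposal concrete, a parser for the mini-language could be sketched in pure Python. This is a hypothetical sketch (the names `FormatSpec` and `parse_format_spec` are invented, and the real code lives in Objects/stringlib/formatter.h), following the documented grammar `[[fill]align][sign][#][0][width][,][.precision][type]` with PEP 378's ',' option included:

```python
import re
from collections import namedtuple

# Hypothetical result type for the proposed parsing function.
FormatSpec = namedtuple(
    "FormatSpec",
    "fill align sign alternate zero width comma precision type")

# Grammar: [[fill]align][sign][#][0][width][,][.precision][type]
_SPEC_RE = re.compile(r"""
    (?:(?P<fill>.)?(?P<align>[<>=^]))?
    (?P<sign>[-+\ ])?
    (?P<alternate>\#)?
    (?P<zero>0)?
    (?P<width>\d+)?
    (?P<comma>,)?
    (?:\.(?P<precision>\d+))?
    (?P<type>[bcdeEfFgGnosxX%])?
    $""", re.VERBOSE | re.DOTALL)

def parse_format_spec(spec):
    """Parse a standard format specifier into a FormatSpec tuple."""
    m = _SPEC_RE.match(spec)
    if m is None:
        raise ValueError("invalid format specifier: %r" % spec)
    return FormatSpec(**m.groupdict())

print(parse_format_spec("13,.5f"))  # width='13', comma=',', precision='5', type='f'
print(parse_format_spec("*<10d"))   # fill='*', align='<', width='10', type='d'
```

Fields that are absent come back as None, which is one argument for the named-tuple return value Eric suggests: every field has a fixed slot whether or not it appeared in the spec.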
From pyideas at rebertia.com Mon Mar 16 15:55:23 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Mon, 16 Mar 2009 07:55:23 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <20090316151036.67755239@o> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> <50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com> <20090316151036.67755239@o> Message-ID: <50697b2c0903160755r2cc830b9p222fa5748942ea5@mail.gmail.com> On Mon, Mar 16, 2009 at 7:10 AM, spir wrote: > Le Sun, 15 Mar 2009 15:34:54 -0700, > Chris Rebert s'exprima ainsi: > >> commas are apparently the international /financial/ >> standard > > Certainly not! I am very surprised to read that. Do you mean the standard in english-speaking countries? Or in countries which currency is $ or ?? > > see http://en.wikipedia.org/wiki/Decimal_point#Examples_of_use Yes, I know it varies from nation to nation, but apparently less so when specifically working internationally (like you, I was surprised there was a standard at all); see earlier response by Raymond Hettinger in the parallel c.l.p thread. Relevant quote: """ I'm a CPA, was a 15 year division controller for a Fortune 500 company, and an auditor for an international accounting firm. Believe me when I say it is the norm in finance. """ "It" referring to period-as-decimal-point and comma-as-thousands-separator notation. Cheers, Chris -- I have a blog: http://blog.rebertia.com From eric at trueblade.com Mon Mar 16 16:15:06 2009 From: eric at trueblade.com (Eric Smith) Date: Mon, 16 Mar 2009 11:15:06 -0400 Subject: [Python-ideas] Add a function to parse str.format() mini-language specifiers In-Reply-To: <49BE60DD.7010603@trueblade.com> References: <49BE60DD.7010603@trueblade.com> Message-ID: <49BE6CFA.9010808@trueblade.com> The subject shouldn't have said "Added". It's not a done deal! Eric. 
Eric Smith wrote: > I'd like to add a function (or method) to parse str.format()'s standard > mini-language format specifiers. It's hard to get right, and if PEP 378 > is accepted, it gets more complex. > > The primary use case is for non-builtin numeric types that want to add > __format__, and want it to support the same mini-language that the built > in types support. For example see issue 2110, where Mark Dickinson > implements his own version for Decimal, and suggests it be moved elsewhere. > > This function exists in Objects/stringlib/formatter.h, and will just > need to be exposed to Python code. I propose a function that takes a > single str (or unicode) and returns a named tuple with the appropriate > values filled in. > > So, is such a function desirable, and if so, where would it go? I could > expose it through the string module, which is where the sort-of-related > Formatter class lives. > > It could be a method on str and unicode, but I'm not sure that's most > appropriate. > > Eric. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From dickinsm at gmail.com Mon Mar 16 17:26:10 2009 From: dickinsm at gmail.com (Mark Dickinson) Date: Mon, 16 Mar 2009 16:26:10 +0000 Subject: [Python-ideas] Added a function to parse str.format() mini-language specifiers In-Reply-To: <49BE60DD.7010603@trueblade.com> References: <49BE60DD.7010603@trueblade.com> Message-ID: <5c6f2a5d0903160926t629bff1eif9444fec547e8a78@mail.gmail.com> On Mon, Mar 16, 2009 at 2:23 PM, Eric Smith wrote: > I'd like to add a function (or method) to parse str.format()'s standard > mini-language format specifiers. +1 from me. (Of course. :-) Once the 'n' format code goes in, decimal.py will contain over 200 lines of Python code that really has very little to do with the decimal module at all. 
I'd like to see that code move somewhere else, partly out of a desire to unclutter the decimal module, and partly to make it easier to cope with changes and new features in the formatting mini-language. Out of curiosity, does anyone know of any numeric types (other than Decimal) that might benefit from this? Something like the '_format_align' function from decimal.py might also be of general use: it just does the job of padding and aligning a numeric string (as well as dealing with the sign). > be exposed to Python code. I propose a function that takes a single str (or > unicode) and returns a named tuple with the appropriate values filled in. Are there advantages to using a named tuple instead of a dict? If there's a possibility that some fields may or may not be defined depending on the value of other fields, then a dict may make more sense. (Not sure whether this can happen with the mini-language in its current form.) > So, is such a function desirable, and if so, where would it go? Yes, and don't know! > It could be a method on str and unicode, but I'm not sure that's most > appropriate. Doesn't seem right to me, either. Mark From eric at trueblade.com Mon Mar 16 17:31:51 2009 From: eric at trueblade.com (Eric Smith) Date: Mon, 16 Mar 2009 12:31:51 -0400 Subject: [Python-ideas] Added a function to parse str.format() mini-language specifiers In-Reply-To: <5c6f2a5d0903160926t629bff1eif9444fec547e8a78@mail.gmail.com> References: <49BE60DD.7010603@trueblade.com> <5c6f2a5d0903160926t629bff1eif9444fec547e8a78@mail.gmail.com> Message-ID: <49BE7EF7.6090904@trueblade.com> Mark Dickinson wrote: > Something like the '_format_align' function from decimal.py > might also be of general use: it just does the job of padding > and aligning a numeric string (as well as dealing with the sign). Standby. That's next on my list of proposals. 
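As a sketch of the kind of shared helper Mark describes — hypothetical code in the spirit of decimal.py's `_format_align`, not the actual implementation — the padding/alignment step might look like:

```python
def align_numeric(sign, body, fill=" ", align=">", width=0):
    # Hypothetical helper: 'sign' is '', '-' or '+'; 'body' is the
    # already-formatted digit string. Pads to 'width' per the parsed
    # fill/align fields of a format specifier.
    text = sign + body
    padding = fill * max(0, width - len(text))
    if align == "<":
        return text + padding
    if align == ">":
        return padding + text
    if align == "^":
        half = len(padding) // 2
        return padding[:half] + text + padding[half:]
    if align == "=":  # pad between sign and digits, as a '0' fill does
        return sign + padding + body
    raise ValueError("unknown alignment: %r" % align)

print(align_numeric("-", "9876.54", fill="0", align="=", width=10))  # -009876.54
```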
From hwpuschm at yahoo.de Mon Mar 16 17:37:59 2009 From: hwpuschm at yahoo.de (Heinrich W Puschmann) Date: Mon, 16 Mar 2009 16:37:59 +0000 (GMT) Subject: [Python-ideas] Keyword same in right hand side of assignments Message-ID: <239732.95222.qm@web25802.mail.ukl.yahoo.com> Python-Ideas: Keyword same in right hand side of assignments ------------------------------------------------------------------------------------------- It is proposed to introduce a keyword "same", to be used in the right hand side of assignments, as follows: "xx = same + 5" or "xx = 5 + same" synonymous with "xx += 5" "value = 2*same + 5" synonymous with "value *= 2; value += 5" "switch = 1 - same" synonymous with "switch *= -1; switch += 1" "lst = same + [5,6]" synonymous with "lst += [5,6]" "lst = [5,6] + same" synonymous with "lst = [5,6] + lst" "lst[2] = 1/same" synonymous with "lst[2] **= -1" and so on. IN GENERALITY, the effect of the keyword same would be the following: The keyword same should only appear in the right hand side expression of an assignment, any number of times and wherever a name can appear. The left hand side of the assignment would have to be bound to some object. The keyword same is substituted by a name, which is bound to the object binding the left hand side of the assignment. The expression at the right hand side is evaluated. The left hand side identifier is bound to the result of the expression. Since I am not a developer, I have no idea of how difficult or how easy it would be to implement such a feature. As a programmer, however, I believe that it improves readability and user friendliness, and that it fully adjusts to the Python philosophy. It is very hard for me to discern whether a similar idea has already been proposed by somebody else. I did not yet have any useful application for statements like "xx = xx.do(*args)" or "xx = yy.do(xx,*args)" or "xx = xx.do(xx,*args)"
but as a matter of generalization, they should also allow substitution of the right hand side appearances of xx by the keyword same. Heinrich Puschmann, Ulm From python at rcn.com Mon Mar 16 17:42:52 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 09:42:52 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1><20090312083401.33cc525b@o><50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com><20090316151036.67755239@o> <50697b2c0903160755r2cc830b9p222fa5748942ea5@mail.gmail.com> Message-ID: <49380553492E4209BB05DA4149DD3C8F@RaymondLaptop1> > "It" referring to period-as-decimal-point and > comma-as-thousands-separator notation. Guys, I just meant that grouping of thousands is common in finance. The actual grouping separator varies. Raymond From eric at trueblade.com Mon Mar 16 17:51:24 2009 From: eric at trueblade.com (Eric Smith) Date: Mon, 16 Mar 2009 12:51:24 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> Message-ID: <49BE838C.9000906@trueblade.com> I vote we move ahead with Proposal II from PEP 378. I don't think there's anything else to add to the discussion. Eric.
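For readers following along: Proposal II places the ',' immediately before the precision in the standard format specifier. Assuming a Python that shipped PEP 378 (2.7/3.1 or later), it reads like this:

```python
# Proposal II of PEP 378: a ',' in the format spec groups thousands.
print(format(1234567, ",d"))    # 1,234,567
print("{:,}".format(1234567))   # 1,234,567
print(format(1234567, "15,d"))  # '      1,234,567' (width still applies)
```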
From pyideas at rebertia.com Mon Mar 16 18:24:40 2009 From: pyideas at rebertia.com (Chris Rebert) Date: Mon, 16 Mar 2009 10:24:40 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49380553492E4209BB05DA4149DD3C8F@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> <50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com> <20090316151036.67755239@o> <50697b2c0903160755r2cc830b9p222fa5748942ea5@mail.gmail.com> <49380553492E4209BB05DA4149DD3C8F@RaymondLaptop1> Message-ID: <50697b2c0903161024p72a8d1c9hb4e052242cf8aec8@mail.gmail.com> On Mon, Mar 16, 2009 at 9:42 AM, Raymond Hettinger wrote: > >> "It" referring to period-as-decimal-point and >> comma-as-thousands-separator notation. > > Guys, I just meant that grouping of thousands is common in finance. > The actual grouping separator varies. Ah, evidently I misinterpreted. Apologies. Cheers, Chris From zac256 at gmail.com Mon Mar 16 18:25:00 2009 From: zac256 at gmail.com (Zac Burns) Date: Mon, 16 Mar 2009 10:25:00 -0700 Subject: [Python-ideas] PEP links in docs Message-ID: <333edbe80903161025s67bb6960r49674c4f2b70cc75@mail.gmail.com> I would like to see links to relevant PEP / mailing list docs in the main docs. For example http://docs.python.org/library/dircache.html after "Deprecated since version 2.6" could include a (PEP x.x) link where I could read why it was deprecated and probably what to use in its place. The docs right now are quite excellent describing the "what" about everything, but often have little to say about the "why". This is a good thing, but links would surely help those that want to learn more. -- Zachary Burns (407)590-4814 Aim - Zac256FL Production Engineer (Digital Overlord) Zindagi Games From rdmurray at bitdance.com Mon Mar 16 17:48:50 2009 From: rdmurray at bitdance.com (R.
David Murray) Date: Mon, 16 Mar 2009 16:48:50 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <20090312083401.33cc525b@o> <50697b2c0903151534t6f5f1f57jb82ab30808e7c91f@mail.gmail.com> <20090316151036.67755239@o> <50697b2c0903160755r2cc830b9p222fa5748942ea5@mail.gmail.com> Message-ID: Chris Rebert wrote: > On Mon, Mar 16, 2009 at 7:10 AM, spir wrote: > > Le Sun, 15 Mar 2009 15:34:54 -0700, > > Chris Rebert s'exprima ainsi: > > > >> commas are apparently the international /financial/ > >> standard > > > > Certainly not! I am very surprised to read that. Do you mean the standard in english-speaking countries? Or in countries which currency is $ or ?? > > > > see http://en.wikipedia.org/wiki/Decimal_point#Examples_of_use > > Yes, I know it varies from nation to nation, but apparently less so > when specifically working internationally (like you, I was surprised > there was a standard at all); see earlier response by Raymond > Hettinger in the parallel c.l.p thread. Relevant quote: > > """ > I'm a CPA, was a 15 year division controller > for a Fortune 500 company, and an auditor for an international > accounting firm. Believe me when I say it is the norm in finance. > """ > > "It" referring to period-as-decimal-point and > comma-as-thousands-separator notation. Regardless of any standards, I find it interesting that I just now ran into exactly the use case that prompted Raymond to propose this addition. I need to format a one-off report for a client, and it would be _most_ helpful if I could easily tell Python to format the numbers, which are reporting bytes transmitted, with comma thousands separators for clarity. I guess that means I'm +1 for some form of this making it through. -- R. David Murray http://www.bitdance.com From rdmurray at bitdance.com Mon Mar 16 18:57:14 2009 From: rdmurray at bitdance.com (R. 
David Murray) Date: Mon, 16 Mar 2009 17:57:14 +0000 (UTC) Subject: [Python-ideas] Keyword same in right hand side of assignments References: <239732.95222.qm@web25802.mail.ukl.yahoo.com> Message-ID: Heinrich W Puschmann wrote: > > Python-Ideas: Keyword same in right hand side of assignments > ------------------------------------------------------------------------------------------- > > It is proposed to introduce a Keyword "same", > to be used in the right hand side of assignments, as follows: > ? > ? "xx = same + 5" or "xx = 5 + same"? synonymous with? "xx += 5" > ? "value =? 2*same + 5"? synonymous with "value =*2; value +=5" > ? "switch = 1 - same"? synonymous with "switch *-1; switch +=1" > ? "lst = same + [5,6]"? synonymous with? "lst += [5,6]" > ? "lst = [5,6] + same" synonymous with? "lst = [5,6] + lst" > ? "lst[2] = 1/same" synonymous with? "lst[2] **=-1" > ? > and so on. > ? > ? > ? > IN GENERALITY, the effect of keyword same would be the following: > > ? > The keyword same should only appear in > the right hand side expression of an assignment, > any number of times and wherever a name can appear. > The left hand side of the assignment would have to be bound to some object. > ? > The keyword same is substituted by a name, > which is bound to the object binding the left hand side of the assignment. > The? expression at the right hand side is evaluated. > The left hand side identifyer is bound to the result of the expression. Given your definition, this: "lst = same + [5,6]"? synonymous with? "lst += [5,6]" is not true. By your definition (and a programmer's naive expectation based on other python semantics), lst = same + [5,6] translates to lst = lst + [5,6]. But: >>> old = lst = [1] >>> lst = lst + [5,6] >>> old, lst ([1], [1, 5, 6]) >>> old is lst False While: >>> old = lst = [1] >>> lst += [5,6] >>> old, lst ([1, 5, 6], [1, 5, 6]) >>> old is lst True So the two are _not_ synonymous. That point aside, I do not see the utility of this feature. 
To me, it means that my eye would have to scan backward to the start of the line to find out what 'same' was, whereas in the current formulation the answer is right under my eyes. Code is more about reading than it is about writing, since reading code is something done much more often than writing it, so I'd rather keep it easier to read. Personally I wouldn't find typing 'same' any easier than typing the variable, anyway. -- R. David Murray http://www.bitdance.com From steve at pearwood.info Mon Mar 16 19:18:35 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 17 Mar 2009 05:18:35 +1100 Subject: [Python-ideas] Keyword same in right hand side of assignments In-Reply-To: <239732.95222.qm@web25802.mail.ukl.yahoo.com> References: <239732.95222.qm@web25802.mail.ukl.yahoo.com> Message-ID: <200903170518.36011.steve@pearwood.info> On Tue, 17 Mar 2009 03:37:59 am Heinrich W Puschmann wrote: > Python-Ideas: Keyword same in right hand side of assignments > --------------------------------------------------------------------- >---------------------- > > It is proposed to introduce a keyword "same", > to be used in the right hand side of assignments, as follows: > > "xx = same + 5" or "xx = 5 + same" synonymous with "xx += 5" > "value = 2*same + 5" synonymous with "value *= 2; value += 5" > "switch = 1 - same" synonymous with "switch *= -1; switch += 1" > "lst = same + [5,6]" synonymous with "lst += [5,6]" > "lst = [5,6] + same" synonymous with "lst = [5,6] + lst" > "lst[2] = 1/same" synonymous with "lst[2] **= -1" > > and so on. What's the point? Why not just write the following? xx = xx + 5 value = 2*value + 5 switch = 1 - switch lst = lst + [5,6] lst = [5,6] + lst lst[2] = 1/lst[2] What value do we gain by creating a new keyword that obscures what the assignment does?
-- Steven D'Aprano From denis.spir at free.fr Mon Mar 16 19:59:40 2009 From: denis.spir at free.fr (spir) Date: Mon, 16 Mar 2009 19:59:40 +0100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BE838C.9000906@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> Message-ID: <20090316195940.6e4e1061@o> Le Mon, 16 Mar 2009 12:51:24 -0400, Eric Smith s'exprima ainsi: > I vote we move ahead with Proposal II from PEP 378. I don't think > there's anything else to add to the discussion. > > Eric. Agree. denis ------ la vita e estrany From denis.spir at free.fr Mon Mar 16 19:59:09 2009 From: denis.spir at free.fr (spir) Date: Mon, 16 Mar 2009 19:59:09 +0100 Subject: [Python-ideas] PEP links in docs In-Reply-To: <333edbe80903161025s67bb6960r49674c4f2b70cc75@mail.gmail.com> References: <333edbe80903161025s67bb6960r49674c4f2b70cc75@mail.gmail.com> Message-ID: <20090316195909.6c81f5e6@o> Le Mon, 16 Mar 2009 10:25:00 -0700, Zac Burns s'exprima ainsi: > I would like to see links to relevant PEP / mailing list docs in the main > docs. +++ PEPs -- with purpose and rationale, and deeply reviewed -- are most often great. They are actually often more legible even for non-specialists, simply because they tell you why. If one doesn't understand the PEP's introduction, then yes, probably this person does not need the feature ;-)
start answering the infamous "why?"; meaning the purpose, issue,... I really do not agree with the "if you don't know why, you don't need it" (elitist) argument. Denis ------ la vita e estrany From python at rcn.com Mon Mar 16 20:19:24 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 12:19:24 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1><49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> Message-ID: Guido, The conversation on the thousands separator seems to have wound down and the PEP has stabilized: http://www.python.org/dev/peps/pep-0378/ Please pronounce. Raymond ----- Original Message ----- From: "spir" To: Sent: Monday, March 16, 2009 11:59 AM Subject: Re: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) > Le Mon, 16 Mar 2009 12:51:24 -0400, > Eric Smith s'exprima ainsi: > >> I vote we move ahead with Proposal II from PEP 378. I don't think >> there's anything else to add to the discussion. >> >> Eric. > > Agree. > > denis > ------ > la vita e estrany > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From sligocki at gmail.com Mon Mar 16 21:02:05 2009 From: sligocki at gmail.com (Shawn Ligocki) Date: Mon, 16 Mar 2009 13:02:05 -0700 Subject: [Python-ideas] PEP links in docs In-Reply-To: <333edbe80903161025s67bb6960r49674c4f2b70cc75@mail.gmail.com> References: <333edbe80903161025s67bb6960r49674c4f2b70cc75@mail.gmail.com> Message-ID: <2f680a9a0903161302v3a00494dhe912c402372346a4@mail.gmail.com> +2, Sign me up. The "why" is the most annoyingly avoided question in teaching anything. If someone leads this, I'll help contribute! 
Cheers, Shawn Ligocki sligocki at gmail.com On Mon, Mar 16, 2009 at 10:25 AM, Zac Burns wrote: > I would like to see links to relevant PEP / mailing list docs in the main > docs. > > For example http://docs.python.org/library/dircache.html after > "Deprecated since version 2.6" could include a (PEP x.x) link where I > could read why it was deprecated and probably what to use in it's > place. > > The docs right now are quite excellent describing the "what" about > everything, but often have little to say about the "why". This is a > good thing, but links would surely help those that want to learn more. > > -- > Zachary Burns > (407)590-4814 > Aim - Zac256FL > Production Engineer (Digital Overlord) > Zindagi Games > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Mon Mar 16 21:08:33 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 16 Mar 2009 16:08:33 -0400 Subject: [Python-ideas] Added a function to parse str.format() mini-language specifiers In-Reply-To: <49BE60DD.7010603@trueblade.com> References: <49BE60DD.7010603@trueblade.com> Message-ID: Eric Smith wrote: > I'd like to add a function (or method) to parse str.format()'s standard > mini-language format specifiers. It's hard to get right, and if PEP 378 > is accepted, it gets more complex. > > The primary use case is for non-builtin numeric types that want to add > __format__, and want it to support the same mini-language that the built > in types support. For example see issue 2110, where Mark Dickinson > implements his own version for Decimal, and suggests it be moved elsewhere. > > This function exists in Objects/stringlib/formatter.h, and will just > need to be exposed to Python code. 
I propose a function that takes a > single str (or unicode) and returns a named tuple with the appropriate > values filled in. > > So, is such a function desirable, and if so, Yes, but I would take it further and consider the string and dict/named-tuple as alternate interfaces to the formatting machinery. So I would a) add an inverse function that would take a dict or named tuple and produce the field specifier as a string (or raise ValueError). Such a string could be embedded into a complete format string. Some people might prefer this specification method. b) amend built-in format() to take a dict/n-t as the second argument on the basis that it is silly to transform the parse result back into a string just to be parsed again. This would make repeated calls to format faster by eliminating the parsing step. > where would it go? I could expose it through the string module, which is where the sort-of-related > Formatter class lives. That seems the most obvious place. > > It could be a method on str and unicode, but I'm not sure that's most > appropriate. Terry Jan Reedy From eric at trueblade.com Mon Mar 16 21:31:20 2009 From: eric at trueblade.com (Eric Smith) Date: Mon, 16 Mar 2009 16:31:20 -0400 Subject: [Python-ideas] Added a function to parse str.format() mini-language specifiers In-Reply-To: References: <49BE60DD.7010603@trueblade.com> Message-ID: <49BEB718.6020708@trueblade.com> Terry Reedy wrote: > Eric Smith wrote: >> I'd like to add a function (or method) to parse str.format()'s >> standard mini-language format specifiers. It's hard to get right, and >> if PEP 378 is accepted, it gets more complex. ... >> So, is such a function desirable, and if so, > > Yes, but I would take it further and consider the string and > dict/named-tuple as alternate interfaces to the formatting machinery. So > I would If the only use case for this is for non-builtin numeric types, I'd vote for a named tuple.
But since Mark (who's one of the primary users) also raised the dict issue, I'll give it some thought. > a) add an inverse function that would take a dict or named tuple and > produce the field specifier as a string (or raise ValueError). Such a > string could be embedded into a complete format string. Some people > might prefer this specification method. This is a pretty simple transformation. I'm not so sure it's all that useful. > b) amend built-in format() to take a dict/n-t as the second argument on > the basis that it is silly to transform the parse result back into a > string just to be parsed again. This would make repeated calls to > format faster by eliminating the parsing step. But format() works for any type, including ones that don't understand the standard mini-language. What would they do with this info? From tjreedy at udel.edu Mon Mar 16 21:53:35 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 16 Mar 2009 16:53:35 -0400 Subject: [Python-ideas] Keyword same in right hand side of assignments In-Reply-To: <239732.95222.qm@web25802.mail.ukl.yahoo.com> References: <239732.95222.qm@web25802.mail.ukl.yahoo.com> Message-ID: Heinrich W Puschmann wrote: > Python-Ideas: Keyword same in right hand side of assignments > ------------------------------------------------------------------------------------------- > > It is proposed to introduce a Keyword "same", > to be used in the right hand side of assignments, as follows: > > "xx = same + 5" or "xx = 5 + same" synonymous with "xx += 5" Only if + is commutative and xx is immutable, and even then, see below. > "value = 2*same + 5" synonymous with "value *= 2; value += 5" > "switch = 1 - same" synonymous with "switch *= -1; switch += 1" > "lst = same + [5,6]" synonymous with "lst += [5,6]" > "lst = [5,6] + same" synonymous with "lst = [5,6] + lst" > "lst[2] = 1/same" synonymous with "lst[2] **= -1" > > and so on.
This is an intriguing idea, which seems to extend the idea of augmented assignment, and which would seem most useful for complicated target expressions or for multiple uses of the pre-existing object bound to the target. It is similar in that the target must have a pre-existing binding. While it might work best for one target, it might not have to be so restricted. But one problem is that it changes and complicates the meaning of assignment statements. Currently, target-expression(s) = source-expression means evaluate source-expression; then evaluate the target expression(s) (left to right) and bind it (them) to the source object (or objects produced by iterating). This proposal requires that the target be evaluated first, resolved to an object, bound to 'same' (or the internal equivalent) and, after the assignment, unbound from 'same'. Since targets are expressions and not objects, I believe the target expression would have to be re-evaluated (without major change to the virtual machine) to make the binding, so this construction would not save a target evaluation and would not be synonymous with augmented assignment even if it otherwise would be. So lst[2] = 1/same would really be equivalent to ____ = lst[2]; lst[2] = 1/____; del ____ Another problem is that any new short keyword breaks code and therefore needs a strong justification that it will also improve a substantial amount of other code.
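Terry's expansion of `lst[2] = 1/same` can be made concrete. A minimal sketch of the proposed desugaring, assuming a hypothetical temporary name `_same` and an `index()` helper introduced purely to show that the target expression gets evaluated twice:

```python
# Hypothetical desugaring of the proposed "lst[2] = 1/same".
# The target expression is evaluated once to fetch the old value and
# again to store the new one, so its side effects happen twice.

calls = []

def index():
    calls.append(1)       # record each evaluation of the index expression
    return 2

lst = [10, 20, 4.0, 30]

_same = lst[index()]      # first evaluation: fetch the pre-existing value
lst[index()] = 1 / _same  # second evaluation: bind the new value
del _same

print(lst)                # [10, 20, 0.25, 30]
print(len(calls))         # 2 -- the index expression ran twice
```

The doubled evaluation is exactly why the construct would not be synonymous with `lst[2] **= -1`, which evaluates the target reference only once.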
Terry Jan Reedy From guido at python.org Mon Mar 16 22:05:47 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Mar 2009 14:05:47 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> Message-ID: On Mon, Mar 16, 2009 at 12:19 PM, Raymond Hettinger wrote: > The conversation on the thousands separator seems to have wound down > and the PEP has stabilized: http://www.python.org/dev/peps/pep-0378/ > > Please pronounce. That's not a PEP, it's just a summary of a discussion without any choice. :-) Typically PEPs put the discussion of alternatives in some section at the end, after the specification and other stuff relevant going forward. Just to add more fuel to the fire, did anyone propose refactoring the problem into (a) a way to produce output with a thousands separator, and (b) a way to localize formats? We could solve (a) by adding a comma to all numeric format languages along the lines of Nick's proposal, and we could solve (b) either now or later by adding some other flag that means "use locale-specific numeric formatting for this value". Or perhaps there could be two separate flags corresponding to the grouping and monetary arguments to locale.format(). I'd be happy to punt on (b) until later. This is somewhat analogous to the approach for strftime() which has syntax to invoke locale-specific formatting (%a, %A, %b, %B, %c, %p, %x, %X). I guess in the end this means I am in favor of Nick's alternative. One thing I don't understand: the PEP seems to exclude the 'e' and 'g' format. I would think that in case 'g' defers to 'f' it should act the same, and in case it defers to 'e', well, in the future (under (b) above) that could still change the period into a comma, right?
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Mon Mar 16 22:52:26 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 17 Mar 2009 07:52:26 +1000 Subject: [Python-ideas] Added a function to parse str.format() mini-language specifiers In-Reply-To: <49BE60DD.7010603@trueblade.com> References: <49BE60DD.7010603@trueblade.com> Message-ID: <49BECA1A.9020100@gmail.com> Eric Smith wrote: > So, is such a function desirable, and if so, where would it go? I could > expose it through the string module, which is where the sort-of-related > Formatter class lives. string.parse_format and string.build_format perhaps? The inverse operation would be useful if you just wanted to do something like "use a default precision of 3" but otherwise leave things up to the original object.

def custom_format(fmt, value):
    details = string.parse_format(fmt)
    if details["precision"] is None:  # Assumes None indicates missing
        details["precision"] = 3
    fmt = string.build_format(details)
    return format(fmt, value)

While having to rebuild and reparse the string is a little annoying, changing that would involve changing the spec for the __format__ magic method and I don't think we want to go there. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From g.brandl at gmx.net Mon Mar 16 22:56:24 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Mon, 16 Mar 2009 22:56:24 +0100 Subject: [Python-ideas] PEP links in docs In-Reply-To: <2f680a9a0903161302v3a00494dhe912c402372346a4@mail.gmail.com> References: <333edbe80903161025s67bb6960r49674c4f2b70cc75@mail.gmail.com> <2f680a9a0903161302v3a00494dhe912c402372346a4@mail.gmail.com> Message-ID: Feel free to send patches, as small as they may seem. Mark up PEP numbers in reST like this -- :pep:`42` -- to get automatic linking. Georg Shawn Ligocki schrieb: > +2, Sign me up.
> > The "why" is the most annoyingly avoided question in teaching anything. > > If someone leads this, I'll help contribute! > > Cheers, > > Shawn Ligocki > sligocki at gmail.com > > > On Mon, Mar 16, 2009 at 10:25 AM, Zac Burns > > wrote: > > I would like to see links to relevant PEP / mailing list docs in the > main docs. > > For example http://docs.python.org/library/dircache.html after > "Deprecated since version 2.6" could include a (PEP x.x) link where I > could read why it was deprecated and probably what to use in it's > place. > > The docs right now are quite excellent describing the "what" about > everything, but often have little to say about the "why". This is a > good thing, but links would surely help those that want to learn more. > > -- > Zachary Burns > (407)590-4814 > Aim - Zac256FL > Production Engineer (Digital Overlord) > Zindagi Games > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > > > > ------------------------------------------------------------------------ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. 
From python at rcn.com Mon Mar 16 23:05:24 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 15:05:24 -0700 Subject: [Python-ideas] PEP links in docs References: <333edbe80903161025s67bb6960r49674c4f2b70cc75@mail.gmail.com><2f680a9a0903161302v3a00494dhe912c402372346a4@mail.gmail.com> Message-ID: <671579EA9C314A70AE166A0173289355@RaymondLaptop1> > Feel free to send patches, as small as they may seem. Mark up PEP numbers > in reST like this -- :pep:`42` -- to get automatic linking. We might want to include a PEP index in the documentation but I think it's a really bad idea to include links from within the docs. The PEPs get out of date quickly. They document an early decision but not its ultimate evolution and diffusion through-out the language. Raymond From tjreedy at udel.edu Mon Mar 16 23:09:47 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 16 Mar 2009 18:09:47 -0400 Subject: [Python-ideas] Added a function to parse str.format() mini-language specifiers In-Reply-To: <49BEB718.6020708@trueblade.com> References: <49BE60DD.7010603@trueblade.com> <49BEB718.6020708@trueblade.com> Message-ID: Eric Smith wrote: > Terry Reedy wrote: >> Eric Smith wrote: >>> I'd like to add a function (or method) to parse str.format()'s >>> standard mini-language format specifiers. It's hard to get right, and >>> if PEP 378 is accepted, it gets more complex. > ... >>> So, is such a function desirable, and if so, >> >> Yes, but I would take it further and and consider the string and >> dict/named-tuple as alternate interfaces to the formatting machinery. >> So I would > > If the only use case for this is for non-builtin numeric types, I'd vote > for a named tuple. But since Mark (who's one of the primary users) also > raised the dict issue, I'll give it some thought. > >> a) add an inverse function that would take a dict or named tuple and >> produce the field specifier as a string (or raise ValueError). 
Such a >> string could be embedded into a complete format string. Some people >> might prefer this specification method. > > This is a pretty simple transformation. I'm not so sure it's all that > useful. > >> b> amend built-in format() to take a dict/n-t as the second argument >> on the basis that it is silly to transform the parse result back into >> a string just to be parsed again. This would make repeated calls to >> format faster by eliminating the parsing step. > > But format() works for any type, including ones that don't understand > the standard mini-language. What would they do with this info? The same thing they would do (whatever that is) if the second argument were instead the equivalent unparsed format-spec string. The easiest implementation of what I am proposing would be for the parse_spec function whose output you propose to expose were to recognize when its input is not a string but previous parse output. Just as iter(iter(ob)) is iter(ob), parse(parse(spec)) should be the same as parse(spec). tjr From tjreedy at udel.edu Mon Mar 16 23:19:55 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 16 Mar 2009 18:19:55 -0400 Subject: [Python-ideas] Added a function to parse str.format() mini-language specifiers In-Reply-To: <49BECA1A.9020100@gmail.com> References: <49BE60DD.7010603@trueblade.com> <49BECA1A.9020100@gmail.com> Message-ID: Nick Coghlan wrote: > Eric Smith wrote: >> So, is such a function desirable, and if so, where would it go? I could >> expose it through the string module, which is where the sort-of-related >> Formatter class lives. > > string.parse_format and string.build_format perhaps? The inverse > operation would be useful if you just wanted to do something like "use a > default precision of 3" but otherwise leave things up to the original > object. 
>
> def custom_format(fmt, value):
>     details = string.parse_format(fmt)
>     if details["precision"] is None:  # Assumes None indicates missing
>         details["precision"] = 3
>     fmt = string.build_format(details)
>     return format(fmt, value)

return format(value, fmt)  # ;-)

> While having to rebuild and reparse the string is a little annoying, yes > changing that would involve changing the spec for the __format__ magic > method and I don't think we want to go there. If parse_format were idempotent for its output like iter, then the change would, I think, be pretty minimal. I am assuming here that the __format__ method calls the parse_format(fmt) function that Eric proposed to expose. If details == parse_format(details) == parse_format(build_format(details)), then the rebuild and reparse is not needed and passing details instead of the rebuilt string should be transparent to __format__. tjr From python at rcn.com Mon Mar 16 23:24:26 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 15:24:26 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> Message-ID: <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> [Guido van Rossum] > Typically PEPs put the discussion of alternatives in some > section at the end, after the specification and other stuff relevant > going forward. Okay, re-arranged to make it more peppy. > I guess in the end this means I am in favor of Nick's alternative. Was hoping you would be more attracted to the other proposal which meets more people's needs right out of the box. No matter what country you're in, it's nice to have the option to switch to spaces or underscores regardless of your local convention. In the end, most respondents seemed to support the more flexible version (Eric's proposal).
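The idempotence Terry asks for -- details == parse_format(details) -- can be sketched with a toy parser. `parse_format` here is hypothetical (the real function was never added in this form) and handles only a "[width][.precision][type]" subset of the mini-language:

```python
# Toy idempotent parse_format: already-parsed input passes through
# unchanged, so callers may supply either a spec string or its parse,
# the way iter(iter(ob)) is iter(ob).

def parse_format(spec):
    if isinstance(spec, dict):        # already parsed: return as-is
        return spec
    type_char = spec[-1] if spec and spec[-1].isalpha() else ''
    body = spec[:-1] if type_char else spec
    if '.' in body:
        width, precision = body.split('.', 1)
    else:
        width, precision = body, ''
    return {'width': width, 'precision': precision, 'type': type_char}

details = parse_format('8.1f')
assert details == {'width': '8', 'precision': '1', 'type': 'f'}
assert parse_format(details) is details   # idempotent, like iter()
```

With this property, a `__format__` method calling `parse_format` could transparently accept either the string spec or the parsed structure, avoiding the rebuild-and-reparse round trip.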
> One thing I don't understand: the PEP seems to exclude the 'e' and 'g' > format. I would think that in case 'g' defers to 'f' it should act the > same, and in case it defers to 'e', well, in the future (under (b) > above) that could still change the period into a comma, right? Makes sense. So noted in the PEP. Raymond From ncoghlan at gmail.com Mon Mar 16 23:37:41 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 17 Mar 2009 08:37:41 +1000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> Message-ID: <49BED4B5.2060100@gmail.com> Raymond Hettinger wrote: >> I guess in the end this means I am in favor of Nick's alternative. > > Was hoping you would be more attracted to the other proposal > which more people's needs right out of the box. No matter > what country you're in, it's nice to have the option to switch > to spaces or underscores regardless of your local convention. I actually prefer proposal II as well. It provides a decent quick solution for one-off scripts and debugging output, while leaving proper l10n/i18n support to the appropriate (heavier) tools. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From guido at python.org Mon Mar 16 23:46:37 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Mar 2009 15:46:37 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> Message-ID: On Mon, Mar 16, 2009 at 3:24 PM, Raymond Hettinger wrote: > [Guido van Rossum] >> >> Typically PEPs put the discussion of alternatives in some >> section at the end, after the specification and other stuff relevant >> going forward. > > Okay, re-arranged to make it more peppy. >> I guess in the end this means I am in favor of Nick's alternative. > > Was hoping you would be more attracted to the other proposal > which meets more people's needs right out of the box. No matter > what country you're in, it's nice to have the option to switch > to spaces or underscores regardless of your local convention. Your preference wasn't clear from the PEP. :-) > In the end, most respondents seemed to support the more flexible > version (Eric's proposal). Well, Python survived for about 19 years without having a way to override the decimal point *except* by using the locale module. I guess that divides our users in two classes: (1) Those for whom the default (C) locale is sufficient -- either because they live in the US (1a), or because they're used to programming languages' US-centric approach (1b). (2) Those who absolutely need their numbers formatted for a locale -- either because they want to write heavy-duty localized code (2a), or because their locale doesn't use a comma and their end users would be upset to see US-formatted numbers (2b).
For category (1), Nick's minimal proposal is good enough; someone in category (1b) who can live with a US-centric decimal point can also live with a US-centric thousands separator. For category (2a), Eric's proposal is not good enough. Which leaves category (2b), which must be pretty small because they've apparently put up with using the locale module anyways. >> One thing I don't understand: the PEP seems to exclude the 'e' and 'g' >> format. I would think that in case 'g' defers to 'f' it should act the >> same, and in case it defers to 'e', well, in the future (under (b) >> above) that could still change the period into a comma, right? > > Makes sense. So noted in the PEP. On Mon, Mar 16, 2009 at 3:37 PM, Nick Coghlan wrote: > I actually prefer proposal II as well. It provides a decent quick > solution for one-off scripts and debugging output, while leaving proper > l10n/i18n support to the appropriate (heavier) tools. For debugging output and one-offs I don't think the period-vs-comma issue matters much; I'd expect those all to fall in category (1). Another way to look at it is: adding a thousands separator makes a *huge* difference for a large group of potential users, because interpreting numbers with more than 5 or 6 digits is very cumbersome otherwise. However adding a facility to specify a different character for the decimal point and for the separator only matters for a much smaller group of people (2b only), and IMO isn't worth the extra syntactic complexities. I would much rather add syntactic complexity to address a larger issue like (2a). I also have to say that I find Eric's proposal a bit ambiguous: why shouldn't {:8,d} mean "insert commas between thousands"?
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Tue Mar 17 00:02:52 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 16:02:52 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> Message-ID: <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> > I also have to say that I find Eric's proposal a bit ambiguous: why > shouldn't {:8,d} mean "insert commas between thousands"? It does. That is the sixth example listed:

    format(1234, "8.1f")     -->    '  1234.0'
    format(1234, "8,1f")     -->    '  1234,0'
    format(1234, "8.,1f")    -->    ' 1.234,0'
    format(1234, "8 ,f")     -->    ' 1 234,0'
    format(1234, "8d")       -->    '    1234'
    format(1234, "8,d")      -->    '   1,234'
    format(1234, "8_d")      -->    '   1_234'

Raymond From guido at python.org Tue Mar 17 00:06:12 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Mar 2009 16:06:12 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> Message-ID: On Mon, Mar 16, 2009 at 4:02 PM, Raymond Hettinger wrote: >> I also have to say that I find Eric's proposal a bit ambiguous: why >> shouldn't {:8,d} mean "insert commas between thousands"? > > It does. That is the sixth example listed:
>
>  format(1234, "8.1f")     -->    '  1234.0'
>  format(1234, "8,1f")     -->    '  1234,0'
>  format(1234, "8.,1f")    -->    ' 1.234,0'
>  format(1234, "8 ,f")     -->    ' 1 234,0'
>  format(1234, "8d")       -->    '    1234'
>  format(1234, "8,d")      -->    '   1,234'
>  format(1234, "8_d")      -->    '   1_234'

Argh! So "8,1f" means "use comma instead of point" whereas "8,1d" means "use comma as 1000 separator"? You guys can't seriously propose that. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Tue Mar 17 00:25:58 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 16:25:58 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> Message-ID: <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> > Argh! So "8,1f" means "use comma instead of point" whereas "8,1d" > means "use comma as 1000 separator"? They both mean use the comma for the thousands separator. The decimal separator only gets overridden as part of the precision specification if provided: format(1234, "8,1f") --> ' 1234,0' Originally, I proposed prefixing the thousands separator with the letter T: format(1234, "8T,d") --> ' 1,234'. That made it crystal clear that the next character was the thousands separator. But people found it to be ugly and reacted badly. Eric then noticed that the T wasn't essential as long as the decimal separator is tightly associated with the precision specifier. If you find that to be screwy, then I guess Nick's comma-only alternative wins. Or, there is an alternative that is a little more flexible. Make the thousands separator one of SPACE, UNDERSCORE, COMMA, or APOSTROPHE, leaving out the DOT which is reserved to be the sole decimal separator. That is unambiguous but doesn't help folks who want both a DOT thousands separator and COMMA decimal separator.
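Raymond's rule -- the separator attached to the precision is the decimal separator, and any separator before it marks thousands -- can be sketched as a toy parser. `parse_sep_spec` is hypothetical and covers only spec shapes like those in the examples above; it is an illustration of the proposed semantics, not the proposed implementation:

```python
import re

# Toy parser for the separator rule in the examples: for 'f', the
# separator introducing the precision is the decimal separator and any
# separator before it marks thousands; for 'd', a lone separator
# marks thousands.

def parse_sep_spec(spec):
    m = re.match(r"(\d*)([.,_' ]?)([.,_' ]?)(\d*)([df])$", spec)
    if m is None:
        raise ValueError("unsupported toy spec: %r" % spec)
    width, sep1, sep2, precision, typ = m.groups()
    if typ == "d":
        return {"width": width, "thousands": sep1, "type": typ}
    if sep2:                      # two separators: thousands, then decimal
        thousands, decimal = sep1, sep2
    else:                         # one (or none): it is the decimal point
        thousands, decimal = "", sep1 or "."
    return {"width": width, "thousands": thousands,
            "decimal": decimal, "precision": precision, "type": typ}

assert parse_sep_spec("8,1f")["decimal"] == ","      # "8,1f"  -> 1234,0
assert parse_sep_spec("8.,1f")["thousands"] == "."   # "8.,1f" -> 1.234,0
assert parse_sep_spec("8,d")["thousands"] == ","     # "8,d"   -> 1,234
```

The branch on the type character makes the ambiguity Guido objects to explicit: the very same spec character means "decimal point" for 'f' and "thousands separator" for 'd'.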
Raymond From guido at python.org Tue Mar 17 00:42:10 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Mar 2009 16:42:10 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> Message-ID: On Mon, Mar 16, 2009 at 4:25 PM, Raymond Hettinger wrote: >> Argh! So "8,1f" means "use comma instead of point" whereas "8,1d" >> means "use comma as 1000 separator"? > > They both mean use the comma for the thousands separator. The decimal > separator only gets overridden as part of the precision specification if > provided: format(1234, "8,1f") --> ' 1234,0' So I misread, but it is exceedingly subtle indeed: apparently if there's *one* special character it's the decimal point with 'f' and the thousands separator with 'd'; only 'f' supports *two* special characters and then the *first* one is the decimal point. The fact that we need so many emails to sort this out makes it clear that this proposal will lead to endless user confusion. > Originally, I proposed prefixing the thousands separator with the letter T: > format(1234, "8T,d") --> ' 1,234'. That made it crystal > clear that the next character was the thousands separator. But people found > it to be ugly and reacted badly. Eric then noticed that the T wasn't > essential as long as the decimal separator is tightly associated with the > precision specifier. > > If you find that to be screwy, then I guess Nick's comma-only alternative > wins. Yes. > Or, there is an alternative that is a little more flexible. Make the > thousands separator one of SPACE, UNDERSCORE, COMMA, or APOSTROPHE, leaving > out the DOT which is reserved to be the sole decimal separator. That is > unambiguous but doesn't help folks who want both a DOT thousands separator > and COMMA decimal separator. Right. Let's go ahead with Nick's proposal and put ways of specifying alternate separators (either via the locale or hardcoded) on the back burner. Note that, unlike with the original % syntax, in .format() strings we can easily append extra syntax to the end. E.g. format(1234.5, "08,.1f;L") could mean "use the locale", whereas format(1234.5, "08,.1f;T=_;D=,") could mean "use '_' for thousands, ',' for decimal point". But please, let's put this off and get Nick's simple proposal in first. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Tue Mar 17 00:45:56 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Mar 2009 11:45:56 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <20090316195940.6e4e1061@o> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> Message-ID: <49BEE4B4.6040709@canterbury.ac.nz> > Eric Smith s'exprima ainsi: > >>I vote we move ahead with Proposal II from PEP 378. Looks fairly good to me.
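For reference, the comma-only form Guido endorses here is the syntax that PEP 378 eventually standardized (Python 2.7/3.1+), so its behavior can be shown directly in any modern Python:

```python
# The comma-only alternative: ',' in the format spec always means
# "group thousands with commas"; the decimal point stays '.'.

print(format(1234, "8,d"))           # '   1,234'
print(format(1234567.891, ",.1f"))   # '1,234,567.9'
print("{:,}".format(1234567))        # '1,234,567'
```

There is no way in this form to swap the comma and period, which is exactly the simplification being argued for: alternate separators are deferred to the locale machinery or to later syntax.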
-- Greg From guido at python.org Tue Mar 17 00:48:50 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Mar 2009 16:48:50 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BEE4B4.6040709@canterbury.ac.nz> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <49BEE4B4.6040709@canterbury.ac.nz> Message-ID: On Mon, Mar 16, 2009 at 4:45 PM, Greg Ewing wrote: >> Eric Smith s'exprima ainsi: >> >>> >>> I vote we move ahead with Proposal II from PEP 378. > > Looks fairly good to me. Of course this is by now ambiguous -- the latest version of the PEP no longer numbers the versions I and II, and has Nick's version second. (Which may be reversed by the time you read this if Raymond keeps updating the PEP in real time. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From aahz at pythoncraft.com Tue Mar 17 00:54:44 2009 From: aahz at pythoncraft.com (Aahz) Date: Mon, 16 Mar 2009 16:54:44 -0700 Subject: [Python-ideas] PEP links in docs In-Reply-To: <671579EA9C314A70AE166A0173289355@RaymondLaptop1> References: <671579EA9C314A70AE166A0173289355@RaymondLaptop1> Message-ID: <20090316235444.GB26292@panix.com> On Mon, Mar 16, 2009, Raymond Hettinger wrote: >attribution for Georg Brandl deleted: >> >> Feel free to send patches, as small as they may seem. Mark up PEP numbers >> in reST like this -- :pep:`42` -- to get automatic linking. > > We might want to include a PEP index in the documentation > but I think it's a really bad idea to include links from within > the docs. The PEPs get out of date quickly. They document > an early decision but not its ultimate evolution and diffusion > through-out the language. 
There are two reasons to link to PEPs:

* Provide the historical context
* Give detailed info lacking in the docs

The first purpose will always exist, and I see no reason to delete such a link if documented as such. Links for the latter purpose can be removed when the docs are updated to match what's available in the PEP. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ Adopt A Process -- stop killing all your children! From tjreedy at udel.edu Tue Mar 17 01:01:53 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 16 Mar 2009 20:01:53 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> Message-ID: Guido van Rossum wrote: > On Mon, Mar 16, 2009 at 12:19 PM, Raymond Hettinger wrote: >> The conversation on the thousands separator seems to have wound down >> and the PEP has stabilized: http://www.python.org/dev/peps/pep-0378/ >> >> Please pronounce. > > That's not a PEP, it's just a summary of a discussion without any > choice. :-) I hope Raymond can understand this. To me, the choice presented is to add the Main Proposal syntax extension, or not. > Typically PEPs put the discussion of alternatives in some > section at the end, after the specification and other stuff relevant > going forward. You want more alternatives than Nick's Alternative Proposal, discussed at the end? I believe most of the other ideas on the list were directed at some sense of (b) below. > Just to add more fuel to the fire, did anyone propose refactoring the > problem into (a) a way to produce output with a thousands separator, > and (b) a way to localize formats? Since a way to produce output with a choice of thousands separators is a necessary part of a way to localize formats, I am not sure of what distinction you are trying to draw. 
'Localize formats' has two quite distinct meanings: 'format this number in a particular way (which can vary from number to number or at least user to user)' versus 'format all numbers according to a particular national standard'. > We could solve (a) by adding a > comma to all numeric format languages along Nick's proposal, Raymond's current proposal, based on discussion, is to offer users a choice of 5 chars as thousands separators (and allow a choice of decimal separator). Nick's proposal is to only offer comma as thousands separator. While the latter meets my current parochial needs, I favor the more inclusive approach. > and we could solve (b) either now Raymond's main proposal partially solves that now (which is to say, completely solves that now for most of the world) in the first sense I gave for (b), on a case-by-case basis. > or later by adding some other flag that > means "use locale-specific numeric formatting for this value". As I understand from Raymond's introductory comments and those in the locale module docs, the global C locale setting is not intended to be changed on an output-by-output basis. Hence, while useful for nationalizing software, it is not so useful for individualized output from global software. > perhaps there could be two separate flags corresponding to the > grouping and monetary arguments to locale.format(). The flags just say to use the global locale settings, which have the limitations indicated above. Raymond's proposal is that a Python programmer should be better able to say "Format this number how I (or a particular user) want it to be formatted, regardless of the 'locale' setting". > I'd be happy to punt on (b) until later. > This is somewhat analogous to the approach for strftime() which has > syntax to invoke locale-specific formatting (%a, %A, %b, %B, %c, %p, > %x, %X). With the attendant pluses and minuses. > I guess in the end this means I am in favor of Nick's alternative. 
I fail to see how this follows from your previous comments. > One thing I don't understand: the PEP seems to exclude the 'e' and 'g' > format. Both proposals claim to include e and g. However, since thousands separators only apply to the left of the decimal point, and e notation only has one digit to the left, no thousands separator proposal will apply to e (and g when it produces e). The only known separator used to the left is a space, typically in groups of 5 digits, in some math tables. The decimal separator part of the PEP *does* apply to e and g. > I would think that in case 'g' defers to 'f' it should act the > same, and in case it defers to 'e', well, in the future (under (b) > above) that could still change the period into a comma, right? With the main proposal, one could simply specify, for instance, '8,1f' instead of '8.1f' to make that change *now*. I consider that much better than post-processing, which Nick's alternative would continue to require, and which gets worse with thousands separators added. Terry Jan Reedy From python at rcn.com Tue Mar 17 01:12:20 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 17:12:20 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1><49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o><49BEE4B4.6040709@canterbury.ac.nz> Message-ID: >>>> I vote we move ahead with Proposal II from PEP 378. >> >> Looks fairly good to me. > > Of course this is by now ambiguous -- the latest version of the PEP no > longer numbers the versions I and II, and has Nick's version second. > (Which may be reversed by the time you read this if Raymond keeps > updating the PEP in real time. :-) To keep the conversation in sync with today's real-time updates, I've put back in the "perma-names", Proposal I (nick's) and Proposal II (eric's). 
Raymond From greg.ewing at canterbury.ac.nz Tue Mar 17 01:36:39 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Mar 2009 12:36:39 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <49BEE4B4.6040709@canterbury.ac.nz> Message-ID: <49BEF097.2080408@canterbury.ac.nz> Guido van Rossum wrote: > Of course this is by now ambiguous -- the latest version of the PEP no > longer numbers the versions I and II To be clear, I'm in favour of Nick's version. (I share your concern about the apparent ambiguities in Eric's version -- it confused me too the first few times I read it!) -- Greg From greg.ewing at canterbury.ac.nz Tue Mar 17 01:41:11 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Mar 2009 12:41:11 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> Message-ID: <49BEF1A7.3030007@canterbury.ac.nz> Concerning the difficulty of exchanging "." and "," by post-processing, it might be generally useful to have a swap(s1, s2) method on strings that would replace occurrences of s1 by s2 and vice versa. -- Greg From rdmurray at bitdance.com Tue Mar 17 02:55:51 2009 From: rdmurray at bitdance.com (R. 
David Murray) Date: Tue, 17 Mar 2009 01:55:51 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <49BEE4B4.6040709@canterbury.ac.nz> <49BEF097.2080408@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > To be clear, I'm in favour of Nick's version. > > (I share your concern about the apparent ambiguities > in Eric's version -- it confused me too the first > few times I read it!) I'll chime in in favor of the simpler proposal and leaving the 'specify what characters to use' ability for later. That's the way I've felt from the beginning of the discussion, for what it's worth. It feels like the factoring Guido talked about ("yes I want thousands separators" and then separately "here's what I want to use for thousands/decimal separators") is the correct way to break down the problem. -- R. David Murray http://www.bitdance.com From guido at python.org Tue Mar 17 03:06:30 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Mar 2009 19:06:30 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> Message-ID: Our emails crossed. On Mon, Mar 16, 2009 at 5:01 PM, Terry Reedy wrote: > Guido van Rossum wrote: >> >> On Mon, Mar 16, 2009 at 12:19 PM, Raymond Hettinger >> wrote: >>> >>> The conversation on the thousands separator seems to have wound down >>> and the PEP has stabilized: ?http://www.python.org/dev/peps/pep-0378/ >>> >>> Please pronounce. >> >> That's not a PEP, it's just a summary of a discussion without any >> choice. :-) > > I hope Raymond can understand this. ?To me, the choice presented is to add > the Main Proposal syntax extension, or not. 
> > ?Typically PEPs put the discussion of alternatives in some >> >> section at the end, after the specification and other stuff relevant >> going forward. > > You want more alternatives than the Nick's Alternative Proposal, discussed > at the end? ?I believe most of the other ideas on the list were directed at > some sense of (b) below. > >> Just to add more fuel to the fire, did anyone propose refactoring the >> problem into (a) a way to produce output with a thousands separator, >> and (b) a way to localize formats? > > Since a way to produce output with a choice of thousands separators is a > necessary part of a way localize formats, I am not sure of what distinction > you are trying to draw. > > 'Localize formats' has two quite distinct meanings: 'format this number in a > particular way (which can vary from number to number or at least user to > user)' versus 'format all numbers according to a particular national > standard'. > >> We could solve (a) by adding a >> >> comma to all numeric format languages along Nick's proposal, > > Raymond current proposal, based on discussion, is to offer users a choice of > 5 chars as thousands separators (and allow a choice of decimal separator). > ?Nick's proposal is to only offer comma as thousands separator. ?While the > latter meets my current parochial needs, I favor the more inclusive > approach. > >> and we could solve (b) either now > > Raymond's main proposal partially solves that now (which is to say, > completely solves than now for most of the world) in the first sense I gave > for (b), on a case-by-case basis. > >> or later by adding some other flag that >> >> means "use locale-specific numeric formatting for this value". > > As I understand from Raymond's introductory comments and those in the locale > module docs, the global C locale setting is not intended to be changed on an > output-by-output basis. 
?Hence, while useful for nationalizing software, it > is not so useful for individualized output from global software. > >> perhaps there could be two separate flags corresponding to the >> grouping and monetary arguments to locale.format(). > > The flags just say to use the global locale settings, which have the > limitations indicated above. ?Raymond's proposal is that a Python programmer > should be better able to say "Format this number how I (or a particular > user) want it to be formatted, regardless of the 'locale' setting". > >> I'd be happy to punt on (b) until later. > >> This is somewhat analogous to the approach for strftime() which has >> syntax to invoke locale-specific formatting (%a, %A, %b, %B, %c, %p, >> %x, %X). > > With the attendant pluses and minuses. > >> I guess in the end this means I am in favor of Nick's alternative. > > I fail to see how this follows from your previous comments. > >> One thing I don't understand: the PEP seems to exclude the 'e' and 'g' >> format. > > Both proposals claim to include e and g. ?However, since thousands > separators only apply to the left of the decimal point, and e notation only > has one digit to the left, no thousands separator proposal will apply the e > (and g when it produces e). ?The only known separator used to the left is a > space, typically in groups of 5 digits, in some math tables. ?The decimal > separator part of the PEP *does* apply to e and g. > >> I would think that in case 'g' defers to 'f' it should act the >> same, and in case it defers to 'e', well, in the future (under (b) >> above) that could still change the period into a comma, right? > > With the main proposal, one could simply specify, for instance, '8,1f' > instead of '8.1f' to make that change *now*. ?I consider that much better > than post-processing, which Nick's alternative would continue to require, > and which gets worse with thousands separators added. 
> > Terry Jan Reedy > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Tue Mar 17 03:33:16 2009 From: python at rcn.com (Raymond Hettinger) Date: Mon, 16 Mar 2009 19:33:16 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> Message-ID: > Right. Let's go ahead with Nick's proposal and put ways of specifying > alternate separators (either via the locale or hardcoded) on the back > burner Mark PEP 378 as accepted with Nick's original comma-only version? Raymond From guido at python.org Tue Mar 17 04:14:58 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 16 Mar 2009 20:14:58 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> Message-ID: On Mon, Mar 16, 2009 at 7:33 PM, Raymond Hettinger wrote: >> Right. Let's go ahead with Nick's proposal and put ways of specifying >> alternate separators (either via the locale or hardcoded) on the back >> burner > > Mark PEP 378 as accepted with Nick's original comma-only version? OK, done. Looking forward to a swift implementation! 
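PEP 378 did get its swift implementation: the accepted comma-only option shipped in Python 2.7 and 3.1, so the behavior agreed on here can be checked directly in any modern interpreter. A minimal sketch:

```python
# PEP 378's accepted comma option: a ',' in the format spec groups
# thousands with commas (implemented in Python 2.7 / 3.1).
print(format(1234567, ",d"))        # integer grouping
print(format(1234567.891, ",.2f"))  # works with precision for floats
print("{:,}".format(1234567))       # same option via str.format
```

The comma is purely presentational; the locale-driven and custom-separator variants discussed above were deliberately left on the back burner.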
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From eric at trueblade.com Tue Mar 17 09:23:44 2009 From: eric at trueblade.com (Eric Smith) Date: Tue, 17 Mar 2009 04:23:44 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> Message-ID: <49BF5E10.6060205@trueblade.com> Guido van Rossum wrote: > On Mon, Mar 16, 2009 at 7:33 PM, Raymond Hettinger wrote: >>> Right. Let's go ahead with Nick's proposal and put ways of specifying >>> alternate separators (either via the locale or hardcoded) on the back >>> burner >> Mark PEP 378 as accepted with Nick's original comma-only version? > > OK, done. Looking forward to a swift implementation! > I'm on it. Eric. From dickinsm at gmail.com Tue Mar 17 10:01:27 2009 From: dickinsm at gmail.com (Mark Dickinson) Date: Tue, 17 Mar 2009 09:01:27 +0000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> Message-ID: <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> On Tue, Mar 17, 2009 at 3:14 AM, Guido van Rossum wrote: > On Mon, Mar 16, 2009 at 7:33 PM, Raymond Hettinger wrote: >> Mark PEP 378 as accepted with Nick's original comma-only version? > > OK, done. Looking forward to a swift implementation! I'll implement this for Decimal; it shouldn't take long. One question from the PEP, which I've been too slow to read until this morning: should commas appear in the zero-filled part of a number? 
That is, should format(1234, "09,d") give '00001,234' or '0,001,234'? The PEP specifies that format(1234, "08,d") should give '0001,234', but that's something of a special case: ',001,234' isn't really a viable alternative. Mark From cmjohnson.mailinglist at gmail.com Tue Mar 17 10:30:12 2009 From: cmjohnson.mailinglist at gmail.com (Carl Johnson) Date: Mon, 16 Mar 2009 23:30:12 -1000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BEF1A7.3030007@canterbury.ac.nz> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49BE838C.9000906@trueblade.com> <20090316195940.6e4e1061@o> <49BEF1A7.3030007@canterbury.ac.nz> Message-ID: <3bdda690903170230w24a9352cu9621d87fc0b0d59f@mail.gmail.com> Greg Ewing wrote: > Concerning the difficulty of exchanging "." and "," by > post-processing, it might be generally useful to have > a swap(s1, s2) method on strings that would replace > occurrences of s1 by s2 and vice versa. I would appreciate having that. There are a lot of small jobs where str.translate and re are overkill, but the s.replace(s1, TEMPCHAR); s.replace(s2, s1); s.replace(TEMPCHAR, s2) dance is awkward, since you're not sure what you can safely use as a tempchar. -- Carl Johnson From python at rcn.com Tue Mar 17 10:50:28 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Mar 2009 02:50:28 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> Message-ID: <15F41E9FF37D47E98CBCD67418787D49@RaymondLaptop1> [Mark] > One question from the PEP, which I've been too slow to read until > this morning: should commas appear in the zero-filled part of a > number? 
I think it should. That lets all the commas and decimals line up vertically. Anything else would look weird.

>>> for n in seq:
...     print format(n, "09,d")
...
1,234,567
0,000,001
0,255,989

Raymond From denis.spir at free.fr Tue Mar 17 11:28:02 2009 From: denis.spir at free.fr (spir) Date: Tue, 17 Mar 2009 11:28:02 +0100 Subject: [Python-ideas] PEP links in docs In-Reply-To: <20090316235444.GB26292@panix.com> References: <671579EA9C314A70AE166A0173289355@RaymondLaptop1> <20090316235444.GB26292@panix.com> Message-ID: <20090317112802.4ef95767@o> Le Mon, 16 Mar 2009 16:54:44 -0700, Aahz s'exprima ainsi: > On Mon, Mar 16, 2009, Raymond Hettinger wrote: > >attribution for Georg Brandl deleted: > >> > >> Feel free to send patches, as small as they may seem. Mark up PEP > >> numbers in reST like this -- :pep:`42` -- to get automatic linking. > > > > We might want to include a PEP index in the documentation > > but I think it's a really bad idea to include links from within > > the docs. The PEPs get out of date quickly. They document > > an early decision but not its ultimate evolution and diffusion > > throughout the language. > > There are two reasons to link to PEPs: > > * Provide the historical context Agree with Raymond. It should be made clear along with the pointer that the pointed PEP could be outdated. Maybe simply writing (original PEP: :pep:`42`) is enough? The word 'original' implicitly stating that things could have changed? > * Give detailed info lacking in the docs > > The first purpose will always exist, and I see no reason to delete such a link if documented as such. Links for the latter purpose can be removed when the docs are updated to match what's available in the PEP. To me the most important aspect is not about having details (still, it's important). Instead it is to get an (even partial or obscure) answer to "why", which is often simply missing from the official docs. This answer is necessary to interpret the "what" and/or "how". 
Nobody can properly understand a feature description without knowing its purpose. In the best case, the reader will guess it; in the worst, he will guess wrong. Explicit is better than... also for docs! There is also a pedagogical aspect that I find even more relevant. An inexperienced programmer or pythonist should at least be able to figure out vaguely what this or that feature is about. No doubt this is very difficult to achieve -- especially for experts! I imagine a 2-stage "why" introduction to every feature description in the docs: the first one targeted to non-specialists, the second one more technical. (I'm sure that the first one would also sometimes help specialists.) The pedagogical one must be worded by, or reviewed by, or written in collaboration with, "tutors" that are able to imagine where/how/why newbies may get stuck. It need not be long -- sometimes a few words may be enough. (But it will sometimes be the hardest part ;-). It would also benefit from newbie feedback. I wonder whether this task could be partially delegated to the python-tutor mailing list activists. Denis PS: If ever, I volunteer to take part in this kind of task -- for the French version. [Have been technical trainer (in automation) in a previous life.] 
------ la vita e estrany From greg.ewing at canterbury.ac.nz Tue Mar 17 12:04:34 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 17 Mar 2009 23:04:34 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> Message-ID: <49BF83C2.9090505@canterbury.ac.nz> Mark Dickinson wrote: > The PEP specifies that format(1234, "08,d") > should give '0001,234', but that's something of a special case: > ',001,234' isn't really a viable alternative. Both of those look equally unviable to me. I don't think I'd ever use zero filling together with commas myself, as it looks decidedly weird, but if I had to pick a meaning for format(1234, "08,d") I think I would make it ' 001,234' the reasoning being that since a comma falls on the first position of an 8-char field, you can never put a digit there, and putting a comma at the beginning is no use. If there are more than 6 digits, then you get a comma plus an extra digit, making the field overflow to 9 characters, e.g. 
format(1234567, "08,d") gives '1,234,567' -- Greg From eric at trueblade.com Tue Mar 17 12:15:13 2009 From: eric at trueblade.com (Eric Smith) Date: Tue, 17 Mar 2009 07:15:13 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> Message-ID: <49BF8641.5080609@trueblade.com> Mark Dickinson wrote: > On Tue, Mar 17, 2009 at 3:14 AM, Guido van Rossum wrote: >> On Mon, Mar 16, 2009 at 7:33 PM, Raymond Hettinger wrote: >>> Mark PEP 378 as accepted with Nick's original comma-only version? >> OK, done. Looking forward to a swift implementation! > > I'll implement this for Decimal; it shouldn't take long. > > One question from the PEP, which I've been too slow to read until > this morning: should commas appear in the zero-filled part of a > number? That is, should format(1234, "09,d") give '00001,234' > or '0,001,234'? The PEP specifies that format(1234, "08,d") > should give '0001,234', but that's something of a special case: > ',001,234' isn't really a viable alternative. Hmm. No good answers here. I'd vote for not putting the commas in the leading zeros. I don't think anyone would ever actually use this combination, and putting commas there complicates things due to the special case with the first digit. Plus, they're not inserted by the 'n' formatter, and no one has complained (which might mean no one's using it, of course). 
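For what it's worth, the implementation that eventually landed settled this zero-fill question: commas are inserted into the zero padding, and a would-be leading comma is avoided by widening the result with one extra zero. A sketch of the behavior as shipped, checked against current CPython:

```python
# Zero filling combined with ',' as eventually implemented:
# commas appear inside the padding, and a leading comma is replaced
# by an extra zero, widening the result beyond the width if needed.
print(format(1234, "09,d"))     # fits the width of 9 exactly
print(format(1234, "08,d"))     # widened from 8 to 9 characters
print(format(1234567, "08,d"))  # overflows the width anyway
```

So the commas-line-up-vertically position won out, with a width overflow handling the leading-comma corner case.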
In 2.6:

>>> import locale
>>> locale.setlocale(locale.LC_ALL, 'en_US.UTF8')
'en_US.UTF8'
>>> format(12345, '010n')
'000012,345'
>>> format(12345, '09n')
'00012,345'
>>> format(12345, '08n')
'0012,345'
>>> format(12345, '07n')
'012,345'
>>> format(12345, '06n')
'12,345'
>>>

From hwpuschm at yahoo.de Tue Mar 17 13:23:23 2009 From: hwpuschm at yahoo.de (hwpuschm at yahoo.de) Date: Tue, 17 Mar 2009 12:23:23 +0000 (GMT) Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) Message-ID: <339525.47450.qm@web25805.mail.ukl.yahoo.com> Thank you very much for correctly remarking that the "definition" I formulated contradicts the examples I gave and is therefore utterly inadequate:

> It is proposed to introduce a Keyword "same",
> to be used in the right hand side of assignments, as
> follows:
>
> "xx = same + 5" or "...DELETED..." synonymous with "xx += 5"
> "value = 2*same + 5" synonymous with "value *= 2; value += 5"
> "switch = 1 - same" synonymous with "switch *= -1; switch += 1"
> "lst = same + [5,6]" synonymous with "lst += [5,6]"
>
> "lst = [5,6] + same" synonymous with "...DELETED..."
> "lst[2] = 1/same" synonymous with "lst[2] **= -1"
>
> and so on.

What I would like is to extend the augmented assignment and make it easy to understand for naive readers. I hope the following literary definition is consistent enough to convey the correct meaning: "whenever it is possible, modify the target IN PLACE according to the right hand side expression. If it is not possible to do such a thing, substitute the target object with an object that is built according to the right hand side expression and subsequently deleted" The following examples should be correct:

"xx = same + 5" synonymous with "xx += 5"
"value = 2*same + 5" synonymous with "value *= 2; value += 5"
"switch = 1 - same" synonymous with "switch *= -1; switch += 1"
"lst = same + [5,6]" synonymous with "lst += [5,6]"
"lst[2] = 1/same" synonymous with 
"lst[2] **= -1"

The following examples would be extensions:

"lst = [5,6] + same" synonymous with "lst.reverse(); lst.extend([6,5]); lst.reverse()"
"inmutable = same*(same+1)" synonymous with "unused = inmutable + 1; inmutable *= unused; del unused"

There seems to be no really simple expression for the above extensions, and I take that as an indication that the proposed feature could be quite useful. From andre.roberge at gmail.com Tue Mar 17 13:41:42 2009 From: andre.roberge at gmail.com (Andre Roberge) Date: Tue, 17 Mar 2009 09:41:42 -0300 Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) In-Reply-To: <339525.47450.qm@web25805.mail.ukl.yahoo.com> References: <339525.47450.qm@web25805.mail.ukl.yahoo.com> Message-ID: <7528bcdd0903170541g67d52668kae447602a91660aa@mail.gmail.com> On Tue, Mar 17, 2009 at 9:23 AM, wrote: > > Thank you very much for correctly remarking that the "definition" I > formulated contradicts the examples I gave and is therefore utterly > inadequate: > > > It is proposed to introduce a Keyword "same", > > to be used in the right hand side of assignments, as > > follows: I once wrote a blog post on how an expression like "N = N + 1" was confusing to beginners, so I'm sympathetic with the underlying idea. (http://aroberge.blogspot.com/2005/05/n-n-1.html Note that there are much better explanations for naming objects in Python than this old post I wrote.) However, I am -1 on this proposal. IMO, it decreases readability and achieves very little in terms of clearing up the confusion. Quick test: which is the easier to read and get right?

n = same + 1
n = sane + 1
n = n + 1

André

> > > > > "xx = same + 5" or "...DELETED..." 
synonymous with "xx += 5" > > "value = 2*same + 5" synonymous with "value =*2; > > value +=5" > > "switch = 1 - same" synonymous with "switch *-1; > > switch +=1" > > "lst = same + [5,6]" synonymous with "lst += [5,6]" > > > > "lst = [5,6] + same" synonymous with "...DELETED..." > > "lst[2] = 1/same" synonymous with "lst[2] **=-1" > > > > and so on. > > What I would like is to extend the augmented assignment > and make it easy to understand for naive readers. > I hope the following literary definition > is consistent enough to convey the correct meaning: > "whenever it is possible, modify the target IN PLACE > according to the right hand side expression. > If it is not possible to do such a thing, > substitute the target object with > an object that is build according to the right hand side expression > and subsequently deleted" > > The following examples should be correct: > "xx = same + 5" synonymous with "xx += 5" > "value = 2*same + 5" synonymous with "value =*2; value +=5" > "switch = 1 - same" synonymous with "switch *-1; switch +=1" > "lst = same + [5,6]" synonymous with "lst += [5,6]" > "lst[2] = 1/same" synonymous with "lst[2] **=-1" > The following examples would be extensions: > "lst = [5,6] + same" synonymous with > "lst.reverse(); lst.extend([6,5]); lst.reverse()" > "inmutable = same*(same+1)" synonymous with > "unused=inmutable+1; inmutable*=unused; del unused" > > There seems to be no really simple expression for the above extensions, > and I take that as an indication > that the proposed feature could be quite useful. > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From dickinsm at gmail.com  Tue Mar 17 13:57:26 2009
From: dickinsm at gmail.com (Mark Dickinson)
Date: Tue, 17 Mar 2009 12:57:26 +0000
Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev)
In-Reply-To: <49BF8641.5080609@trueblade.com>
References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1>
	<385A4935485649C38210DC189A08C9BC@RaymondLaptop1>
	<5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1>
	<5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com>
	<49BF8641.5080609@trueblade.com>
Message-ID: <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com>

On Tue, Mar 17, 2009 at 11:15 AM, Eric Smith wrote:
> Hmm. No good answers here. I'd vote for not putting the commas in the
> leading zeros. I don't think anyone would ever actually use this
> combination, and putting commas there complicates things due to the special
> case with the first digit.
>
> Plus, they're not inserted by the 'n' formatter, and no one has complained
> (which might mean no one's using it, of course).

But they *are* inserted by locale.format, and presumably no-one has
complained about that either. :-)

>>> locale.format('%014f', 123.456, grouping=1)
'0,000,123.456000'

It appears that locale.format adds the thousand separators after the
fact, so the issue with the leading comma doesn't come up. That also
means that the relationship between the field width (14 in this case)
and the string length (16) is somewhat obscured.
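[The "after the fact" insertion described above can be sketched in a few
lines of plain Python. This is only an illustration of the mechanism, not
locale's actual implementation; `group` is a made-up helper name.]

```python
def group(digits, sep=',', size=3):
    """Insert sep every `size` digits, counting from the right."""
    out = []
    for i, ch in enumerate(reversed(digits)):
        if i and i % size == 0:
            out.append(sep)
        out.append(ch)
    return ''.join(reversed(out))

# Zero-pad first with plain %-formatting, then split off the integer
# part and regroup it -- the separators are added "after the fact",
# so the final string grows past the requested width of 14.
padded = '%014f' % 123.456          # '0000123.456000', width 14
intpart, frac = padded.split('.')
result = group(intpart) + '.' + frac
```

Running this reproduces the 16-character `'0,000,123.456000'` shown above,
which is exactly why the field width and the final string length diverge.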
Mark From dickinsm at gmail.com Tue Mar 17 14:18:42 2009 From: dickinsm at gmail.com (Mark Dickinson) Date: Tue, 17 Mar 2009 13:18:42 +0000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BF83C2.9090505@canterbury.ac.nz> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF83C2.9090505@canterbury.ac.nz> Message-ID: <5c6f2a5d0903170618w758d994dyb9254b1bcc3a9660@mail.gmail.com> On Tue, Mar 17, 2009 at 11:04 AM, Greg Ewing wrote: > [...] as it looks decidedly weird, but if I had > to pick a meaning for format(1234, "08,d") I think > I would make it > > ?' 001,234' Yes, that looks better than either of the alternatives I gave. I think I prefer that commas *do* appear in the zero padding, though as Eric says, it does add some extra complication to the code. In the case of the decimal code that complication is significant, mainly because of the need to figure out how much space is available for the zeros *before* doing the comma insertion. Mark From eric at trueblade.com Tue Mar 17 14:24:04 2009 From: eric at trueblade.com (Eric Smith) Date: Tue, 17 Mar 2009 09:24:04 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> Message-ID: <49BFA474.9040308@trueblade.com> Mark Dickinson wrote: > On Tue, Mar 17, 2009 at 11:15 AM, Eric Smith wrote: >> Hmm. 
No good answers here. I'd vote for not putting the commas in the >> leading zeros. I don't think anyone would ever actually use this >> combination, and putting commas there complicates things due to the special >> case with the first digit. >> >> Plus, they're not inserted by the 'n' formatter, and no one has complained >> (which might mean no one's using it, of course). > > But they *are* inserted by locale.format, and > presumably no-one has complained about that either. :-) > >>>> format('%014f', 123.456, grouping=1) > '0,000,123.456000' > > It appears that locale.format adds the thousand separators after > the fact, so the issue with the leading comma doesn't come up. > That also means that the relationship between the field width (14 > in this case) and the string length (16) is somewhat obscured. Ick. Presumably you specified a width because that's how wide you wanted the output to be! I still like leaving the commas out of leading zeros. From jh at improva.dk Tue Mar 17 14:21:46 2009 From: jh at improva.dk (Jacob Holm) Date: Tue, 17 Mar 2009 14:21:46 +0100 Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) In-Reply-To: <7528bcdd0903170541g67d52668kae447602a91660aa@mail.gmail.com> References: <339525.47450.qm@web25805.mail.ukl.yahoo.com> <7528bcdd0903170541g67d52668kae447602a91660aa@mail.gmail.com> Message-ID: <49BFA3EA.70108@improva.dk> Andre Roberge wrote: > > > On Tue, Mar 17, 2009 at 9:23 AM, > wrote: > > > Thank you very much for correctly remarking that the "definition" > I formulated contradicts the examples I gave and is therefore > utterly inadecuate: > > > It is proposed to introduce a Keyword "same", > > to be used in the right hand side of assignments, as > > follows: > > > > I once wrote a blog post on how an expression like "N = N + 1" was > confusing to beginners, so I'm sympathetic with the underlying idea. 
> (http://aroberge.blogspot.com/2005/05/n-n-1.html Note that there > are much better explanations for naming objects in Python than this > old post I wrote.) > > However, I am -1 on this proposal. IMO, it decreases readability and > achieves very little in terms of clearing up the confusion. > > Quick test: which is the easier to read and get right? > > n = same + 1 > n = sane + 1 > n = n + 1 > I believe that as soon as the left-hand side stops being a simple variable and it is used in non-trivial expressions on the right-hand side, using the keyword would help clarify the intent. What I mean is that the examples you should be looking at are more like: A[n+1] = same*same + 1 B[2*j].foo = frobnicate(same, same+1) ... If you try expanding these into current python with minimal change in semantics you will end up with something like _1 = n+1 _2 = A[_1] A[_1] = _2*_2 + 1 del _1 del _2 _1 = B[2*j] _2 = _1.foo _1.foo = frobnicate(_2, _2+1) del _1 del _2 which is much less readable. I still think that the cost of a new keyword is probably too high a price to pay, but I like the idea. Jacob From guido at python.org Tue Mar 17 15:17:10 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Mar 2009 07:17:10 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BFA474.9040308@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> Message-ID: On Tue, Mar 17, 2009 at 6:24 AM, Eric Smith wrote: > Mark Dickinson wrote: >> >> On Tue, Mar 17, 2009 at 11:15 AM, Eric Smith wrote: >>> >>> Hmm. No good answers here. I'd vote for not putting the commas in the >>> leading zeros. 
?I don't think anyone would ever actually use this >>> combination, and putting commas there complicates things due to the >>> special >>> case with the first digit. >>> >>> Plus, they're not inserted by the 'n' formatter, and no one has >>> complained >>> (which might mean no one's using it, of course). >> >> But they *are* inserted by locale.format, and >> presumably no-one has complained about that either. :-) >> >>>>> format('%014f', 123.456, grouping=1) >> >> '0,000,123.456000' >> >> It appears that locale.format adds the thousand separators after >> the fact, so the issue with the leading comma doesn't come up. >> That also means that the relationship between the field width (14 >> in this case) and the string length (16) is somewhat obscured. > > Ick. Presumably you specified a width because that's how wide you wanted the > output to be! > > I still like leaving the commas out of leading zeros. Ick, the discrepancy between the behavior of locale.format() and PEP 378 is unfortunate. I agree that the given width should include the commas, but I strongly feel that leading zeros should be comma-fied just like everything else. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From eric at trueblade.com Tue Mar 17 15:56:07 2009 From: eric at trueblade.com (Eric Smith) Date: Tue, 17 Mar 2009 10:56:07 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <5c6f2a5d0903170618w758d994dyb9254b1bcc3a9660@mail.gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF83C2.9090505@canterbury.ac.nz> <5c6f2a5d0903170618w758d994dyb9254b1bcc3a9660@mail.gmail.com> Message-ID: <49BFBA07.1080105@trueblade.com> Mark Dickinson wrote: > On Tue, Mar 17, 2009 at 11:04 AM, Greg Ewing > wrote: >> [...] 
as it looks decidedly weird, but if I had >> to pick a meaning for format(1234, "08,d") I think >> I would make it >> >> ' 001,234' > > Yes, that looks better than either of the alternatives I gave. > > I think I prefer that commas *do* appear in the zero padding, though > as Eric says, it does add some extra complication to the code. In > the case of the decimal code that complication is significant, mainly > because of the need to figure out how much space is available > for the zeros *before* doing the comma insertion. If you look at _PyString_InsertThousandsGrouping, you'll see that it gets called twice. Once to compute the size, and once to actually do the inserting. From eric at trueblade.com Tue Mar 17 15:58:17 2009 From: eric at trueblade.com (Eric Smith) Date: Tue, 17 Mar 2009 10:58:17 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> Message-ID: <49BFBA89.3060406@trueblade.com> Guido van Rossum wrote: > Ick, the discrepancy between the behavior of locale.format() and PEP > 378 is unfortunate. I agree that the given width should include the > commas, but I strongly feel that leading zeros should be comma-fied > just like everything else. And what happens when the comma would be the first character? ,012,345 0012,345 or something else? From mishok13 at gmail.com Tue Mar 17 16:00:52 2009 From: mishok13 at gmail.com (Andrii V. 
Mishkovskyi)
Date: Tue, 17 Mar 2009 17:00:52 +0200
Subject: [Python-ideas] dict '+' operator and slicing support for pop
Message-ID: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com>

First of all, this is my first attempt at submitting an idea to
python-ideas. So, here it goes. :)

1. Add ability to use '+' operator for dicts

I often wonder why list and tuple instances have '+' and '+='
operators but dicts don't?
It's not that rare in my code (and code written by others, as it
seems) that i have to write:

a.update(b)
return a

I do understand that adding additional magic method may be
inappropriate for dict, but I think it would be nice addition to a
language. So, my proposal is that:

x = a + b
would become equivalent to
x = dict(a, **b)

a += b
would become equivalent to
a.update(b)

And the example I gave before would be translated to:

return a + b

Note that there is a difference between these two examples in
semantics: the latter one creates a new dict. But that's what the
user doesn't care about in 99% of use-cases.

A very basic implementation in Python:

>>> class Dict(dict):
...     def __add__(self, other):
...         return self.__class__(self, **other)
...     def __iadd__(self, other):
...         self.update(other)
...         return self
...
>>> a = Dict(foo=12, bar=14, baz=16)
>>> b = Dict(spam=13, eggs=17, bacon=19)
>>> a + b
{'bar': 14, 'spam': 13, 'bacon': 19, 'eggs': 17, 'foo': 12, 'baz': 16}
>>> a += b
>>> a
{'bar': 14, 'spam': 13, 'bacon': 19, 'eggs': 17, 'foo': 12, 'baz': 16}
>>> b
{'eggs': 17, 'bacon': 19, 'spam': 13}

Note: if this is ever going to be implemented, then the Mapping ABCs
will have to implement these methods too, which doesn't sound
backwards-compatible to me. :)

2. Ability to use slices in pop

This was discussed earlier (4.5 years ago, actually) in this thread:
http://mail.python.org/pipermail/python-dev/2004-November/049895.html
Even though the original request became a full-blown proposal (PEP-3132)
and was implemented in py3k, the internal discussion about pop allowing
slices as arguments has gone silent. There was some positive feedback
from Python developers and I think I can provide a patch for this
functionality in 2 weeks. Is there still some interest in this? There
is nothing really hard about it; it would be something like this:

>>> class List(list):
...     def pop(self, index_or_slice):
...         ret = self[index_or_slice]
...         del self[index_or_slice]
...         return ret
...
>>> x = List(range(10))
>>> x.pop(slice(1, 4))
[1, 2, 3]
>>> x
[0, 4, 5, 6, 7, 8, 9]
>>> x.pop(5)
8
>>> x
[0, 4, 5, 6, 7, 9]

Note: some people think that pop returning either a list or a single
item depending on what is being passed to pop() is bad or something. I
don't see a problem here, because a simple some_list[index_or_slice]
can also return a list or just one item depending on what type
`index_or_slice` is.

--
Wbr, Andrii V. Mishkovskyi.

He's got a heart of a little child, and he keeps it in a jar on his desk.
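[A small runnable sketch of the merge semantics proposed above, using
plain dicts on a modern Python; the variable names are illustrative only.
Note how the left dict's value for a common key silently disappears from
the result -- the point the later objections in this thread focus on.]

```python
# Semantics of the proposed "a + b", spelled with today's tools:
# build a new dict from a, then let b's entries win for common keys
# (keyword unpacking generally requires b's keys to be strings).
a = {'foo': 12, 'bar': 14, 'baz': 16}
b = {'bar': 99, 'spam': 13}

merged = dict(a, **b)

# Union of the keys, with b's value winning on the collision:
assert merged == {'foo': 12, 'bar': 99, 'baz': 16, 'spam': 13}
assert merged is not a      # '+' would build a new object...
assert a['bar'] == 14       # ...and leave a untouched; a's 14 is
                            # simply absent from the merged result
```

A real operator implementation would not be limited to string keys the
way the `dict(a, **b)` spelling is.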
From dickinsm at gmail.com Tue Mar 17 16:12:20 2009 From: dickinsm at gmail.com (Mark Dickinson) Date: Tue, 17 Mar 2009 15:12:20 +0000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BFBA89.3060406@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> <49BFBA89.3060406@trueblade.com> Message-ID: <5c6f2a5d0903170812o10332a8cw9ef4caa3455e282b@mail.gmail.com> On Tue, Mar 17, 2009 at 2:58 PM, Eric Smith wrote: > And what happens when the comma would be the first character? > > ,012,345 > 0012,345 > > or something else? Options are: (A) ",012,345" (B) "0012,345" (C) " 012,345" (D) "0,012,345" (E) write-in option here I vote for (D): it's one character too large, but the given precision is only supposed to be a minimum anyway. We already end up with a length-9 string when formatting 1234567. (D) is the minimum width string that: doesn't look weird (like (A) and (B)), has length at least 8, and is still in the right basic format (C) would be my second choice, but I find the extra space padding to be somewhat arbitrary (why a space? why not some other padding character?) 
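[Option (D) above amounts to "keep prepending zeros until the grouped
string reaches the minimum width, overshooting rather than leading with a
separator". A hedged sketch of that rule, with made-up helper names, not
the eventual C implementation:]

```python
def group(digits, sep=','):
    """Insert sep every three digits, counting from the right."""
    out = []
    for i, ch in enumerate(reversed(digits)):
        if i and i % 3 == 0:
            out.append(sep)
        out.append(ch)
    return ''.join(reversed(out))

def zero_pad_grouped(n, width):
    """Option (D): prepend zeros until the grouped string is wide enough.

    Because grouping is re-done after each added zero, the result can
    overshoot the requested width instead of leading with a comma.
    """
    digits = str(n)
    while len(group(digits)) < width:
        digits = '0' + digits
    return group(digits)

assert zero_pad_grouped(12345, 8) == '0,012,345'    # the case debated here
assert zero_pad_grouped(1234567, 8) == '1,234,567'  # already wide enough
```

This matches the behaviour PEP 378 ultimately specified: format(1234,
"08,d") gives '0,001,234'.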
Mark

From denis.spir at free.fr  Tue Mar 17 16:36:51 2009
From: denis.spir at free.fr (spir)
Date: Tue, 17 Mar 2009 16:36:51 +0100
Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev)
In-Reply-To: <5c6f2a5d0903170812o10332a8cw9ef4caa3455e282b@mail.gmail.com>
References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1>
	<5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com>
	<49BF8641.5080609@trueblade.com>
	<5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com>
	<49BFA474.9040308@trueblade.com>
	<49BFBA89.3060406@trueblade.com>
	<5c6f2a5d0903170812o10332a8cw9ef4caa3455e282b@mail.gmail.com>
Message-ID: <20090317163651.056a2807@o>

On Tue, 17 Mar 2009 15:12:20 +0000, Mark Dickinson wrote:

> On Tue, Mar 17, 2009 at 2:58 PM, Eric Smith wrote:
> > And what happens when the comma would be the first character?
> >
> > ,012,345
> > 0012,345
> >
> > or something else?
>
> Options are:
>
> (A) ",012,345"
> (B) "0012,345"
> (C) " 012,345"
> (D) "0,012,345"
> (E) write-in option here
>
> I vote for (D): it's one character too large, but the given precision
> is only supposed to be a minimum anyway. We already end up
> with a length-9 string when formatting 1234567.
>
> (D) is the minimum width string that:
> doesn't look weird (like (A) and (B)),
> has length at least 8, and
> is still in the right basic format
>
> (C) would be my second choice, but I find the extra space padding
> to be somewhat arbitrary (why a space? why not some other
> padding character?)

I agree with all the comments above.
* A is ... (censured).
* B does not comply with the user's choice.
* D is the best in theory, but would trouble table-like vertical alignment.
* So only C remains for me.

Also, the issue here comes from user inconsistency: a (total) width of 8
simply cannot fit with group separators every 3 digits (warning?).
At best, there should be some information on this topic to avoid bad surprises, but then the implementation should not care much. > Mark Denis ------ la vita e estrany From dickinsm at gmail.com Tue Mar 17 16:53:38 2009 From: dickinsm at gmail.com (Mark Dickinson) Date: Tue, 17 Mar 2009 15:53:38 +0000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <20090317163651.056a2807@o> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> <49BFBA89.3060406@trueblade.com> <5c6f2a5d0903170812o10332a8cw9ef4caa3455e282b@mail.gmail.com> <20090317163651.056a2807@o> Message-ID: <5c6f2a5d0903170853w5bf40eddu1e0c140829240a2f@mail.gmail.com> On Tue, Mar 17, 2009 at 3:36 PM, spir wrote: > * A is ... (censured). > * B does not comply with user choice. > * D is the best in theory, but would trouble table-like vertical alignment. I don't see why it would: could you elaborate? Mark From george.sakkis at gmail.com Tue Mar 17 17:03:16 2009 From: george.sakkis at gmail.com (George Sakkis) Date: Tue, 17 Mar 2009 12:03:16 -0400 Subject: [Python-ideas] dict '+' operator and slicing support for pop In-Reply-To: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com> References: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com> Message-ID: <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> On Tue, Mar 17, 2009 at 11:00 AM, Andrii V. Mishkovskyi wrote: > 1. Add ability to use '+' operator for dicts > > I often wonder why list and tuple instances have '+' and '+=' > operators but dicts don't? 
> It's not that rare in my code (and code written by others, as it
> seems) that i have to write:
>
> a.update(b)
> return a
>
> I do understand that adding additional magic method may be
> inappropriate for dict, but I think it would be nice addition to a
> language. So, my proposal is that:
>
> x = a + b
> would become equivalent to
> x = dict(a, **b)
>
> a += b
> would become equivalent to
> a.update(b)

That's one way to define dict addition but it's not the only, or even,
the best one. It's hard to put in words exactly why but I expect "a+b"
to take into account the full state of the operands, not just a part
of it. In your proposal the values of the first dict for the common
keys are effectively ignored, which doesn't seem to me as a good fit
for an additive operation. I would find at least as reasonable and
intuitive the following definition that doesn't leak information:

def sum_dicts(*dicts):
    from collections import defaultdict
    s = defaultdict(list)
    for d in dicts:
        for k,v in d.iteritems():
            s[k].append(v)
    return s

>>> d1 = {'a':2,'b':5}
>>> d2 = {'a':2,'c':6,'z':3}
>>> d3 = {'b':2,'c':5}
>>> sum_dicts(d1,d2,d3)
defaultdict(<type 'list'>, {'a': [2, 2], 'c': [6, 5], 'b': [5, 2], 'z': [3]})

George

From guido at python.org  Tue Mar 17 17:36:10 2009
From: guido at python.org (Guido van Rossum)
Date: Tue, 17 Mar 2009 09:36:10 -0700
Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev)
In-Reply-To: <49BFA474.9040308@trueblade.com>
References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1>
	<5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1>
	<5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com>
	<49BF8641.5080609@trueblade.com>
	<5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com>
	<49BFA474.9040308@trueblade.com>
Message-ID: 

On Tue, Mar 17, 2009 at 8:12 AM, Mark Dickinson wrote:
> On Tue, Mar 17, 2009 at 2:58 PM, Eric
Smith wrote: >> And what happens when the comma would be the first character? >> >> ,012,345 >> 0012,345 >> >> or something else? > > Options are: > > (A) ",012,345" > (B) "0012,345" Neither (A) nor (B) is acceptable. > (C) " 012,345" > (D) "0,012,345" > (E) write-in option here > > I vote for (D): ?it's one character too large, but the given precision > is only supposed to be a minimum anyway. ?We already end up > with a length-9 string when formatting 1234567. > > (D) is the minimum width string that: > ?doesn't look weird (like (A) and (B)), > ?has length at least 8, and > ?is still in the right basic format > > (C) would be my second choice, but I find the extra space padding > to be somewhat arbitrary (why a space? why not some other > padding character?) It's tough to choose between (C) and (D). I guess we'll have to look at use cases for leading zeros. I can think of two use cases for leading zeros are: (1) To avoid font-width issues -- many variable-width fonts are designed so that all digits have the same width, but their (default) space is much narrower. (2) To avoid fraud when printing certain documents -- it's easier to insert a '1' in front of a small number than to change a '0' into something else. Since both use cases are trying to avoid spaces, I think (D) is the winner here. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From josiah.carlson at gmail.com Tue Mar 17 18:36:19 2009 From: josiah.carlson at gmail.com (Josiah Carlson) Date: Tue, 17 Mar 2009 10:36:19 -0700 Subject: [Python-ideas] dict '+' operator and slicing support for pop In-Reply-To: <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> References: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com> <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> Message-ID: On Tue, Mar 17, 2009 at 9:03 AM, George Sakkis wrote: > On Tue, Mar 17, 2009 at 11:00 AM, Andrii V. Mishkovskyi > wrote: > >> 1. 
Add ability to use '+' operator for dicts >> >> I often wonder why list and tuple instances have '+' and '+=' >> operators but dicts don't? >> It's not that rare in my code (and code written by others, as it >> seems) that i have to write: >> >> a.update(b) >> return a >> >> I do understand that adding additional magic method may be >> inappropriate for dict, but I think it would be nice addition to a >> language. So, my proposal is that: >> >> x = a + b >> would become equivalent to >> x = dict(a, **b) >> >> a += b >> would become equivalent to >> a.update(b) > > That's one way to define dict addition but it's not the only, or even, > the best one. It's hard to put in words exactly why but I expect "a+b" > to take into account the full state of the operands, not just a part > of it. In your proposal the values of the first dict for the common > keys are effectively ignored, which doesn't seem to me as a good fit > for an additive operation. I would find at least as reasonable and > intuitive the following definition that doesn't leak information: > > def sum_dicts(*dicts): > ? ?from collections import defaultdict > ? ?s = defaultdict(list) > ? ?for d in dicts: > ? ? ? ?for k,v in d.iteritems(): > ? ? ? ? ? ?s[k].append(v) > ? ?return s > >>>> d1 = {'a':2,'b':5} >>>> d2 = {'a':2,'c':6,'z':3} >>>> d3 = {'b':2,'c':5} >>>> sum_dicts(d1,d2,d3) > defaultdict(, {'a': [2, 2], 'c': [6, 5], 'b': [5, 2], 'z': [3]}) Both of the ideas suffer from "+ is no longer commutative", which sort-of bothers me. I say sort-of, because I would actually prefer Andrii's semantics over yours, and if you prefer the elements from b, you use 'a + b', but if you prefer the elements from a, you use 'b + a'. Then again, I'm tending towards a -.75 on the entire idea; despite it being convenient, I can see non-comutativity as being confusing. As for the list slice popping...I'm tending towards a -1. 
While I can see the convenience in some cases, I'm just not sure it's
compelling enough (especially because you need to generate the slice in
advance of using it). As stated in the past... not all 2-line functions
need to be built-in or syntax. I don't believe either of these is able
to pass the "is it necessary as part of a compelling use-case?" question.

 - Josiah

From guido at python.org  Tue Mar 17 18:39:38 2009
From: guido at python.org (Guido van Rossum)
Date: Tue, 17 Mar 2009 10:39:38 -0700
Subject: [Python-ideas] dict '+' operator and slicing support for pop
In-Reply-To: <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com>
References: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com>
	<91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com>
Message-ID: 

Because there are so many different ways to think about this, it's
better not to guess and force the user to be explicit.

On Tue, Mar 17, 2009 at 9:03 AM, George Sakkis wrote:
> On Tue, Mar 17, 2009 at 11:00 AM, Andrii V. Mishkovskyi
> wrote:
>
>> 1. Add ability to use '+' operator for dicts
>>
>> I often wonder why list and tuple instances have '+' and '+='
>> operators but dicts don't?
>> It's not that rare in my code (and code written by others, as it
>> seems) that i have to write:
>>
>> a.update(b)
>> return a
>>
>> I do understand that adding additional magic method may be
>> inappropriate for dict, but I think it would be nice addition to a
>> language. So, my proposal is that:
>>
>> x = a + b
>> would become equivalent to
>> x = dict(a, **b)
>>
>> a += b
>> would become equivalent to
>> a.update(b)
>
> That's one way to define dict addition but it's not the only, or even,
> the best one. It's hard to put in words exactly why but I expect "a+b"
> to take into account the full state of the operands, not just a part
> of it. In your proposal the values of the first dict for the common
> keys are effectively ignored, which doesn't seem to me as a good fit
> for an additive operation.
I would find at least as reasonable and > intuitive the following definition that doesn't leak information: > > def sum_dicts(*dicts): > ? ?from collections import defaultdict > ? ?s = defaultdict(list) > ? ?for d in dicts: > ? ? ? ?for k,v in d.iteritems(): > ? ? ? ? ? ?s[k].append(v) > ? ?return s > >>>> d1 = {'a':2,'b':5} >>>> d2 = {'a':2,'c':6,'z':3} >>>> d3 = {'b':2,'c':5} >>>> sum_dicts(d1,d2,d3) > defaultdict(, {'a': [2, 2], 'c': [6, 5], 'b': [5, 2], 'z': [3]}) > > George > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From qrczak at knm.org.pl Tue Mar 17 18:55:53 2009 From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk) Date: Tue, 17 Mar 2009 18:55:53 +0100 Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) In-Reply-To: <339525.47450.qm@web25805.mail.ukl.yahoo.com> References: <339525.47450.qm@web25805.mail.ukl.yahoo.com> Message-ID: <3f4107910903171055s33384a02jb32a843ab0ff6d1c@mail.gmail.com> On Tue, Mar 17, 2009 at 13:23, wrote: > ? "lst = [5,6] + same" synonymous with > ? ? ? "lst.reverse(); lst.extend([6,5]); lst.reverse()" What about: lst = x + same ? -- Marcin Kowalczyk qrczak at knm.org.pl http://qrnik.knm.org.pl/~qrczak/ From python at zesty.ca Tue Mar 17 18:58:47 2009 From: python at zesty.ca (Ka-Ping Yee) Date: Tue, 17 Mar 2009 10:58:47 -0700 (PDT) Subject: [Python-ideas] dict '+' operator and slicing support for pop In-Reply-To: References: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com> <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> Message-ID: On Tue, 17 Mar 2009, Josiah Carlson wrote: > On Tue, Mar 17, 2009 at 9:03 AM, George Sakkis wrote: >> On Tue, Mar 17, 2009 at 11:00 AM, Andrii V. Mishkovskyi >> wrote: >> >>> 1. 
Add ability to use '+' operator for dicts
>>>
>
> Both of the ideas suffer from "+ is no longer commutative", which
> sort-of bothers me.

I don't find that a convincing argument, since + is not commutative
for lists or tuples either. Andrii's original proposal is the most
natural interpretation -- notice that if x and y are dicts:

    dict(x.items()) gives x

    dict(x.items() + y.items()) gives x + y

That looks perfectly consistent to me.

George's counter-proposal doesn't make sense to me at all -- it
messes up the types of all the values in the dict.
?And it's > inconsistent with the built-in behaviour of + with other types: > it doesn't add lists element-by-element, so it shouldn't add > dicts element-by-element either. Not to put words into people's mouths, but it seems like the concern was really less over the non-commutativity and move over the fact that data from the first dict gets silently clobbered by the second dict; whereas in the list, tuple, and string cases, no data is ever lost in the process. Cheers, Chris -- I have a blog: http://blog.rebertia.com From python at rcn.com Tue Mar 17 19:17:02 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Mar 2009 11:17:02 -0700 Subject: [Python-ideas] dict '+' operator and slicing support for pop References: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com><91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> Message-ID: >> a.update(b) >> return a Why take two short, simple lines with unequivocal meaning and then abbreviate them with something mysterious (or at least something with multiple possible interpretations)? Mappings exist in many languages now. Can you point to another language that has found it worthwhile to have both an update() method and an addition operator? Also, consider that dicts are one of our most basic APIs and many other objects model that API. It behooves us to keep that API as simple and thin as possible. IMO, this change would be gratuituous. None of the code presented so far is significantly improved. Essentially, we're looking at a trivial abbreviation, not an actual offering of new capabilities. -1 all the way around. 
Raymond From george.sakkis at gmail.com Tue Mar 17 19:26:44 2009 From: george.sakkis at gmail.com (George Sakkis) Date: Tue, 17 Mar 2009 14:26:44 -0400 Subject: [Python-ideas] dict '+' operator and slicing support for pop In-Reply-To: <50697b2c0903171114n1a721492q9a62bcac07b7c0e@mail.gmail.com> References: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com> <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> <50697b2c0903171114n1a721492q9a62bcac07b7c0e@mail.gmail.com> Message-ID: <91ad5bf80903171126t3c427d31p89338f7b6ac172f8@mail.gmail.com> On Tue, Mar 17, 2009 at 2:14 PM, Chris Rebert wrote: > On Tue, Mar 17, 2009 at 10:58 AM, Ka-Ping Yee wrote: >> On Tue, 17 Mar 2009, Josiah Carlson wrote: >> >>> On Tue, Mar 17, 2009 at 9:03 AM, George Sakkis >>> wrote: >>>> >>>> On Tue, Mar 17, 2009 at 11:00 AM, Andrii V. Mishkovskyi >>>> wrote: >>>> >>>>> 1. Add ability to use '+' operator for dicts >>>>> >>> >>> Both of the ideas suffer from "+ is no longer commutative", which >>> sort-of bothers me. >> >> I don't find that a convincing argument, since + is not commutative >> for lists or tuples either. Andrii's original proposal is the most >> natural interpretation -- notice that if x and y are dicts: >> >> dict(x.items()) gives x >> >> dict(x.items() + y.items()) gives x + y >> >> That looks perfectly consistent to me. >> >> George's counter-proposal doesn't make sense to me at all -- it >> messes up the types of all the values in the dict. And it's >> inconsistent with the built-in behaviour of + with other types: >> it doesn't add lists element-by-element, so it shouldn't add >> dicts element-by-element either. > > Not to put words into people's mouths, but it seems like the concern > was really less over the non-commutativity and more over the fact that > data from the first dict gets silently clobbered by the second dict; > whereas in the list, tuple, and string cases, no data is ever lost in > the process.
Just to be clear, I'm between -0.5 and -1 to the whole idea; my counter-proposal was simply meant to point out the potential ambiguity in semantics and the fact that the original proposal silently loses data. George From greg.ewing at canterbury.ac.nz Tue Mar 17 21:39:55 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 18 Mar 2009 08:39:55 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> Message-ID: <49C00A9B.4080509@canterbury.ac.nz> Mark Dickinson wrote: >>>>format('%014f', 123.456, grouping=1) > > '0,000,123.456000' > > That also means that the relationship between the field width (14 > in this case) and the string length (16) is somewhat obscured. I'd consider that part a bug that we shouldn't imitate. The field width should always be what you say it is, unless the value is too big to fit. -- Greg From python at rcn.com Tue Mar 17 21:40:51 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Mar 2009 13:40:51 -0700 Subject: [Python-ideas] Customizing format() Message-ID: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> I've been exploring how to customize our thousands separators and decimal separators and wanted to offer-up an idea. It arose when I was looking at Java's DecimalFormat class and its customization tool DecimalFormatSymbols http://java.sun.com/javase/6/docs/api/java/text/DecimalFormat.html . Also, I looked at how regular expression patterns provide options to change the meaning of its special characters using (?iLmsux). I. 
Simplest version -- Translation pairs format(1234, "8,.1f") --> ' 1,234.0' format(1234, "(,_)8,.1f") --> ' 1_234.0' format(1234, "(,_)(.,)8,.1f") --> ' 1_234,0' This approach is very easy to implement and it doesn't make life difficult for the parser which can continue to look for just a comma and period with their standardized meaning. It also fits nicely in our current framework and doesn't require any changes to the format() builtin. Of all the options, I find this one to be the easiest to read. Also, this version makes it easy to employ a couple of techniques to factor-out formatting decisions. Here's a gettext() style approach. def _(s): return '(,.)(.,)' + s . . . format(x, _('8.1f')) Here's another approach using implicit string concatenation: DEB = '(,_)' # style for debugging EXT = '(, )' # style for external display . . . format(x, DEB '8.1f') format(y, EXT '8d') There are probably many ways to factor-out the decision. We don't need to decide which is best, we just need to make it possible. One other thought, this approach makes it possible to customize all of the characters that are currently hardwired (including zero and space padding characters and the 'E' or 'e' exponent symbols). II. Javaesque version -- FormatSymbols object This is essentially the same idea as previous one but involves modifying the format() builtin to accept a symbols object and pass it to __format__ methods. This moves the work outside of the format string itself: DEB = FormatSymbols(comma='_') EXT = FormatSymbols(comma=' ') . . . format(x, '8.1f', DEB) format(y, '8d', EXT) The advantage is that this technique is easily extendable beyond simple symbol translations and could possibly allow specification of grouping sizes in hundreds and whatnot. It also looks more like a real program as opposed to a formatting mini-language. The disadvantage is that it is likely slower and it requires mucking with the currently dirt simple format() / __format__() protocol. 
It may also be harder to integrate with existing __format__ methods which are currently very string oriented. Raymond From greg.ewing at canterbury.ac.nz Tue Mar 17 21:52:20 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 18 Mar 2009 08:52:20 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> Message-ID: <49C00D84.6020109@canterbury.ac.nz> Guido van Rossum wrote: > I agree that the given width should include the > commas, but I strongly feel that leading zeros should be comma-fied > just like everything else. I think we need some use cases before a proper decision can be made about this. If you were using comma-separated zero-filled numbers, what would your objective be, and what choice would best fulfill it? -- Greg From python at rcn.com Tue Mar 17 22:00:15 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Mar 2009 14:00:15 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1><5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1><5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com><49BF8641.5080609@trueblade.com><5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com><49BFA474.9040308@trueblade.com> <49C00D84.6020109@canterbury.ac.nz> Message-ID: <8C68FC2427504B9E90D340FD2664385B@RaymondLaptop1> [Guido van Rossum] >> I agree that the given width should include the >> commas, but I strongly feel that leading zeros should be comma-fied >> just like everything else. 
+1 [Greg Ewing] > I think we need some use cases before a proper > decision can be made about this. If you were using > comma-separated zero-filled numbers, what would > your objective be, and what choice would best > fulfill it? I gave one example of writing out numbers in columns and that makes it clear that putting commas in the leading zeros is the right thing to do (anything else looks unusably weird). Also, as Guido pointed-out, anyone specifying zero-padding is saying that they intend to not be showing spaces where digits would go. Our choice ought to respect that intention. Raymond From greg.ewing at canterbury.ac.nz Tue Mar 17 22:03:54 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 18 Mar 2009 09:03:54 +1200 Subject: [Python-ideas] dict '+' operator and slicing support for pop In-Reply-To: <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> References: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com> <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> Message-ID: <49C0103A.1070308@canterbury.ac.nz> George Sakkis wrote: > It's hard to put in words exactly why but I expect "a+b" > to take into account the full state of the operands, not just a part > of it. I think one expects a + operator to be somehow symmetrical with respect to its operands. The lopsidedness of dict updating violates this expectation, and so is better represented by an asymmetrical syntax. 
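A minimal illustration of that lopsidedness, using the items()-concatenation reading of '+' proposed earlier in the thread (the list() wrappers are mine, added so the sketch also runs on Python 3, where items() returns a view):

```python
x = {'a': 1}
y = {'a': 2}

# Under the items()-concatenation reading of "+", the right operand
# silently wins on key collisions, so x + y != y + x whenever keys overlap.
merged_xy = dict(list(x.items()) + list(y.items()))
merged_yx = dict(list(y.items()) + list(x.items()))

print(merged_xy)  # {'a': 2}
print(merged_yx)  # {'a': 1}
```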
-- Greg From eric at trueblade.com Tue Mar 17 22:13:14 2009 From: eric at trueblade.com (Eric Smith) Date: Tue, 17 Mar 2009 17:13:14 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49BF8641.5080609@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <0E7352EEAFFB4EAEAE5B0121F95B050D@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> Message-ID: <49C0126A.3060405@trueblade.com> Eric Smith wrote: > Mark Dickinson wrote: >> One question from the PEP, which I've been too slow to read until >> this morning: should commas appear in the zero-filled part of a >> number? That is, should format(1234, "09,d") give '00001,234' >> or '0,001,234'? The PEP specifies that format(1234, "08,d") >> should give '0001,234', but that's something of a special case: >> ',001,234' isn't really a viable alternative. > > Hmm. No good answers here. I'd vote for not putting the commas in the > leading zeros. I don't think anyone would ever actually use this > combination, and putting commas there complicates things due to the > special case with the first digit. > > Plus, they're not inserted by the 'n' formatter, and no one has > complained (which might mean no one's using it, of course). > > In 2.6: > > >>> import locale > >>> locale.setlocale(locale.LC_ALL, 'en_US.UTF8') > 'en_US.UTF8' > >>> format(12345, '010n') > '000012,345' > >>> format(12345, '09n') > '00012,345' > >>> format(12345, '08n') > '0012,345' > >>> format(12345, '07n') > '012,345' > >>> format(12345, '06n') > '12,345' I think this is a bug that should be fixed in the same way we implement it for PEP 378. It's more complex for 'n', because you might have funny groupings (like every 3, then 2).
But I hope our solution for PEP 378 will generalize to this case, too. Eric. From greg.ewing at canterbury.ac.nz Tue Mar 17 22:41:11 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 18 Mar 2009 09:41:11 +1200 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> <49BFBA89.3060406@trueblade.com> <5c6f2a5d0903170812o10332a8cw9ef4caa3455e282b@mail.gmail.com> Message-ID: <49C018F7.8080201@canterbury.ac.nz> Guido van Rossum wrote: > (1) To avoid font-width issues -- many > variable-width fonts are designed so that all digits have the same > width, but their (default) space is much narrower. That's a good point. This alone doesn't necessarily rule out (A), though. It could be considered a case of user stupidity if they specify a field width that results in a comma at the beginning and don't like the result. It doesn't necessarily rule out (C) either, since there will always be a space at the beginning unless the value overflows, and then all your alignment guarantees are blown away anyhow. > (2) To avoid fraud > when printing certain documents -- it's easier to insert a '1' in > front of a small number than to change a '0' into something else. However it's easy to add a '1' before a string of leading zeroes if there's a sliver of space available, so it's better still to fill with some other character such as '*'. You need a cooperative font for that to work.
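For illustration, here is how the two fill characters compare on a Python that implements the ',' option under discussion (PEP 378 landed in 2.7/3.1; the zero-fill output shown assumes the comma-fied-zeros behaviour Guido favours, which is what was eventually implemented):

```python
amount = 1234.5

# '0' fill: the leading zeros are comma-fied, leaving digit-shaped
# characters that a fraudster could overwrite
print(format(amount, '012,.2f'))   # '0,001,234.50'

# '*' fill: check-protection style, nothing digit-shaped to alter
print(format(amount, '*>12,.2f'))  # '****1,234.50'
```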
-- Greg From tjreedy at udel.edu Tue Mar 17 22:43:10 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 17 Mar 2009 17:43:10 -0400 Subject: [Python-ideas] Customizing format() In-Reply-To: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> Message-ID: Raymond Hettinger wrote: > I've been exploring how to customize our thousands separators and decimal > separators and wanted to offer-up an idea. It arose when I was looking > at Java's DecimalFormat class and its customization tool > DecimalFormatSymbols > http://java.sun.com/javase/6/docs/api/java/text/DecimalFormat.html . > Also, I looked at how regular expression patterns provide options to change > the meaning of its special characters using (?iLmsux). > > I. Simplest version -- Translation pairs > > format(1234, "8,.1f") --> ' 1,234.0' > format(1234, "(,_)8,.1f") --> ' 1_234.0' > format(1234, "(,_)(.,)8,.1f") --> ' 1_234,0' > > This approach is very easy to implement and it doesn't make life difficult > for the parser which can continue to look for just a comma and period > with their standardized meaning. It also fits nicely in our current > framework > and doesn't require any changes to the format() builtin. Of all the > options, > I find this one to be the easiest to read. I strongly prefer suffix to prefix modification. The format gives the overall structure of the output, the rest are details, which a reader may not care so much about. > Also, this version makes it easy to employ a couple of techniques to > factor-out These techniques apply to any "augment the basic format with an affix" method. > formatting decisions. Here's a gettext() style approach. > > def _(s): > return '(,.)(.,)' + s > . . . > format(x, _('8.1f')) > > Here's another approach using implicit string concatenation: > > DEB = '(,_)' # style for debugging > EXT = '(, )' # style for external display > . . . 
> format(x, DEB '8.1f') > format(y, EXT '8d') > > There are probably many ways to factor-out the decision. We don't need to > decide which is best, we just need to make it possible. > > One other thought, this approach makes it possible to customize all of the > characters that are currently hardwired (including zero and space padding > characters and the 'E' or 'e' exponent symbols). Any "augment the format with affixes" method should do the same. I prefer at most a separator (;) between affixes rather than fences around them. I also prefer mnemonic key letters to mark the start of each affix, such as in Guido's quick suggestion: Thousands, Decimal_point, Exponent, Grouping, Pad_char, Money, and so on. But I do not think '=' is needed. Since the replacement will almost always be a single non-capital letter char, I am not sure a separator is even needed, but it would make parsing much easier. G would be followed by one or more digits indicating grouping from Decimal_point leftward, with the last repeated. If grouping by 9s is not large enough, allow a-f to get grouping up to 15 ;-). Example above would be format(1234, '8.1f;T.;P,') > II. Javaesque version -- FormatSymbols object > > This is essentially the same idea as previous one but involves modifying > the format() builtin to accept a symbols object and pass it to > __format__ methods. This moves the work outside of the format string > itself: > > DEB = FormatSymbols(comma='_') > EXT = FormatSymbols(comma=' ') > . . .
It may also be harder to integrate > with existing __format__ methods which are currently very string oriented. I suggested in the thread on exposing the format parse result that the resulting structure (dict or named tuple) could become an alternative, wordy interface to the format functions. I think the mini-language itself should stay mini. Terry Jan Reedy From guido at python.org Tue Mar 17 22:50:46 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 17 Mar 2009 14:50:46 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49C018F7.8080201@canterbury.ac.nz> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> <49BFBA89.3060406@trueblade.com> <5c6f2a5d0903170812o10332a8cw9ef4caa3455e282b@mail.gmail.com> <49C018F7.8080201@canterbury.ac.nz> Message-ID: On Tue, Mar 17, 2009 at 2:41 PM, Greg Ewing wrote: > Guido van Rossum wrote: >> >> (1) To avoid font-width issues -- many >> variable-width fonts are designed so that all digits have the same >> width, but their (default) space is much narrower. > > That's a good point. > > This alone doesn't necessarily rule out (A), though. > It could be considered a case of user stupidity if they > specify a field width that results in a comma at the > beginning and don't like the result. (A) is ruled out on the basis of aesthetics alone. > It doesn't necessarily rule out (C) either, since there > will always be a space at the beginning unless the value > overflows, and then all your alignment guarantees are > blown away anyhow. > > (2) To avoid fraud >> >> when printing certain documents -- it's easier to insert a '1' in >> front of a small number than to change a '0' into something else.
> > However it's easy to add a '1' before a string of leading > zeroes if there's a sliver of space available, so it's > better still to fill with some other character such as > '*'. You need a cooperative font for that to work. What I've seen is the '$' sign immediately in front, e.g. $001,000.00. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From python at rcn.com Tue Mar 17 23:25:18 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Mar 2009 15:25:18 -0700 Subject: [Python-ideas] Customizing format() References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> Message-ID: [Terry Reedy] > I strongly prefer suffix to prefix modification. Given the way that the formatting parsers are written, I think suffix would work just as well as prefix. Also, your idea may help with the mental parsing as well (because the rest of the format string uses the untranslated symbols so that translation pairs should be at the end). >> Also, this version makes it easy to employ a couple of techniques to >> factor-out > > These techniques apply to any "augment the basic format with an > affix" method. Right. > I also prefer mnemonic key letters to mark the start of each affix, ... > format(1234, '8.1f;T.;P,') I think it's better to be explicit that periods are translated to commas and commas to periods. Introducing a new letter just adds more to the memory load and makes the notation more verbose. In the previous newsgroup discussions, people reacted badly to letter mnemonics finding them to be so ugly that they would refuse to use them (remember the early proposal of format(x, "8T,.f")). Also, the translation pairs approach lets you swap other hardwired characters like the E or a 0 pad.
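The translation-pairs idea is cheap to prototype on top of today's format() by peeling '(xy)' pairs off the front of the spec and translating the result string (a sketch only; `format_with_pairs` is a made-up name, and a real version would live inside the spec parser rather than post-process the output):

```python
def format_with_pairs(value, spec):
    # Peel leading '(xy)' translation pairs off the format spec.
    pairs = []
    while spec.startswith('(') and len(spec) >= 4 and spec[3] == ')':
        pairs.append((spec[1], spec[2]))
        spec = spec[4:]
    # Format with the remaining standard spec, then translate the output.
    # Building one dict applies all pairs simultaneously, so a swap like
    # '(,.)(.,)' works without the two pairs clobbering each other.
    table = dict(pairs)
    return ''.join(table.get(ch, ch) for ch in format(value, spec))

print(format_with_pairs(1234, '8,.1f'))          # ' 1,234.0'
print(format_with_pairs(1234, '(,_)8,.1f'))      # ' 1_234.0'
print(format_with_pairs(1234, '(,_)(.,)8,.1f'))  # ' 1_234,0'
```

(The ',' option in the spec needs Python 2.7/3.1 or later, where PEP 378 is implemented.)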
Raymond From steve at pearwood.info Wed Mar 18 00:29:59 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 18 Mar 2009 10:29:59 +1100 Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) In-Reply-To: <339525.47450.qm@web25805.mail.ukl.yahoo.com> References: <339525.47450.qm@web25805.mail.ukl.yahoo.com> Message-ID: <200903181029.59485.steve@pearwood.info> On Tue, 17 Mar 2009 11:23:23 pm hwpuschm at yahoo.de wrote: > What I would like is to extend the augmented assignment > and make it easy to understand for naive readers. [...] > The following examples would be extensions: > "lst = [5,6] + same" synonymous with > "lst.reverse(); lst.extend([6,5]); lst.reverse()" > "inmutable = same*(same+1)" synonymous with > "unused=inmutable+1; inmutable*=unused; del unused" > > There seems to be no really simple expression for the above > extensions Instead of the proposed "lst = [5,6] + same" or the obfuscated "lst.reverse(); lst.extend([6,5]); lst.reverse()", what about this simple assignment? lst = [5, 6] + lst Instead of the proposed "inmutable = same*(same+1)" or the obfuscated "unused=inmutable+1; inmutable*=unused; del unused", what about the simple: inmutable = inmutable*(inmutable+1) Since your claimed intention is to make it easy for naive users, why replace the standard idiom: xx += 5 with an assignment containing a mysterious "same"? Many of those naive users will surely assume "same" is a variable name, not a magic keyword, and spend much time looking for where it is assigned. I predict that if your idea goes ahead, we'll get dozens of questions "I can't find where the variable same gets its value from", and we'll have to explain that it is a magic variable that gets its value from the left hand side of the assignment. One last question -- what should happen here?
x, y, z = (same, same+1, same+2) -- Steven D'Aprano From steve at pearwood.info Wed Mar 18 00:30:09 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 18 Mar 2009 10:30:09 +1100 Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) In-Reply-To: <49BFA3EA.70108@improva.dk> References: <339525.47450.qm@web25805.mail.ukl.yahoo.com> <7528bcdd0903170541g67d52668kae447602a91660aa@mail.gmail.com> <49BFA3EA.70108@improva.dk> Message-ID: <200903181030.09394.steve@pearwood.info> On Wed, 18 Mar 2009 12:21:46 am Jacob Holm wrote: > I believe that as soon as the left-hand side stops being a simple > variable and it is used in non-trivial expressions on the right-hand > side, using the keyword would help clarify the intent. What I mean > is that the examples you should be looking at are more like: > > A[n+1] = same*same + 1 > B[2*j].foo = frobnicate(same, same+1) > ... > > If you try expanding these into current python with minimal change in > semantics you will end up with something like > > _1 = n+1 > _2 = A[_1] > A[_1] = _2*_2 + 1 > del _1 > del _2 > > _1 = B[2*j] > _2 = _1.foo > _1.foo = frobnicate(_2, _2+1) > del _1 > del _2 > > which is much less readable. Of course it is, because it's obfuscated. What's with the leading underscore names? Inside a function, they're not accessible to outside callers, so the notion of "private" and "public" doesn't apply, and in module-level code you delete them at the end, so they won't be imported because they no longer exist. (BTW, there's no need to delete the names one at a time. "del _1, _2" does what you want.) What's wrong with the clear, simple and obvious? A[n+1] = A[n+1]**2 + 1 If you really care about calculating n+1 twice then just use a meaningful name instead of an obfuscated name. 
This clarifies the intent of the code, instead of hiding it: index = n+1 # or even just i A[index] = A[index]**2 + 1 Likewise: tmp = B[2*j].foo B[2*j].foo = frobnicate(tmp, tmp+1) Or any combination of standard idioms. If you really insist, you can even delete the temporary names afterwards, but why would you bother inside a function? -- Steven D'Aprano From jh at improva.dk Wed Mar 18 01:16:21 2009 From: jh at improva.dk (Jacob Holm) Date: Wed, 18 Mar 2009 01:16:21 +0100 Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) In-Reply-To: <200903181030.09394.steve@pearwood.info> References: <339525.47450.qm@web25805.mail.ukl.yahoo.com> <7528bcdd0903170541g67d52668kae447602a91660aa@mail.gmail.com> <49BFA3EA.70108@improva.dk> <200903181030.09394.steve@pearwood.info> Message-ID: <49C03D55.7050604@improva.dk> Steven D'Aprano wrote: > On Wed, 18 Mar 2009 12:21:46 am Jacob Holm wrote: > > >> I believe that as soon as the left-hand side stops being a simple >> variable and it is used in non-trivial expressions on the right-hand >> side, using the keyword would help clarify the intent. What I mean >> is that the examples you should be looking at are more like: >> >> A[n+1] = same*same + 1 >> B[2*j].foo = frobnicate(same, same+1) >> ... >> >> If you try expanding these into current python with minimal change in >> semantics you will end up with something like >> >> _1 = n+1 >> _2 = A[_1] >> A[_1] = _2*_2 + 1 >> del _1 >> del _2 >> >> _1 = B[2*j] >> _2 = _1.foo >> _1.foo = frobnicate(_2, _2+1) >> del _1 >> del _2 >> >> which is much less readable. >> > > Of course it is, because it's obfuscated. What's with the leading > underscore names? Inside a function, they're not accessible to outside > callers, so the notion of "private" and "public" doesn't apply, and in > module-level code you delete them at the end, so they won't be imported > because they no longer exist. (BTW, there's no need to delete the names > one at a time. 
"del _1, _2" does what you want.) > > What's wrong with the clear, simple and obvious? > > A[n+1] = A[n+1]**2 + 1 > > What is wrong is that it computes (n+1) twice, and it uses a different operator to avoid doing the __getitem__ twice. The whole point of the exercise was to get as close as possible to what I think the expression using "same" should mean. I tried to follow the common style for that kind of expansion as seen elsewhere on this list to make that clear. Obviously I failed. Jacob From python at rcn.com Wed Mar 18 01:25:42 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Mar 2009 17:25:42 -0700 Subject: [Python-ideas] Customizing format() References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> Message-ID: <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> Mark Dickinson's test code suggested a good, extensible approach to the problem. Here's the idea in a nutshell: format(value, format_spec='', conventions=None) 'calls value.__format__(format_spec, conventions)' Where conventions is an optional dictionary with formatting control values. Any value object can accept custom controls, but the names for standard ones would be taken from the standards provided by localeconv(): { 'decimal_point': '.', 'grouping': [3, 0], 'negative_sign': '-', 'positive_sign': '', 'thousands_sep': ','} That would let you store several locales using localeconv() and use them at will, thus solving the global variable and threading problems with locale: import locale loc = locale.getlocale() # get current locale locale.setlocale(locale.LC_ALL, 'de_DE') DE = locale.localeconv() locale.setlocale(locale.LC_ALL, 'en_US') US = locale.localeconv() locale.setlocale(locale.LC_ALL, loc) # restore saved locale . . . format(x, '8,.f', DE) format(y, '8,d', US) It also lets you write your own conventions on the fly: DEB = dict(thousands_sep='_') # style for debugging EXT = dict(thousands_sep=',') # style for external display . . .
format(x, '8.1f', DEB) format(y, '8d', EXT) Raymond From python at rcn.com Wed Mar 18 01:34:16 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Mar 2009 17:34:16 -0700 Subject: [Python-ideas] Customizing format() References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> Message-ID: > Where conventions is an optional dictionary with formatting control values. Any value object can accept custom controls, but the > names for standard ones would be taken from the standards provided by localeconv(): Forgot to mention that this approach makes life easier on people writing __format__ methods because it lets them re-use the work they've already done to implement the "n" type specifier. Also, this approach is very similar to the one taken in Java with its DecimalFormatSymbols object. The main differences are that they use a custom class instead of a dictionary, that we would use standard names that work well with localeconv(), and that our approach is extensible for use with custom formatters (i.e. the datetime module could have its own set of key/value pairs for formatting controls). Raymond From jh at improva.dk Wed Mar 18 01:43:52 2009 From: jh at improva.dk (Jacob Holm) Date: Wed, 18 Mar 2009 01:43:52 +0100 Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) In-Reply-To: <200903181029.59485.steve@pearwood.info> References: <339525.47450.qm@web25805.mail.ukl.yahoo.com> <200903181029.59485.steve@pearwood.info> Message-ID: <49C043C8.90708@improva.dk> Steven D'Aprano wrote: > One last question -- what should happen here? > > x, y, z = (same, same+1, same+2) > > > Obviously a TypeError, as you cannot add one or two to a tuple... :) But the same question for the statement x, y, z = (same, foo(same), bar(same)) has a simple obvious answer (at least to my eyes).
It should be equivalent to: tmp = x, y, z x, y, z = (tmp, foo(tmp), bar(tmp)) However, on rereading the proposal(s), I can see that all the operations using same are supposedly defined in terms of augmented assignment operators which really doesn't make any sense to me. (Augmented assignment is one of the very few things in python I find really unclean. I know practicality beats purity and all that, but it just doesn't sit right with me). It is quite possible that I have been interpreting this whole idea differently than the original author intended. If so, I apologize for the confusion. In any case I am at best -0.5 on it, because the benefit does not outweigh the cost of adding a new keyword. Jacob From tjreedy at udel.edu Wed Mar 18 02:50:14 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 17 Mar 2009 21:50:14 -0400 Subject: [Python-ideas] Customizing format() In-Reply-To: References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> Message-ID: Raymond Hettinger wrote: > > [Terry Reedy] >> I strongly prefer suffix to prefix modification. > > Given the way that the formatting parsers are written, > I think suffix would work just as well as prefix. Also, > your idea may help with the mental parsing as well > (because the rest of the format string uses the untranslated symbols so > that translation pairs should > be at the end). > > >>> Also, this version makes it easy to employ a couple of techniques to >>> factor-out >> >> These techniques apply to any "augment the basic format with an >> affix" method. > > Right. > > >> I also prefer mnemonic key letters to mark the start of each affix, > ... >> format(1234, '8.1f;T.;P,') This should have been format(1234, '8.1f;T.;D,') > > > I think it's better to be explicit that periods are translated to commas > and commas to periods. Introducing a new letter just adds more to > the memory load and makes the notation more verbose.
In the > previous newsgroup discussions, people reacted badly to letter > mnemonics finding them to be so ugly that they would refuse to use them > (remember the early proposal of format(x, "8T,.f")). > > Also, the translation pairs approach lets you swap other hardwired > characters like the E or a 0 pad. So does the key letter approach. The pairs approach does not allow easy alteration of the grouping spec, because there is no hard-wired char to swap, unless you would allow something cryptic like (3,(4,2,3)) (for India, I believe). Even with the translation pair, one could use a separator rather than fences. format(1234, '8.1f;T.;D,') # could be format(1234, '8,.1f;,.;.,') The two approaches could even be mixed by using a char only when clearer, such as 'G' for grouping instead of '3' for the existing grouping value. I think whatever scheme is adopted should be complete. Terry Jan Reedy From lie.1296 at gmail.com Wed Mar 18 03:27:26 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Wed, 18 Mar 2009 13:27:26 +1100 Subject: [Python-ideas] Keyword same in right hand side of assignments (rev) In-Reply-To: <49C03D55.7050604@improva.dk> References: <339525.47450.qm@web25805.mail.ukl.yahoo.com> <7528bcdd0903170541g67d52668kae447602a91660aa@mail.gmail.com> <49BFA3EA.70108@improva.dk> <200903181030.09394.steve@pearwood.info> <49C03D55.7050604@improva.dk> Message-ID: Jacob Holm wrote: > What is wrong is that it computes (n+1) twice, and it uses a different > operator to avoid doing the __getitem__ twice. The whole point of the > exercise was to get as close as possible to what I think the expression > using "same" should mean. I tried to follow the common style for that > kind of expansion as seen elsewhere on this list to make that clear. > Obviously I failed. > Unless in a very tight loop, I see no reason why computing n+1 and __getitem__ twice is a problem. And using a temporary variable is sufficiently clear unless your temporary variable's name starts with _.
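Spelled out, the temporary-variable version of the thread's running example "A[n+1] = same*same + 1" computes the subscript and the __getitem__ once each (a sketch with made-up data):

```python
A = [0, 0, 5, 0]
n = 1

index = n + 1        # the subscript expression is computed once
old = A[index]       # one __getitem__
A[index] = old * old + 1

print(A)  # [0, 0, 26, 0]
```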
From lie.1296 at gmail.com Wed Mar 18 03:42:27 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Wed, 18 Mar 2009 13:42:27 +1100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> <49BFBA89.3060406@trueblade.com> <5c6f2a5d0903170812o10332a8cw9ef4caa3455e282b@mail.gmail.com> <49C018F7.8080201@canterbury.ac.nz> Message-ID: Guido van Rossum wrote: >> >> (2) To avoid fraud >>> when printing certain documents -- it's easier to insert a '1' in >>> front of a small number than to change a '0' into something else. >> However it's easy to add a '1' before a string of leading >> zeroes if there's a sliver of space available, so it's >> better still to fill with some other character such as >> '*'. You need a cooperative font for that to work. > > What I've seen is the '$' sign immediately in front, e.g. $001,000.00. > I think I'd rather see something like: $==1,000.00== I wouldn't use zeroes, if I were the bank. It is bad on the aesthetics, and too easy to fraud. 
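The cheque-style fill being discussed is expressible with str.format's fill and align options (together with the ',' grouping option this very thread produced, PEP 378, available from Python 3.1):

```python
# '*' as the fill character with '>' alignment pads the unused width,
# leaving no room to sneak an extra leading digit in front:
print("${:*>12,.2f}".format(1000))   # $****1,000.00

# Compare zero fill, where the padding reads as part of the number:
print("${:012,.2f}".format(1000))    # $0,001,000.00
```

Note that zero fill with ',' groups the padding zeros too, which is exactly the "0,000,000.89" behaviour debated elsewhere in the thread.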
From lie.1296 at gmail.com Wed Mar 18 03:47:22 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Wed, 18 Mar 2009 13:47:22 +1100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49C00A9B.4080509@canterbury.ac.nz> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <385A4935485649C38210DC189A08C9BC@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49C00A9B.4080509@canterbury.ac.nz> Message-ID: Greg Ewing wrote: > Mark Dickinson wrote: > >>>>> format('%014f', 123.456, grouping=1) >> >> '0,000,123.456000' >> >> That also means that the relationship between the field width (14 >> in this case) and the string length (16) is somewhat obscured. > > I'd consider that part a bug that we shouldn't imitate. > The field width should always be what you say it is, > unless the value is too big to fit. > Should there be an option for using hard-width? If the hard-width flag is on, then if the value is too big to fit, the number will get trimmed instead of changing the width (and perhaps there would be a prepend character). So: width: 4, number: 123456, prepend char '<' gives "<456". So as not to break table alignment... From python at rcn.com Wed Mar 18 05:56:12 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 17 Mar 2009 21:56:12 -0700 Subject: [Python-ideas] Customizing format() References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> Message-ID: Am curious whether you guys like this proposal? Raymond ----- Original Message ----- [Raymond Hettinger] > Mark Dickinson's test code suggested a good, extensible approach to the problem.
Here's the idea in a nutshell: > > format(value, format_spec='', conventions=None) > 'calls value.__format__(format_spec, conventions)' > > Where conventions is an optional dictionary with formatting control values. Any value object can accept custom controls, but the > names for standard ones would be taken from the standards provided by localeconv(): > > { > 'decimal_point': '.', > 'grouping': [3, 0], > 'negative_sign': '-', > 'positive_sign': '', > 'thousands_sep': ','} > > That would let you store several locales using localeconv() and use them at will, thus solving the global variable and threading > problems with locale: > > import locale > loc = locale.getlocale() # get current locale > locale.setlocale(locale.LC_ALL, 'de_DE') > DE = locale.localeconv() > locale.setlocale(locale.LC_ALL, 'en_US') > US = locale.localeconv() > locale.setlocale(locale.LC_ALL, loc) # restore saved locale > > . . . > > format(x, '8,.f', DE) > format(y, '8,d', US) > > > It also lets you write your own conventions on the fly: > > DEB = dict(thousands_sep='_') # style for debugging > EXT = dict(thousands_sep=',') # style for external display > . . . > format(x, '8.1f', DEB) > format(y, '8d', EXT) > > > Raymond > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From mishok13 at gmail.com Wed Mar 18 08:52:35 2009 From: mishok13 at gmail.com (Andrii V.
Mishkovskyi) Date: Wed, 18 Mar 2009 09:52:35 +0200 Subject: [Python-ideas] dict '+' operator and slicing support for pop In-Reply-To: References: <192840a00903170800u342e1435v3b2adbeec1661b6@mail.gmail.com> <91ad5bf80903170903y5f239189x249b72950803cff6@mail.gmail.com> Message-ID: <192840a00903180052y568c9108v2bfed293a4ac2d5f@mail.gmail.com> On Tue, Mar 17, 2009 at 8:17 PM, Raymond Hettinger wrote: > >>> a.update(b) >>> return a > > Why take two short, simple lines with unequivocal meaning > and then abbreviate them with something mysterious (or > at least something with multiple possible interpretations)? > > Mappings exist in many languages now. Can you point > to another language that has found it worthwhile to have > both an update() method and an addition operator? > > Also, consider that dicts are one of our most basic APIs > and many other objects model that API. It behooves us > to keep that API as simple and thin as possible. > > IMO, this change would be gratuitous. None of the code > presented so far is significantly improved. Essentially, we're > looking at a trivial abbreviation, not an actual offering of > new capabilities. Reasonable enough. > > -1 all the way around. Does that also mean -1 on the list.pop() accepting slices proposal? > > > Raymond > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Wbr, Andrii V. Mishkovskyi. He's got a heart of a little child, and he keeps it in a jar on his desk.
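When an expression form of Raymond's two short lines really is wanted, a three-line helper provides it without any language change (the name merged is hypothetical):

```python
def merged(a, b):
    """The two-line idiom from the thread, wrapped up: a new dict
    with a's entries, updated by b's (b wins on duplicate keys)."""
    result = dict(a)
    result.update(b)
    return result

print(merged({'x': 1, 'y': 2}, {'y': 20, 'z': 3}))  # {'x': 1, 'y': 20, 'z': 3}
```

This is the usual counter-argument to a dict '+' operator: the composition is already trivially expressible, so the operator would only save a function definition.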
From denis.spir at free.fr Wed Mar 18 09:05:25 2009 From: denis.spir at free.fr (spir) Date: Wed, 18 Mar 2009 09:05:25 +0100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49C00D84.6020109@canterbury.ac.nz> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <5890DED167E8455F90A3B96F3F779B78@RaymondLaptop1> <5c6f2a5d0903170201m2599c28alc3915a4d1da1dd92@mail.gmail.com> <49BF8641.5080609@trueblade.com> <5c6f2a5d0903170557w4d85be6du9be9529cb00aae52@mail.gmail.com> <49BFA474.9040308@trueblade.com> <49C00D84.6020109@canterbury.ac.nz> Message-ID: <20090318090525.708d29ba@o> On Wed, 18 Mar 2009 08:52:20 +1200, Greg Ewing wrote: > Guido van Rossum wrote: > > I agree that the given width should include the > > commas, but I strongly feel that leading zeros should be comma-fied > > just like everything else. > > I think we need some use cases before a proper > decision can be made about this. If you were using > comma-separated zero-filled numbers, what would > your objective be, and what choice would best > fulfill it? > I think the point is just this: 0,000,000.89 1,234,567.89 looks right. 0000000.89 1,234,567.89 looks wrong. 000000000.89 1,234,567.89 looks wrong. ------ la vita e estrany From solipsis at pitrou.net Wed Mar 18 11:42:47 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 18 Mar 2009 10:42:47 +0000 (UTC) Subject: [Python-ideas] Customizing format() References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> Message-ID: Raymond Hettinger writes: > > Am curious whether you guys like this proposal? I find it good for the builtin format() function, but how does it work for str.format()?
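One way Antoine's question could be answered today is a string.Formatter subclass that carries the conventions dict and post-processes each numeric field. This is a hedged sketch, not part of the actual proposal; the class name and the ','/'.' translation step are mine, and it ignores localeconv()'s 'grouping' list:

```python
from string import Formatter

class ConvFormatter(Formatter):
    """Sketch: format numeric fields with the standard ',' and '.'
    characters, then swap in the separators from a conventions dict."""

    def __init__(self, conventions=None):
        super().__init__()
        self.conventions = conventions or {}

    def format_field(self, value, format_spec):
        text = super().format_field(value, format_spec)
        if self.conventions and isinstance(value, (int, float)):
            table = {ord(','): self.conventions.get('thousands_sep', ','),
                     ord('.'): self.conventions.get('decimal_point', '.')}
            text = text.translate(table)
        return text

DE = {'thousands_sep': '.', 'decimal_point': ','}
print(ConvFormatter(DE).format("value: {0:,.2f}", 1234567.89))  # value: 1.234.567,89
```

The conventions travel with the Formatter instance rather than with each format string, which sidesteps the global-state problem of locale without touching the parser.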
From steve at pearwood.info Wed Mar 18 12:39:40 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 18 Mar 2009 22:39:40 +1100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> Message-ID: <200903182239.40862.steve@pearwood.info> On Wed, 18 Mar 2009 01:42:27 pm Lie Ryan wrote: > Guido van Rossum wrote: > >> (2) To avoid fraud > >> > >>> when printing certain documents -- it's easier to insert a '1' in > >>> front of a small number than to change a '0' into something else. > >> > >> However it's easy to add a '1' before a string of leading > >> zeroes if there's a sliver of space available, so it's > >> better still to fill with some other character such as > >> '*'. You need a cooperative font for that to work. > > > > What I've seen is the '$' sign immediately in front, e.g. > > $001,000.00. > > I think I'd rather see something like: $==1,000.00== > > I wouldn't use zeroes, if I were the bank. It is bad on the > aesthetics, and too easy to fraud. What I've generally seen on cheques is $****1,000.00 -- Steven D'Aprano From steve at pearwood.info Wed Mar 18 12:45:11 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 18 Mar 2009 22:45:11 +1100 Subject: [Python-ideas] Customizing format() In-Reply-To: <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> Message-ID: <200903182245.11720.steve@pearwood.info> On Wed, 18 Mar 2009 11:25:42 am Raymond Hettinger wrote: > Mark Dickinson's test code suggested a good, extensible approach to > the problem. Here's the idea in a nutshell: > > format(value, format_spec='', conventions=None) > 'calls value.__format__(format_spec, conventions)' For what was supposed to be a nice, simple way of formatting numbers, it sure became confusing. 
So thank you for the nutshell. I like this idea, especially if it means we can simplify the format_spec. Can we have the format_spec in a nutshell too? > Where conventions is an optional dictionary with formatting control > values. Any value object can accept custom controls, but the names > for standard ones would be taken from the standards provided by > localeconv(): > > { > 'decimal_point': '.', > 'grouping': [3, 0], > 'negative_sign': '-', > 'positive_sign': '', > 'thousands_sep': ','} Presumably we value compatibility with localeconv()? If not, then perhaps a better name for 'thousands_sep' is 'group_sep', on account that if you group by something other than 3 it won't represent thousands. Would this allow you to format a float like this? 1,234,567.89012 34567 89012 (group by threes for the integer part, and by fives for the fractional part). Or is that out-of-scope for this proposal? +1 for a conventions dict. Good plan! -- Steven D'Aprano From eric at trueblade.com Wed Mar 18 12:45:55 2009 From: eric at trueblade.com (Eric Smith) Date: Wed, 18 Mar 2009 07:45:55 -0400 Subject: [Python-ideas] Customizing format() In-Reply-To: References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> Message-ID: <49C0DEF3.2070905@trueblade.com> Antoine Pitrou wrote: > Raymond Hettinger writes: >> Am curious whether you guys like this proposal? > > I find it good for the builtin format() function, but how does it work for > str.format()? I agree: I like it, but it's not enough. I use str.format() way more often than I hope to ever use builtin format(). If we make any change, I'd rather see it focused on the format mini-language. Eric. 
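Steven's threes-and-fives example is easy to produce by hand, which shows what a fully general 'grouping' convention would have to encode. A sketch with hypothetical names (it works on the already-formatted digit string, not on the format spec):

```python
def group_digits(number_text, int_group=3, frac_group=5,
                 int_sep=',', frac_sep=' '):
    """Group the integer part from the right and the fractional part
    from the left, as in Steven's example."""
    int_part, _, frac_part = number_text.partition('.')
    # group integer digits from the right
    chunks = []
    while len(int_part) > int_group:
        chunks.append(int_part[-int_group:])
        int_part = int_part[:-int_group]
    chunks.append(int_part)
    grouped_int = int_sep.join(reversed(chunks))
    # group fractional digits from the left
    grouped_frac = frac_sep.join(frac_part[i:i + frac_group]
                                 for i in range(0, len(frac_part), frac_group))
    return grouped_int + ('.' + grouped_frac if grouped_frac else '')

print(group_digits('1234567.890123456789012'))  # 1,234,567.89012 34567 89012
```

Note the two directions: integer grouping counts from the decimal point leftward, fractional grouping from the decimal point rightward, which is why a single 'grouping' list in the conventions dict would not cover this case.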
From ncoghlan at gmail.com Wed Mar 18 13:03:07 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 Mar 2009 22:03:07 +1000 Subject: [Python-ideas] Customizing format() In-Reply-To: <49C0DEF3.2070905@trueblade.com> References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com> Message-ID: <49C0E2FB.6010301@gmail.com> Eric Smith wrote: > Antoine Pitrou wrote: >> Raymond Hettinger writes: >>> Am curious whether you guys like this proposal? >> >> I find it good for the builtin format() function, but how does it work >> for >> str.format()? > > I agree: I like it, but it's not enough. I use str.format() way more > often than I hope to ever use builtin format(). If we make any change, > I'd rather see it focused on the format mini-language. Perhaps we could add a new ! type to the formatting language that allows the developer to mark a particular argument as the conventions dictionary? Then you could do something like: # DE and US dicts as per Raymond's format() example fmt = "The value is {:,.5f}{!conv}" fmt.format(num, DE) fmt.format(num, US) fmt.format(num, dict(thousands_sep="'")) As with !a and !s, you could use any normal field specifier to select the conventions dictionary. Obviously, the formatting arguments would be ignored for that particular field. Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Wed Mar 18 13:09:13 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 18 Mar 2009 22:09:13 +1000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <200903182239.40862.steve@pearwood.info> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <200903182239.40862.steve@pearwood.info> Message-ID: <49C0E469.2080207@gmail.com> Steven D'Aprano wrote: > What I've generally seen on cheques is $****1,000.00 Interestingly, str.format will actually be able to produce directly in 3.1: "${:*>,.2f}".format(value) ...although that makes seq[::-1] look positively coherent :) Wondering-who-will-ask-for-a-{!verbose}-string-formatting-flag'ly, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From python at rcn.com Wed Mar 18 13:44:53 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 18 Mar 2009 05:44:53 -0700 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <200903182239.40862.steve@pearwood.info> <49C0E469.2080207@gmail.com> Message-ID: <8FFC56AA656847D6B599C3544F6560EB@RaymondLaptop1> >> What I've generally seen on cheques is $****1,000.00 > > Interestingly, str.format will actually be able to produce directly in 3.1: > > "${:*>,.2f}".format(value) What we have already in SVN courtesy of Mark Dickinson: >>> from decimal import Decimal >>> value = Decimal(1000) >>> "${:*>12,.2f}".format(value) '$****1,000.00' Raymond From lie.1296 at gmail.com Wed Mar 18 14:41:56 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Thu, 19 Mar 2009 00:41:56 +1100 Subject: [Python-ideas] Rough draft: Proposed format specifier 
for a thousands separator (discussion moved from python-dev) In-Reply-To: <8FFC56AA656847D6B599C3544F6560EB@RaymondLaptop1> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <200903182239.40862.steve@pearwood.info> <49C0E469.2080207@gmail.com> <8FFC56AA656847D6B599C3544F6560EB@RaymondLaptop1> Message-ID: Raymond Hettinger wrote: > >>> What I've generally seen on cheques is $****1,000.00 >> >> Interestingly, str.format will actually be able to produce directly in >> 3.1: >> >> "${:*>,.2f}".format(value) > > What we have already in SVN courtesy of Mark Dickinson: > >>>> from decimal import Decimal >>>> value = Decimal(1000) >>>> "${:*>12,.2f}".format(value) > '$****1,000.00' > Anything but zeroes that isn't too similar to numeric character should be fine for "finance-related number". PS: On this side of the world, the commas and the dots are reversed so I would not dream any solution that doesn't encompass at least that (which doesn't require additional function wrapping). I'd personally prefer fully customizable separator, as my personal preference is using space and decimal commas PPS: I HAVE A HISTORY OF BEING ADMITTED TO A MENTAL INSTITUTION AFTER SEEING NUMBERS WITH COMMAS USED AS THOUSAND SEPARATOR. PPPS: The next statement is a lie. PPPPS: The mental institution thing is true. From lie.1296 at gmail.com Wed Mar 18 14:47:25 2009 From: lie.1296 at gmail.com (Lie Ryan) Date: Thu, 19 Mar 2009 00:47:25 +1100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <200903182239.40862.steve@pearwood.info> <49C0E469.2080207@gmail.com> <8FFC56AA656847D6B599C3544F6560EB@RaymondLaptop1> Message-ID: Lie Ryan wrote: > Anything but zeroes that isn't too similar to numeric character should > be fine for "finance-related number". 
> > PS: On this side of the world, the commas and the dots are reversed so I > would not dream any solution that doesn't encompass at least that (which > doesn't require additional function wrapping). I'd personally prefer > fully customizable separator, as my personal preference is using space > and decimal commas > PPS: I HAVE A HISTORY OF BEING ADMITTED TO A MENTAL INSTITUTION AFTER > SEEING NUMBERS WITH COMMAS USED AS THOUSAND SEPARATOR. > PPPS: The next statement is a lie. > PPPPS: The mental institution thing is true. PPPPPS: The first postscript includes financial institution PPPPPPS: The fact that you can wrap the formatting in function call is not an excuse for not providing fully customizable separators. PPPPPPPS: The financial world != American financial institutions From solipsis at pitrou.net Wed Mar 18 14:57:50 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 18 Mar 2009 13:57:50 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <200903182239.40862.steve@pearwood.info> <49C0E469.2080207@gmail.com> <8FFC56AA656847D6B599C3544F6560EB@RaymondLaptop1> Message-ID: Lie Ryan writes: > > > PPS: I HAVE A HISTORY OF BEING ADMITTED TO A MENTAL INSTITUTION AFTER > > SEEING NUMBERS WITH COMMAS USED AS THOUSAND SEPARATOR. > > PPPS: The next statement is a lie. > > PPPPS: The mental institution thing is true. I am fully sympathetic. > PPPPPPPS: The financial world != American financial institutions Agreed, but they have the largest debts. Therefore, real-life examples of commas used as thousands separators should include a negative sign. 
From gerald.britton at gmail.com Wed Mar 18 18:10:52 2009 From: gerald.britton at gmail.com (Gerald Britton) Date: Wed, 18 Mar 2009 13:10:52 -0400 Subject: [Python-ideas] thoughts on generator.throw() Message-ID: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> Today I was reviewing changes in Python 2.5 and I noticed the generator throw() method for the first time. While thinking about what it does and why, a question arose in my mind: Why is it called "throw"? (Yes, I know that Java and possibly other languages use this keyword!) In Python, we have long had a "raise" statement to raise exceptions. I would have thought that the generator method would have been called "raise" as well. But then I saw that it would have been impossible to implement since "raise" is a Python keyword. *Then* I wondered why "raise" is a keyword and not a function. If it were a function you could use it easily in places where today you cannot: if 'foo' == 'bar' or raise(FooBar): # only proceed if 'foo' equals 'bar' otherwise raise FooBar exception is invalid syntax because raise is not a function. Now, I can get around it: def raise_(exception): raise exception ... if 'foo' == 'bar' or raise_(FooBar): ... I have a similar question about the "assert" statement. It could possibly benefit from being a function instead. Of course, changing this would break lots of code, but maybe not any more than making print a function as in 3.0. Thoughts? -- Gerald Britton From solipsis at pitrou.net Wed Mar 18 18:20:11 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 18 Mar 2009 17:20:11 +0000 (UTC) Subject: [Python-ideas] thoughts on generator.throw() References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> Message-ID: Gerald Britton writes: > > But then I > saw that it would have been impossible to implement since "raise" is a > Python keyword. *Then* I wondered why "raise" is a keyword and not a > function. 
If it were a function you could use it easily in places > where today you cannot: > > if 'foo' == 'bar' or raise(FooBar): # only proceed if 'foo' > equals 'bar' otherwise raise FooBar exception I find this horrible, awfully Perlish. Non-local control transfers should stick out clearly when reading source code, not be hidden at the end of a conditional. As for why raise is a keyword, I think there are several explanations: - raise is a control flow operation, as are "return", "continue", "break" and others. - raise has to create a traceback capturing the current frame stack, which is easier with a dedicated bytecode. - raise should be decently fast, which is easier with a dedicated bytecode. > I have a similar question about the "assert" statement. It could > possibly benefit from being a function instead. I think the point is that assert is entirely a no-op when the interpreter is run with "-O", while there would be a significant overhead if it was a regular function call. But I agree that the situation is less clear-cut than with the raise statement. Regards Antoine. From leif.walsh at gmail.com Wed Mar 18 18:42:42 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Wed, 18 Mar 2009 13:42:42 -0400 Subject: [Python-ideas] thoughts on generator.throw() In-Reply-To: References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> Message-ID: On Wed, Mar 18, 2009 at 1:20 PM, Antoine Pitrou wrote: > - raise should be decently fast, which is easier with a dedicated bytecode. Why does raise have to be decently fast? In my average case, at least, it's encountered at most once per program execution. Even if I was good about catching exceptions, the point is that they're _exceptional_ cases, so they shouldn't be happening very often. I'm not about to say raise should be a function, but I don't think it's got a huge speed requirement. 
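For reference, Gerald's raise_() workaround from the start of the thread runs as advertised (FooBar and check are hypothetical names; Antoine's readability objection still applies):

```python
class FooBar(Exception):
    pass

def raise_(exception):
    """Gerald's workaround: a plain function wrapper so that raising
    can appear in expression position."""
    raise exception

def check(a, b):
    # proceed only if a == b, otherwise raise FooBar
    return (a == b) or raise_(FooBar('%r != %r' % (a, b)))

print(check('foo', 'foo'))        # True
try:
    check('foo', 'bar')
except FooBar as exc:
    print('caught:', exc)         # caught: 'foo' != 'bar'
```

Because `or` short-circuits, raise_() is only evaluated when the condition is false, which is what makes the idiom work and also what makes the control flow easy to miss when reading.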
-- Cheers, Leif From george.sakkis at gmail.com Wed Mar 18 19:07:47 2009 From: george.sakkis at gmail.com (George Sakkis) Date: Wed, 18 Mar 2009 14:07:47 -0400 Subject: [Python-ideas] thoughts on generator.throw() In-Reply-To: References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> Message-ID: <91ad5bf80903181107g23790346vaccf82933a80a570@mail.gmail.com> On Wed, Mar 18, 2009 at 1:42 PM, Leif Walsh wrote: > On Wed, Mar 18, 2009 at 1:20 PM, Antoine Pitrou wrote: >> - raise should be decently fast, which is easier with a dedicated bytecode. > > Why does raise have to be decently fast? > > In my average case, at least, it's encountered at most once per > program execution. Even if I was good about catching exceptions, the > point is that they're _exceptional_ cases, so they shouldn't be > happening very often. That's not always true; StopIteration comes to mind. George From denis.spir at free.fr Wed Mar 18 19:23:13 2009 From: denis.spir at free.fr (spir) Date: Wed, 18 Mar 2009 19:23:13 +0100 Subject: [Python-ideas] logics (was:thoughts on generator.throw()) In-Reply-To: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> Message-ID: <20090318192313.7bc137f5@o> On Wed, 18 Mar 2009 13:10:52 -0400, Gerald Britton wrote: > *Then* I wondered why "raise" is a keyword and not a > function. If it were a function you could use it easily in places > where today you cannot: > > if 'foo' == 'bar' or raise(FooBar): # only proceed if 'foo' > equals 'bar' otherwise raise FooBar exception > > is invalid syntax because raise is not a function. I'm very happy this is invalid syntax :-) I consider this kind of practice conceptual distortion. More precisely: an abuse of both (!) flow control and logical operator semantics. It reminds me of joyful (hum!) times with C routines written by "clever" people.
Denis PS I would go much farther than Python about logical types and operators. Lazy evaluation is ok, because the alternative is not simpler: if n != 0 and 1/n > threshold: But I'm not happy at all with the following: >>> (3==3) + 1 2 >>> 1 or True 1 I think logical operators (and or not) should accept only logical values. And logical values should not operate with numbers. ------ la vita e estrany From cmjohnson.mailinglist at gmail.com Wed Mar 18 19:33:56 2009 From: cmjohnson.mailinglist at gmail.com (Carl Johnson) Date: Wed, 18 Mar 2009 08:33:56 -1000 Subject: [Python-ideas] Customizing format() In-Reply-To: <49C0E2FB.6010301@gmail.com> References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com> <49C0E2FB.6010301@gmail.com> Message-ID: <3bdda690903181133g35ceb1fhfbe2e96577807123@mail.gmail.com> I haven't entirely been following this conversation, so I may be missing something, but what about something like: "Balance = ${balance:{minilang}}".format(balance=1.00, minilang=mini_formatter(thousands_sep=",", ...)) That way, even if the mini-language gets really confusing we'll have an easy to call function that manages it.
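The nested braces in Carl's snippet already parse today: the format spec may itself contain replacement fields, filled in from the same arguments before the spec is applied. Here minilang is just a plain spec string, not his hypothetical mini_formatter():

```python
# The spec position can hold replacement fields of its own:
print("Balance = ${balance:{minilang}}".format(balance=1234.5,
                                               minilang=',.2f'))
# Balance = $1,234.50

# The same nesting handles computed widths and precisions:
print("{:{width}.{prec}f}".format(3.14159, width=10, prec=3))  # '     3.142'
```

So a helper that merely builds a spec string from keyword arguments would slot straight into the existing machinery.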
I always thought it was weird that things co From cmjohnson.mailinglist at gmail.com Wed Mar 18 19:36:44 2009 From: cmjohnson.mailinglist at gmail.com (Carl Johnson) Date: Wed, 18 Mar 2009 08:36:44 -1000 Subject: [Python-ideas] Customizing format() In-Reply-To: <3bdda690903181133g35ceb1fhfbe2e96577807123@mail.gmail.com> References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com> <49C0E2FB.6010301@gmail.com> <3bdda690903181133g35ceb1fhfbe2e96577807123@mail.gmail.com> Message-ID: <3bdda690903181136h7dbf7095l50c85014bb29e63d@mail.gmail.com> Carl Johnson wrote: > I haven't entirely been following this conversation, so I may be > missing something, but what about something like: > > "Balance = ${balance:{minilang}}".format(balance=1.00, > minilang=mini_formatter(thousands_sep=",", ...)) > > That way, even if the mini-language gets really confusing we'll have > an easy to call function that manages it. I always thought it was > weird that things co Sorry, Google mail has been being weird lately, signing me out and suddenly sending mail, etc. ?So, I always thought it was weird that {}s could nest in the new format language, but if we have that capability, we may as well use it. -- Carl From arnodel at googlemail.com Wed Mar 18 20:55:41 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Wed, 18 Mar 2009 19:55:41 +0000 Subject: [Python-ideas] thoughts on generator.throw() In-Reply-To: <91ad5bf80903181107g23790346vaccf82933a80a570@mail.gmail.com> References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> <91ad5bf80903181107g23790346vaccf82933a80a570@mail.gmail.com> Message-ID: On 18 Mar 2009, at 18:07, George Sakkis wrote: > On Wed, Mar 18, 2009 at 1:42 PM, Leif Walsh > wrote: >> On Wed, Mar 18, 2009 at 1:20 PM, Antoine Pitrou >> wrote: >>> - raise should be decently fast, which is easier with a dedicated >>> bytecode. >> >> Why does raise have to be decently fast? 
>> >> In my average case, at least, it's encountered at most once per >> program execution. Even if I was good about catching exceptions, the >> point is that they're _exceptional_ cases, so they shouldn't be >> happening very often. > > That's not always true; StopIteration comes to mind. But StopIteration is not usually raised explicitly. -- Arnaud From leif.walsh at gmail.com Wed Mar 18 20:58:06 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Wed, 18 Mar 2009 15:58:06 -0400 (EDT) Subject: [Python-ideas] thoughts on generator.throw() In-Reply-To: Message-ID: On Wed, Mar 18, 2009 at 3:55 PM, Arnaud Delobelle wrote: > But StopIteration is not usually raised explicitly. He's got a point though, raise should be fast. -- Cheers, Leif -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 270 bytes Desc: OpenPGP digital signature URL: From python at rcn.com Wed Mar 18 21:04:12 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 18 Mar 2009 13:04:12 -0700 Subject: [Python-ideas] thoughts on generator.throw() References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com><91ad5bf80903181107g23790346vaccf82933a80a570@mail.gmail.com> Message-ID: >>>. Even if I was good about catching exceptions, the >>> point is that they're _exceptional_ cases, so they shouldn't be >>> happening very often. Not everyone programs that way. Python has long advertised exceptions for other than the exceptional. You're misapplying C++ lore to Python. 
Raymond From tjreedy at udel.edu Wed Mar 18 21:14:38 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 18 Mar 2009 16:14:38 -0400 Subject: [Python-ideas] Customizing format() In-Reply-To: <49C0DEF3.2070905@trueblade.com> References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com> Message-ID: Eric Smith wrote: > Antoine Pitrou wrote: >> Raymond Hettinger writes: >>> Am curious whether you guys like this proposal? >> >> I find it good for the builtin format() function, but how does it work >> for >> str.format()? > > I agree: I like it, but it's not enough. I use str.format() way more > often than I hope to ever use builtin format(). If we make any change, > I'd rather see it focused on the format mini-language. I agree. My impression was that format() was added mostly for consistency with the policy of having a 'public' interface to special methods, and that .__format__ was added to support str.format. Hence, any new capability of .__format__ must be accessible from format strings with replacement fields. tjr From tjreedy at udel.edu Wed Mar 18 21:33:49 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 18 Mar 2009 16:33:49 -0400 Subject: [Python-ideas] Customizing format() In-Reply-To: <49C0E2FB.6010301@gmail.com> References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com> <49C0E2FB.6010301@gmail.com> Message-ID: Nick Coghlan wrote: > Eric Smith wrote: >> I agree: I like it, but it's not enough. I use str.format() way more >> often than I hope to ever use builtin format(). If we make any change, >> I'd rather see it focused on the format mini-language. > > Perhaps we could add a new ! type to the formatting language that allows > the developer to mark a particular argument as the conventions > dictionary? 
Then you could do something like: > > # DE and US dicts as per Raymond's format() example > fmt = "The value is {:,.5f}{!conv}" A new conversion specifier should follow the current pattern and be a single letter, such as 'c' for 'custom' or 'd' for dict. If, as I would expect, str.format scans left to right and interprets and replaces each field spec as it goes, then the above would not work. So put the conversion field before the fields it applies to. This, of course, makes string formatting stateful. With a 'shift lock' field added, an 'unshift' field should also be added. This, though, has the problem that a blank 'field-name' will in 3.1 either be auto-numbered or flagged as an error (if there are other explicitly numbered fields). I am a little uneasy about 'replacement fields' that are not really replacement fields. > fmt.format(num, DE) > fmt.format(num, US) > fmt.format(num, dict(thousands_sep=''')) > > As with !a and !s, you could use any normal field specifier to select > the conventions dictionary. Obviously, the formatting arguments would be > ignored for that particular field. Terry Jan Reedy From python at rcn.com Wed Mar 18 21:37:47 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 18 Mar 2009 13:37:47 -0700 Subject: [Python-ideas] Customizing format() References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com><49C0E2FB.6010301@gmail.com> Message-ID: >> # DE and US dicts as per Raymond's format() example >> fmt = "The value is {:,.5f}{!conv}" > > A new conversion specifier should follow the current pattern and be a > single letter, such as 'c' for 'custom' or 'd' for dict. > > If, as I would expect, str.format scans left to right and interprets and > replaces each field spec as it goes, then the above would not work. So > put the conversion field before the fields it applies to. 
My interpretation is that the conv-dictionary applies to the whole string (not field-by-field) and that it can go at the end (because it doesn't affect parsing, rather it applies to the translation phase). Raymond From tjreedy at udel.edu Wed Mar 18 22:27:08 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 18 Mar 2009 17:27:08 -0400 Subject: [Python-ideas] Customizing format() In-Reply-To: References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com><49C0E2FB.6010301@gmail.com> Message-ID: Raymond Hettinger wrote: > > >>> # DE and US dicts as per Raymond's format() example >>> fmt = "The value is {:,.5f}{!conv}" >> >> A new conversion specifier should follow the current pattern and be a >> single letter, such as 'c' for 'custom' or 'd' for dict. >> >> If, as I would expect, str.format scans left to right and interprets >> and replaces each field spec as it goes, then the above would not >> work. So put the conversion field before the fields it applies to. > > My interpretation is that the conv-dictionary applies to the whole > string (not field-by-field) That was not specified. If so, then a statement like """A number such as {0:15.2f} can be formatted many ways: USA: {0:15,.2f}, EU: {0:15f}, India: {0:15f}, China {0:15f}""" would not be possible. Why not allow extra flexibility? Unless the conversion is set by setting a global variable a la locale, the c-dict will be *used* field-by-field in each call to ob.__format__(fmt, conv), so there is no reason to force each call in a particular series to use the same conversion. > and that it can go at the end (because > it doesn't affect parsing, rather it applies to the translation phase). We agree that parsing out the conversion spec must happen before the translation it affects.
If, as I supposed above (because of how I would think to write the code), parsing and translation are intermixed, then parsing the spec *after* translation will not work. Even if they are done in two batches, it would still be easy to rebind the c-dict var during the second-phase scan of the replacement fields. Terry Jan Reedy From python at rcn.com Wed Mar 18 22:44:24 2009 From: python at rcn.com (Raymond Hettinger) Date: Wed, 18 Mar 2009 14:44:24 -0700 Subject: [Python-ideas] Customizing format() References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com><49C0E2FB.6010301@gmail.com> Message-ID: <203D79CECB0B411AA2C2762C06900068@RaymondLaptop1> >> My interpretation is that the conv-dictionary applies to the whole >> string (not field-by-field) > > That was not specified. If so, then a statement like > """A number such as {0:15.2f} can be formatted many ways: > USA: {0:15,.2f), EU: {0:15f}, > India: {0:15f), China {0:15f)" > would not be possible. > > Why not allow extra flexibility? Unless the conversion is set by > setting a global variable ala locale, the c-dict will be *used* > field-by-field in each call to ob.__format__(fmt, conv), so there is no > reason to force each call in a particular series to use the same conversion. -1 Unattractive and unnecessary hyper-generalization. Raymond From aahz at pythoncraft.com Wed Mar 18 23:06:12 2009 From: aahz at pythoncraft.com (Aahz) Date: Wed, 18 Mar 2009 15:06:12 -0700 Subject: [Python-ideas] thoughts on generator.throw() In-Reply-To: References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> <91ad5bf80903181107g23790346vaccf82933a80a570@mail.gmail.com> Message-ID: <20090318220612.GA7221@panix.com> On Wed, Mar 18, 2009, Arnaud Delobelle wrote: > On 18 Mar 2009, at 18:07, George Sakkis wrote: >> On Wed, Mar 18, 2009 at 1:42 PM, Leif Walsh >> wrote: >>> >>> Why does raise have to be decently fast? 
>>> In my average case, at least, it's encountered at most once per >>> program execution. Even if I was good about catching exceptions, the >>> point is that they're _exceptional_ cases, so they shouldn't be >>> happening very often. >> >> That's not always true; StopIteration comes to mind. > > But StopIteration is not usually raised explicitly. This is a standard Python idiom: try: for field in curr_fields: for item in record[field]: item = item.lower() for filter in excludes: if match(item, filter): raise Excluded except Excluded: continue -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "Programming language design is not a rational science. Most reasoning about it is at best rationalization of gut feelings, and at worst plain wrong." --GvR, python-ideas, 2009-3-1 From ncoghlan at gmail.com Wed Mar 18 23:06:52 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Mar 2009 08:06:52 +1000 Subject: [Python-ideas] Customizing format() In-Reply-To: References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com> Message-ID: <49C1707C.7090105@gmail.com> Terry Reedy wrote: > I agree. My impression was that format() was added mostly for > consistency with the policy of having a 'public' interface to special > methods, and that .__format__ was added to support str.format. Hence, > any new capability of .__format__ must be accessible from format strings > with replacement fields. format() was also added because the PEP 3101 syntax is pretty heavyweight when it comes to formatting a single value: "%.2f" % (x) and "{0:.2f}".format(x) Being able to write format(x, ".2f") instead meant dropping 4 characters (now 3 with str.format autonumbering) over the latter option. Agreed that any solution in this area needs to help with str.format() and not just format() though. Cheers, Nick.
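The three spellings compared above can be checked side by side (a quick sketch; note that the builtin takes the value first, then the format spec):

```python
x = 1234.5678

a = "%.2f" % x              # classic %-formatting
b = "{0:.2f}".format(x)     # PEP 3101 str.format
c = format(x, ".2f")        # builtin format(value, format_spec)

assert a == b == c == "1234.57"
```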
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From steve at pearwood.info Wed Mar 18 23:19:55 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 19 Mar 2009 09:19:55 +1100 Subject: [Python-ideas] thoughts on generator.throw() In-Reply-To: References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> <91ad5bf80903181107g23790346vaccf82933a80a570@mail.gmail.com> Message-ID: <200903190919.55427.steve@pearwood.info> On Thu, 19 Mar 2009 06:55:41 am Arnaud Delobelle wrote: > On 18 Mar 2009, at 18:07, George Sakkis wrote: > > On Wed, Mar 18, 2009 at 1:42 PM, Leif Walsh > > > > wrote: > >> On Wed, Mar 18, 2009 at 1:20 PM, Antoine Pitrou > >> > >> wrote: > >>> - raise should be decently fast, which is easier with a dedicated > >>> bytecode. > >> > >> Why does raise have to be decently fast? > >> > >> In my average case, at least, it's encountered at most once per > >> program execution. Even if I was good about catching exceptions, > >> the point is that they're _exceptional_ cases, so they shouldn't > >> be happening very often. > > > > That's not always true; StopIteration comes to mind. > > But StopIteration is not usually raised explicitly. It still has to be raised. It's not just StopIteration either, the iteration protocol also catches IndexError: >>> class C(object): ... def __getitem__(self, i): ... if i < 3: return i ... else: raise IndexError ... >>> c = C() >>> for i in c: ... print i ... 
0 1 2 >>> -- Steven D'Aprano From ncoghlan at gmail.com Wed Mar 18 23:21:09 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Mar 2009 08:21:09 +1000 Subject: [Python-ideas] Customizing format() In-Reply-To: References: <3FF598B996764051A86758B98BE7B2D1@RaymondLaptop1> <567E18322D754B74B56013E8811D10AA@RaymondLaptop1> <49C0DEF3.2070905@trueblade.com> <49C0E2FB.6010301@gmail.com> Message-ID: <49C173D5.7090700@gmail.com> Terry Reedy wrote: > Nick Coghlan wrote: > A new conversion specifier should follow the current pattern and be a > single letter, such as 'c' for 'custom' or 'd' for dict. Because those characters already have other meanings in the string formatting mini-language (as do many possible single digit codes). The suggested name "!conv" was chosen based on the existing localeconv() function name. > If, as I would expect, str.format scans left to right and interprets and > replaces each field spec as it goes, then the above would not work. So > put the conversion field before the fields it applies to. I believe you're currently right - I'm not sure how hard it would be to change it to a two step process (parse the whole string first into an internal parse tree then go through and format each identified field). As for why I formatted the example the way I did: the {!conv} isn't all that interesting, since it just says "I accept a conventions dictionary". Having it at the front of the format string would give it too much prominence. > This, of course, makes string formatting stateful. With a 'shift lock' > field added, an 'unshift' field should also be added. This, though, has > the problem that a blank 'field-name' will in 3.1 either be > auto-numbered or flagged as an error (if there are other explicitly > numbered fields). Aside from not producing any output, the !conv field would still have to obey all the rules for field naming/numbering.
So if your format string used explicit numbering instead of auto-numbering then the !conv would need to be explicitly numbered as well. I agree that having "format fields which are not format fields" isn't ideal, but the alternative is likely to be something like yet-another-string-formatting-method which accepts a positional only conventions dictionary as its first argument. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Wed Mar 18 23:30:56 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Mar 2009 08:30:56 +1000 Subject: [Python-ideas] thoughts on generator.throw() In-Reply-To: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> Message-ID: <49C17620.6000204@gmail.com> Gerald Britton wrote: > Today I was reviewing changes in Python 2.5 and I noticed the > generator throw() method for the first time. While thinking about > what it does and why, a question arose in my mind: > > Why is it called "throw"? (Yes, I know that Java and possibly other > languages use this keyword!) In Python, we have long had a "raise" > statement to raise exceptions. I would have thought that the > generator method would have been called "raise" as well. But then I > saw that it would have been impossible to implement since "raise" is a > Python keyword. Actually, it was also called throw because it says "raise this exception over *there* (i.e inside the generator)". We're throwing the exception "over the fence" as it were. That was a rationalisation of a necessity (see the description in PEP 342), but still a good idea. > *Then* I wondered why "raise" is a keyword and not a > function. 
Because the compiler needs to see it and insert the appropriate commands into the bytecode to tell the interpreter to find the nearest exception handler or finally block and resume execution there. While you could probably figure out a way to do that without dedicated bytecode, I doubt it would do good things to the structure of the eval loop. > I have a similar question about the "assert" statement. It could > possibly benefit from being a function instead. Of course, changing > this would break lots of code, but maybe not any more than making > print a function as in 3.0. As others have said, so the compiler can drop it when optimisation is switched on. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Wed Mar 18 23:33:43 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Mar 2009 08:33:43 +1000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <200903182239.40862.steve@pearwood.info> <49C0E469.2080207@gmail.com> <8FFC56AA656847D6B599C3544F6560EB@RaymondLaptop1> Message-ID: <49C176C7.7090309@gmail.com> Antoine Pitrou wrote: > Agreed, but they have the largest debts. > Therefore, real-life examples of commas used as thousands separators should > include a negative sign. A. :) B. All I can suggest is to try to think of the "commas as separators in format()" situation as being in the same vein as that whole "let use English keywords where possible" idea :) Hopefully a way will be found to provide a less English-centric but still easy to use formatting system eventually, but in the meantime Python *is* a language that looks like English pseudocode... 
Cheers, Nick -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From rdmurray at bitdance.com Thu Mar 19 01:37:55 2009 From: rdmurray at bitdance.com (R. David Murray) Date: Thu, 19 Mar 2009 00:37:55 +0000 (UTC) Subject: [Python-ideas] logics (was:thoughts on generator.throw()) References: <5d1a32000903181010i29509be6o2494c4cde125fe87@mail.gmail.com> <20090318192313.7bc137f5@o> Message-ID: spir wrote: > Le Wed, 18 Mar 2009 13:10:52 -0400, > I would go much farther than python about logical types and operators. > > Lazy evaluation is ok, because the alternative is not simpler: > if n != 0 and 1/n > threshold: > > But I'm not happy at all with the following: > >>> (3==3) + 1 > 2 > >>> 1 or True > 1 > > I think logical operators (and or not) should accept only logical values. And > logical values should not operate with numbers. I might be argued into agreeing with you about the first case, but it might be a logical consequence of the implementation of the second case. Or it might be an historical accident, since True used to be 1. (But the statement still gives that result in Python3, so unless it was just overlooked in the cleanup, someone must think it is a good idea.) But I would very definitely not want to give up the second example. Having the shortcut logical operators return the actual value that was last evaluated is just too darn useful :) -- R.
David Murray http://www.bitdance.com From denis.spir at free.fr Thu Mar 19 10:12:20 2009 From: denis.spir at free.fr (spir) Date: Thu, 19 Mar 2009 10:12:20 +0100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49C176C7.7090309@gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <200903182239.40862.steve@pearwood.info> <49C0E469.2080207@gmail.com> <8FFC56AA656847D6B599C3544F6560EB@RaymondLaptop1> <49C176C7.7090309@gmail.com> Message-ID: <20090319101220.41e580c7@o> Le Thu, 19 Mar 2009 08:33:43 +1000, Nick Coghlan s'exprima ainsi: > B. All I can suggest is to try to think of the "commas as separators in > format()" situation as being in the same vein as that whole "let use > English keywords where possible" idea :) This is a wrong rationale. The readers of python keywords are the community of pythonistas (*); while the readers of documents produced by apps written in python can be any kind of people. "1,234,567.89" is more or less illegible for people not used to english conventions. Specifying the separator(s) is definitely a bad idea imo. I have not understood the proposal to be intended only for debug, but for all kinds of quick and/or unpublished development. Even in the first case, having numbers output in the format your eyes are used to is a nice & worthwhile help. Imagine you -- and all programmers, and millions of users -- had to cope with numbers like "1.234.567,89" all the time only because someone decided (for any reason) that separators must be fixed, and this format is the obvious one.
Denis (*) ditto about english naming, comments, & doc inside standard library ------ la vita e estrany From steve at pearwood.info Thu Mar 19 10:27:27 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 19 Mar 2009 20:27:27 +1100 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <20090319101220.41e580c7@o> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49C176C7.7090309@gmail.com> <20090319101220.41e580c7@o> Message-ID: <200903192027.27510.steve@pearwood.info> On Thu, 19 Mar 2009 08:12:20 pm spir wrote: > Le Thu, 19 Mar 2009 08:33:43 +1000, > > Nick Coghlan s'exprima ainsi: > > B. All I can suggest is to try to think of the "commas as > > separators in format()" situation as being in the same vein as that > > whole "let use English keywords where possible" idea :) > > This is a wrong rationale. The readers of python keywords is the > community of pythonistas (*); while the readers of documents produced > by apps written in python can be any kind of people. "1,234,567.89" > is more or less illegible for people not used to english conventions. > Specifying the separator(s) is definitely a bad idea imo. I have not > understood the proposal to be intended only for debug, but for all > kinds of quick and/or unpublished developpment. Even in the first > case, having numbers output in the format your eyes are used to is a > nice & worthful help. Imagine you -- and all programmers, and > millions of users -- would have to cope with numbers like > "1.234.567,89" all the time only because someone decided (for any > reason) that separators must be fixed, and this format is the obvious > one. It would be sub-optimal but hardly "more or less illegible". 
But then I'm not American and therefore I'm already used to people misspelling colour as "color", centre as "center", and biscuit as "cookie" *wink* Nevertheless, I agree that for output, we shouldn't hard-code the decimal and thousands separator as "." and "," respectively -- although as an English-speaker, I'd be happy for those choices to be the default. But surely with Raymond and Mark's idea about passing a dict derived from locale, this is no longer an issue? Are hard-coded separators still on the table? -- Steven D'Aprano From fredrik.johansson at gmail.com Thu Mar 19 10:59:08 2009 From: fredrik.johansson at gmail.com (Fredrik Johansson) Date: Thu, 19 Mar 2009 10:59:08 +0100 Subject: [Python-ideas] Builtin test function Message-ID: <3d0cebfb0903190259jd02c918h31da14d4adb0b73b@mail.gmail.com> There's been some discussion about automatic test discovery lately. Here's a random (not in any way thought through) idea: add a builtin function test() that runs tests associated with a given function, class, module, or object. Example: >>> import myproject >>> test(myproject.MainClass) ... >>> test(myproject) ... By default, test(obj) could simply run all doctests in docstrings attached to obj. For modules, it could also look for unittest.TestCase instances, and perhaps do some more advanced test discovery. test() could implement some keyword options to control exactly what and what not to do. There could perhaps also be a corresponding __test__ method/function for implementing custom test runners. 
Fredrik From robertc at robertcollins.net Thu Mar 19 11:23:57 2009 From: robertc at robertcollins.net (Robert Collins) Date: Thu, 19 Mar 2009 21:23:57 +1100 Subject: [Python-ideas] Builtin test function In-Reply-To: <3d0cebfb0903190259jd02c918h31da14d4adb0b73b@mail.gmail.com> References: <3d0cebfb0903190259jd02c918h31da14d4adb0b73b@mail.gmail.com> Message-ID: <1237458237.15722.206.camel@lifeless-64> On Thu, 2009-03-19 at 10:59 +0100, Fredrik Johansson wrote: > There's been some discussion about automatic test discovery lately. > Here's a random (not in any way thought through) idea: add a builtin > function test() that runs tests associated with a given function, > class, module, or object. This takes out all of the [useful] configuration for output - parallel testing, distributed testing, testing from an IDE etc. I'd love to see something like bzr's load_tests module scope hook honoured by the default test loader. It makes test discovery compatible with test customisation. I'd be happy to put a patch together. -Rob -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: This is a digitally signed message part URL: From steve at pearwood.info Thu Mar 19 11:48:53 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 19 Mar 2009 21:48:53 +1100 Subject: [Python-ideas] Builtin test function In-Reply-To: <3d0cebfb0903190259jd02c918h31da14d4adb0b73b@mail.gmail.com> References: <3d0cebfb0903190259jd02c918h31da14d4adb0b73b@mail.gmail.com> Message-ID: <200903192148.54461.steve@pearwood.info> On Thu, 19 Mar 2009 08:59:08 pm Fredrik Johansson wrote: > There's been some discussion about automatic test discovery lately. > Here's a random (not in any way thought through) idea: add a builtin > function test() that runs tests associated with a given function, > class, module, or object. Improved testing is always welcome, but why a built-in? 
I know testing is important, but is it so common and important that we need it at our fingertips, so to speak, and can't even import a module first before running tests? What's the benefit to making it a built-in instead of part of a test module? -- Steven D'Aprano From ncoghlan at gmail.com Thu Mar 19 12:16:14 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Mar 2009 21:16:14 +1000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <200903192027.27510.steve@pearwood.info> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49C176C7.7090309@gmail.com> <20090319101220.41e580c7@o> <200903192027.27510.steve@pearwood.info> Message-ID: <49C2297E.302@gmail.com> Steven D'Aprano wrote: > But surely with Raymond and Mark's idea about passing a dict derived > from locale, this is no longer an issue? Are hard-coded separators > still on the table? That's a separate discussion, not part of PEP 377. The comma in PEP 377 is hardcoded, just like the decimal point. If formatting becomes more configurable it will be via a new PEP. What I don't get here is that anyone writing "quick and dirty" scripts that still needed locale-appropriate output for non-developer end users* already couldn't use %-formatting or str.format for the task. The decimal point was wrong and there was no way at all to insert a thousands separator. If it's only a matter of localisation, then the locale module can do the job and the affected developers are probably already using it. If it's a matter of internationalisation, then that involves a lot more than just a comma here and there, and again, affected developers will already be using an appropriate tool. The PEP provides a quick way to make big numbers more readable when the intended audience is either the developer themselves (i.e. debugging messages), or an audience of IT types (e.g. system administrators).
Yes, it is inadequate in many situations for formatting strings for display to non-developer end users - that isn't a new problem, and PEP 377 doesn't make it any worse than it already was. Cheers, Nick. *(Note that such scripts actually sound neither quick nor dirty to me - as soon as you're producing output for non-developers you have to pay far more attention to the formatting and other presentation aspects, whether those readers are native English speakers or not) -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From rdmurray at bitdance.com Thu Mar 19 12:42:55 2009 From: rdmurray at bitdance.com (R. David Murray) Date: Thu, 19 Mar 2009 11:42:55 +0000 (UTC) Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <200903182239.40862.steve@pearwood.info> <49C0E469.2080207@gmail.com> <8FFC56AA656847D6B599C3544F6560EB@RaymondLaptop1> <49C176C7.7090309@gmail.com> <20090319101220.41e580c7@o> Message-ID: spir wrote: > Le Thu, 19 Mar 2009 08:33:43 +1000, Nick Coghlan > s'exprima ainsi: > > > B. All I can suggest is to try to think of the "commas as separators in > > format()" situation as being in the same vein as that whole "let use > > English keywords where possible" idea :) > > This is a wrong rationale. The readers of python keywords is the community of > pythonistas (*); while the readers of documents produced by apps written in > python can be any kind of people. "1,234,567.89" is more or less illegible But the thing currently approved, using ',' to indicate that thousands separators should be used, is _exactly_ like the keyword situation. It's something that the programmer types and reads. Controlling what character actually gets used in the output is a separate issue that still needs to be addressed, to my understanding.
For now, we are defaulting to English, just like usual ;) -- R. David Murray http://www.bitdance.com From eric at trueblade.com Thu Mar 19 12:50:17 2009 From: eric at trueblade.com (Eric Smith) Date: Thu, 19 Mar 2009 07:50:17 -0400 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49C2297E.302@gmail.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49C176C7.7090309@gmail.com> <20090319101220.41e580c7@o> <200903192027.27510.steve@pearwood.info> <49C2297E.302@gmail.com> Message-ID: <49C23179.2090703@trueblade.com> > That's a separate discussion, not part of PEP 377. The comma in PEP 377 > is hardcoded, just like the decimal point. If formatting becomes more > configurable it will be via a new PEP. For the record, it's PEP 378. Eric. From ncoghlan at gmail.com Thu Mar 19 13:04:44 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 19 Mar 2009 22:04:44 +1000 Subject: [Python-ideas] Rough draft: Proposed format specifier for a thousands separator (discussion moved from python-dev) In-Reply-To: <49C23179.2090703@trueblade.com> References: <56EE433347824B66BC1D5FF4E3F47A2C@RaymondLaptop1> <49C176C7.7090309@gmail.com> <20090319101220.41e580c7@o> <200903192027.27510.steve@pearwood.info> <49C2297E.302@gmail.com> <49C23179.2090703@trueblade.com> Message-ID: <49C234DC.7010401@gmail.com> Eric Smith wrote: >> That's a separate discussion, not part of PEP 377. The comma in PEP 377 >> is hardcoded, just like the decimal point. If formatting becomes more >> configurable it will be via a new PEP. > > For the record, it's PEP 378. Sorry about that - got my PEP numbers mixed up (377 is floating around in my brain since I still have to update it with Guido's rejection). Cheers, Nick. 
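For reference, the hard-coded behaviour PEP 378 describes, as it later shipped in Python 2.7 and 3.1, looks like this:

```python
# ',' in the format spec inserts a hard-coded comma as the thousands
# separator; the decimal point stays a hard-coded '.'.
assert format(1234567, ',d') == '1,234,567'
assert format(1234567.891, ',.2f') == '1,234,567.89'
assert '{0:,.2f}'.format(-1234567.891) == '-1,234,567.89'
```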
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From jh at improva.dk Thu Mar 19 15:38:26 2009 From: jh at improva.dk (Jacob Holm) Date: Thu, 19 Mar 2009 15:38:26 +0100 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49AB1F90.7070201@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> Message-ID: <49C258E2.8050505@improva.dk> Hi Greg Greg Ewing wrote: > I've made another couple of tweaks to the formal semantics > (so as not to over-specify when the iterator methods are > looked up). > > Latest version of the PEP, together with the prototype > implementation and other related material, is available > here: > > http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/ > I am working on my own patch, based on rev2 of yours from the above link and the algorithm I have been going on about. It is currently working, and is even slightly faster than yours in every test I have (much faster in some, that was the whole point). I still need to do a bit of cleanup before I throw it to the wolves though... Anyway, I have a few questions/comments to your patch. 1. There is a small refcounting bug in your gen_iternext function. On success, it returns without decref'ing "yf". 2. In the comment for "gen_undelegate" you mention "certain recursive situations" where a generator may lose its frame before we get a chance to clear f_yieldfrom. Can you elaborate? I can't think of any, and haven't been able to catch any with asserts in a debug-build using my own patch. However, if they exist I will need to handle it somehow and knowing what they are would certainly help. 3. It looks like you are not calling "close" properly from "next", "send" and "throw". This makes no difference when delegating to a generator (the missing close would be a no-op), but would be an issue when delegating to a non-generator. 4. 
It looks like your "gen_close" does not try to throw a GeneratorExit before calling close when delegating to a non-generator. I think it should, to match the description of "close" in PEP342 and the expansion in your PEP. Other than that, great work. It would have taken me ages to figure out all the necessary changes to the grammar, parser, ... and so on by myself. In fact I probably wouldn't even have tried. I hope this helps, and promise to publish my own version of the patch once I think it is fit for public consumption. Best regards - Jacob From mrs at mythic-beasts.com Fri Mar 20 00:12:49 2009 From: mrs at mythic-beasts.com (Mark Seaborn) Date: Thu, 19 Mar 2009 23:12:49 +0000 (GMT) Subject: [Python-ideas] CapPython's use of unbound methods In-Reply-To: References: <20090312.202410.846948621.mrs@localhost.localdomain> Message-ID: <20090319.231249.343185657.mrs@localhost.localdomain> Guido van Rossum wrote: > On Thu, Mar 12, 2009 at 1:24 PM, Mark Seaborn wrote: > > Suppose we have an object x with a private attribute, "_field", > > defined by a class Foo: > > > > class Foo(object): > > > > def __init__(self): > > self._field = "secret" > > > > x = Foo() > > Can you add some principals to this example? Who wrote the Foo class > definition? Does CapPython have access to the source code for Foo? To > the class object? OK, suppose we have two principals, Alice and Bob. Alice receives a string from Bob. Alice instantiates the string using CapPython's safe_eval() function, getting back a module object that contains a function object. Alice passes the function an object x. Alice's intention is that the function should not be able to get hold of the contents of x._field, no matter what string Bob supplies.
To make this more concrete, this is what Alice executes, with source_from_bob defined in a string literal for the sake of example: source_from_bob = """ class C: def f(self): return self._field def entry_point(x): C.f(x) # potentially gets the secret object in Python 3.0 """ import safeeval secret = object() class Foo(object): def __init__(self): self._field = secret x = Foo() module = safeeval.safe_eval(source_from_bob, safeeval.Environment()) module.entry_point(x) In this example, Bob's code is not given access to the class object Foo. Furthermore, Bob should not be able to get access to the class Foo from the instance x. The type() builtin is not considered to be safe in CapPython so it is not included in the default environment. Bob's code is not given access to the source code for class Foo. But even if Bob is aware of Alice's source code, it should not affect whether Bob can get hold of the secret object. By the way, you can try out the example by getting the code from the Bazaar repository: bzr branch http://bazaar.launchpad.net/%7Emrs/cappython/trunk cappython > > However, in Python 3.0, the CapPython code can do this: > > > > class C(object): > > > > ? ?def f(self): > > ? ? ? ?return self._field > > > > C.f(x) # returns "secret" > > > > Whereas in Python 2.x, C.f(x) would raise a TypeError, because C.f is > > not being called on an instance of C. > > In Python 2.x I could write > > class C(Foo): > def f(self): > return self._field In the example above, Bob's code is not given access to Foo, so Bob cannot do this. But you are right, if Bob's code were passed Foo as well as x, Bob could do this. Suppose Alice wanted to give Bob access to class Foo, perhaps so that Bob could create derived classes. It is still possible for Alice to do that safely, if Alice defines Foo differently. 
Alice can pass the secret object to Foo's constructor instead of having the class definition get its reference to the secret object from an enclosing scope:

class Foo(object):
    def __init__(self, arg):
        self._field = arg

secret = object()
x = Foo(secret)
module = safeeval.safe_eval(source_from_bob, safeeval.Environment())
module.entry_point(x, Foo)

Bob can create his own objects derived from Foo, but cannot use his access to Foo to break encapsulation of instance x. Foo is now authorityless, in the sense that it does not capture "secret" from its enclosing environment, unlike the previous definition.

> or alternatively
>
> class C(x.__class__):
>

The verifier would reject x.__class__, so this is not possible.

> > Guido said, "I don't understand where the function object f gets its
> > magic powers".
> >
> > The answer is that function definitions directly inside class
> > statements are treated specially by the verifier.
>
> Hm, this sounds like a major change in language semantics, and if I
> were Sun I'd sue you for using the name "Python" in your product. :-)

Damn, the makers of Typed Lambda Calculus had better watch out for legal action from the makers of Lambda Calculus(tm) too... :-) Is it really a major change in semantics if it's just a subset? ;-) To some extent the verifier's check of only accessing private attributes through self is just checking a coding style that I already follow when writing Python code (except sometimes for writing test cases). Of course some of the verifier's checks, such as only allowing attribute assignments through self, are a lot more draconian than coding style checks.

> > If you wrote the same function definition at the top level:
> >
> > def f(var):
> >     return var._field # rejected
> >
> > the attribute access would be rejected by the verifier, because "var"
> > is not a self variable, and private attributes may only be accessed
> > through self variables.
> > > > I renamed the variable in the example,
>
> What do you mean by this?

I just mean that I applied alpha conversion.

def f(self):
    return self._field

is equivalent to

def f(var):
    return var._field

Whether these function definitions are accepted by the verifier depends on their context.

> Do you also catch things like
>
> g = getattr
> s = 'field'.replace('f', '_f')
>
> print g(x, s)
>
> ?

The default environment doesn't provide the real getattr() function. It provides a wrapped version that rejects private attribute names.

Mark

From greg.ewing at canterbury.ac.nz Fri Mar 20 01:23:48 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Mar 2009 12:23:48 +1200 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C258E2.8050505@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> Message-ID: <49C2E214.1040003@canterbury.ac.nz> Jacob Holm wrote:

> 1. There is a small refcounting bug in your gen_iternext function. On
> success, it returns without decref'ing "yf".

Thanks, I'll fix that.

> 2. In the comment for "gen_undelegate" you mention "certain recursive
> situations" where a generator may lose its frame before we get a
> chance to clear f_yieldfrom. Can you elaborate?

I can't remember the details, but I definitely ran into one during development, which is why I added that function. Have you tried running all of my tests?

> 3. It looks like you are not calling "close" properly from "next",
> "send" and "throw".

I'm not sure what you mean by that. Can you provide an example that doesn't behave as expected?

> 4. It looks like your "gen_close" does not try to throw a
> GeneratorExit before calling close when delegating to a
> non-generator.

I'm not sure what you mean here either. Regardless of the type of sub-iterator, it should end up getting to the part which does

if (!PyErr_Occurred())
    PyErr_SetNone(PyExc_GeneratorExit);

Again, an example that doesn't behave properly would help.
-- Greg From jh at improva.dk Fri Mar 20 02:04:42 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 20 Mar 2009 02:04:42 +0100 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C2E214.1040003@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> Message-ID: <49C2EBAA.9020106@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: > >> 2. In the comment for "gen_undelegate" you mention "certain recursive >> situations" where a generator may lose its frame before we get a >> chance to clear f_yieldfrom. Can you elaborate? > > I can't remember the details, but I definitely ran into one > during development, which is why I added that function. Have > you tried running all of my tests? Yup. All tests pass, except for your test19 where my traceback is different. > --- expected/test19.py.out 2009-02-22 09:51:26.000000000 +0100 +++ > actual/test19.py.out 2009-03-20 01:50:28.000000000 +0100 @@ -7,8 +7,8 > @@ Traceback (most recent call last): File "test19.py", line 20, in > for y in gi: - File "test19.py", line 16, in g2 - yield from > gi File "test19.py", line 9, in g1 yield from g2() + File "test19.py", > line 16, in g2 + yield from gi ValueError: generator already executing I am not quite sure why that is, but I actually think mine is better. >> 3. It looks like you are not calling "close" properly from "next", >> "send" and "throw". > > I'm not sure what you mean by that. Can you provide an > example that doesn't behave as expected? Sure, see below. >> 4. It looks like your "gen_close" does not try to throw a >> GeneratorExit before calling close when delegating to a >> non-generator. > > I'm not sure what you mean here either. Regardless of the > type of sub-iterator, it should end up getting to the > part which does > > if (!PyErr_Occurred()) > PyErr_SetNone(PyExc_GeneratorExit); > > Again, and example that doesn't behave properly would > help. > Of course. 
Here is a demonstration/test...

class iterator(object):
    """Simple iterator that counts to n while writing what is done to it"""
    def __init__(self, n):
        self.ctr = iter(xrange(n))
    def __iter__(self):
        return self
    def close(self):
        print "Close"
    def next(self):
        print "Next"
        return self.ctr.next()
    def send(self, val):
        print "Send", val
        return self.ctr.next()
    def throw(self, *args):
        print "Throw:", args
        return self.ctr.next()

def generator(n):
    yield from iterator(n)

g = generator(1)
g.next()
try:
    g.next()
except Exception, e:
    print type(e)
else:
    print 'No exception'
del g
print '--'

g = generator(1)
g.next()
try:
    g.send(1)
except Exception, e:
    print type(e)
else:
    print 'No exception'
del g
print '--'

g = generator(1)
g.next()
try:
    g.throw(ValueError)
except Exception, e:
    print type(e)
else:
    print 'No exception'
del g
print '--'

g = generator(2)
g.next()
try:
    g.next()
except Exception, e:
    print type(e)
else:
    print 'No exception'
del g
print '--'

g = generator(2)
g.next()
try:
    g.send(1)
except Exception, e:
    print type(e)
else:
    print 'No exception'
del g
print '--'

g = generator(2)
g.next()
try:
    g.throw(ValueError)
except Exception, e:
    print type(e)
else:
    print 'No exception'
del g
print '--'

And here is the output I would expect based on the relevant PEPs.

Next
Next
Close
<type 'exceptions.StopIteration'>
--
Next
Send 1
Close
<type 'exceptions.StopIteration'>
--
Next
Throw: (<type 'exceptions.ValueError'>,)
Close
<type 'exceptions.StopIteration'>
--
Next
Next
No exception
Throw: (<type 'exceptions.GeneratorExit'>,)
Close
--
Next
Send 1
No exception
Throw: (<type 'exceptions.GeneratorExit'>,)
Close
--
Next
Throw: (<type 'exceptions.ValueError'>,)
No exception
Throw: (<type 'exceptions.GeneratorExit'>,)
Close
--

However, when I run this using your patch, the first 3 "Close" messages, and the 3 "GeneratorExit" messages are missing. Did that help?
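(Editorial note: the close-in-a-finally behaviour Jacob expects can be sketched in pure Python. The `delegate` generator below is a hypothetical stand-in for the yield-from expansion, not Greg's patch; it is spelled in modern Python 3 so it runs as-is, and `Counter` replaces the printing iterator with a logging one.)

```python
def delegate(it):
    # Rough sketch of delegation to a sub-iterator: however the block
    # is left (exhaustion, an exception, GeneratorExit), the finally
    # clause closes the sub-iterator, which is why a "Close" is expected.
    try:
        while True:
            try:
                v = next(it)
            except StopIteration:
                return          # sub-iterator exhausted: delegation ends
            yield v
    finally:
        close = getattr(it, 'close', None)
        if close is not None:
            close()

class Counter:
    """Counts to n while logging what is done to it."""
    def __init__(self, n):
        self.it = iter(range(n))
        self.log = []
    def __iter__(self):
        return self
    def __next__(self):
        self.log.append('Next')
        return next(self.it)
    def close(self):
        self.log.append('Close')

def gen(c):
    for v in delegate(c):
        yield v

c = Counter(1)
g = gen(c)
print(next(g))      # 0
try:
    next(g)         # Counter is exhausted; finally closes it first
except StopIteration:
    pass
print(c.log)        # ['Next', 'Next', 'Close']
```

The point is that `delegate` leaves its try block in every case, so the sub-iterator's close() is reached before the StopIteration ever gets back to the caller.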
- Jacob From jh at improva.dk Fri Mar 20 02:07:34 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 20 Mar 2009 02:07:34 +0100 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C2EBAA.9020106@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> Message-ID: <49C2EC56.4020704@improva.dk> Sorry about the garbled diff... Here is the real diff between expected and actual output when I run my patch on test19. - Jacob --- expected/test19.py.out 2009-02-22 09:51:26.000000000 +0100 +++ actual/test19.py.out 2009-03-20 02:06:52.000000000 +0100 @@ -7,8 +7,8 @@ Traceback (most recent call last): File "test19.py", line 20, in for y in gi: - File "test19.py", line 16, in g2 - yield from gi File "test19.py", line 9, in g1 yield from g2() + File "test19.py", line 16, in g2 + yield from gi ValueError: generator already executing From greg.ewing at canterbury.ac.nz Fri Mar 20 02:33:47 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Mar 2009 13:33:47 +1200 Subject: [Python-ideas] Revision to yield-from implementation Message-ID: <49C2F27B.1020102@canterbury.ac.nz> I have uploaded a small revision to my prototype implementation of the yield-from statement to fix a small refcounting bug. http://www.cosc.canterbury.ac.nz/greg.ewing/python/yield-from/ -- Greg From greg.ewing at canterbury.ac.nz Fri Mar 20 05:09:13 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 20 Mar 2009 17:09:13 +1300 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C2EBAA.9020106@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> Message-ID: <49C316E9.1090103@canterbury.ac.nz> Jacob Holm wrote: > Of course. Here is a demonstration/test... 
> > However, when I run this using your patch, the first 3 "Close" messages,
> and the 3 "GeneratorExit" messages are missing.

I don't understand why you expect to get the output you present. Can you explain your reasoning with reference to the relevant sections of the relevant PEPs that you mention?

-- Greg

From jh at improva.dk Fri Mar 20 10:33:58 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 20 Mar 2009 10:33:58 +0100 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C316E9.1090103@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> Message-ID: <49C36306.4040002@improva.dk> Greg Ewing wrote:

> Jacob Holm wrote:
>
>> Of course. Here is a demonstration/test...
>>
>> However, when I run this using your patch, the first 3 "Close"
>> messages, and the 3 "GeneratorExit" messages are missing.
>
> I don't understand why you expect to get the output
> you present. Can you explain your reasoning with
> reference to the relevant sections of the relevant
> PEPs that you mention?
>

Starting with "Close". The only reason I expect *any* "Close" message is that the expansion in your PEP explicitly calls close in the finally clause. It makes no distinction between different ways of exiting the block, so I'd expect one call for each time it is exited.

The "GeneratorExit", I expect due to the description of close in PEP 342:

def close(self):
    try:
        self.throw(GeneratorExit)
    except (GeneratorExit, StopIteration):
        pass
    else:
        raise RuntimeError("generator ignored GeneratorExit")

When the generator is closed (due to the del g lines in the example), this says to throw a GeneratorExit and handle the result. If we do this manually, the throw will be delegated to the iterator, which will print the "Throw: (<type 'exceptions.GeneratorExit'>,)" message. Do I make sense yet?
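(Editorial note: the PEP 342 behaviour described above is easy to check against any generator that supports close(); a minimal sketch with invented generator names `squeaky` and `stubborn`, in Python 3 syntax:)

```python
def squeaky(log):
    # cooperative generator: notes the GeneratorExit and re-raises it,
    # so close() succeeds silently
    try:
        yield 1
    except GeneratorExit:
        log.append('GeneratorExit')
        raise

log = []
g = squeaky(log)
next(g)        # run up to the yield
g.close()      # per PEP 342, equivalent to throwing GeneratorExit
print(log)     # ['GeneratorExit']

def stubborn():
    # swallows GeneratorExit and keeps yielding, so close() must
    # raise RuntimeError("generator ignored GeneratorExit")
    while True:
        try:
            yield 1
        except GeneratorExit:
            pass

s = stubborn()
next(s)
try:
    s.close()
    ignored = False
except RuntimeError:
    ignored = True
print(ignored)   # True
```

This is exactly the throw-then-check sequence the quoted close() definition spells out.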
- Jacob From gerald.britton at gmail.com Fri Mar 20 14:54:09 2009 From: gerald.britton at gmail.com (Gerald Britton) Date: Fri, 20 Mar 2009 09:54:09 -0400 Subject: [Python-ideas] Interactive trace function Message-ID: <5d1a32000903200654y58670e8dja7340bb0757c276f@mail.gmail.com> Years ago I worked for a while in REXX. It was a fairly advanced scripting language for its time and we used it on some substantial mainframe projects. One thing that I liked and miss in Python is the "trace" statement. Basically, you could insert a "trace" statement in a REXX program and then, when you ran it, it would step through the program a line at a time. This gave you the chance to follow your code through tricky sections and display variables after each step. I know it saved me hours of troubleshooting of tricky problems. I would like to know if anyone has ever proposed something similar for Python. It would work something like this: 1. In an interactive session, you could issue a trace command. Thereafter, whatever you ran would be done a step at a time, with a terminal prompt after every statement for you to print anything you like that would help you understand the state of the program at that point. 2. From the command line, you could add a --trace option, or something like it, to ask Python to launch the program interactively with trace enabled, which would work as described above. 3. If you have a problematic piece of code, you could insert a trace statement just before the troubled section. Then when you ran the program, when it came to the trace statement, it would begin an interactive trace at that point as described above. (You would have to start your program from the command line for this to make sense.) Has something like this ever come up before? Is there a way to do this today? 
-- Gerald Britton From jh at improva.dk Fri Mar 20 15:26:23 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 20 Mar 2009 15:26:23 +0100 Subject: [Python-ideas] Interactive trace function In-Reply-To: <5d1a32000903200654y58670e8dja7340bb0757c276f@mail.gmail.com> References: <5d1a32000903200654y58670e8dja7340bb0757c276f@mail.gmail.com> Message-ID: <49C3A78F.3040707@improva.dk> Gerald Britton wrote: > Years ago I worked for a while in REXX. It was a fairly advanced > scripting language for its time and we used it on some substantial > mainframe projects. One thing that I liked and miss in Python is the > "trace" statement. > > Basically, you could insert a "trace" statement in a REXX program and > then, when you ran it, it would step through the program a line at a > time. This gave you the chance to follow your code through tricky > sections and display variables after each step. I know it saved me > hours of troubleshooting of tricky problems. > > I would like to know if anyone has ever proposed something similar for > Python. It would work something like this: > > 1. In an interactive session, you could issue a trace command. > Thereafter, whatever you ran would be done a step at a time, with a > terminal prompt after every statement for you to print anything you > like that would help you understand the state of the program at that > point. > > 2. From the command line, you could add a --trace option, or something > like it, to ask Python to launch the program interactively with trace > enabled, which would work as described above. > > 3. If you have a problematic piece of code, you could insert a trace > statement just before the troubled section. Then when you ran the > program, when it came to the trace statement, it would begin an > interactive trace at that point as described above. (You would have to > start your program from the command line for this to make sense.) > > Has something like this ever come up before? Is there a way to do this today? 
>
>
>

How about:

import pdb; pdb.set_trace()

I use that for debugging all the time...

- Jacob

From paul.bedaride at gmail.com Fri Mar 20 15:42:51 2009 From: paul.bedaride at gmail.com (paul bedaride) Date: Fri, 20 Mar 2009 15:42:51 +0100 Subject: [Python-ideas] [Python-Dev] Proposal: new list function: pack In-Reply-To: References: Message-ID:

Yes, it's true that you can easily do the pack part with zip(*[iter(l)]*size), and the slicing with zip(*[l[i:len(l)-(slice-1-i)] for i in range(slice)]). You could also do both at once, but you get something more complicated. It's also true that with izip you could get an iterator. I use this pack function a lot in my code, and it's more readable than the zip version. The question is whether people really use this kind of function on lists, or if it's just me (which is entirely possible).

paul bedaride

On Fri, Mar 20, 2009 at 3:07 PM, Isaac Morland wrote:
> On Fri, 20 Mar 2009, paul bedaride wrote:
>
>> I propose a new function for list for pack values of a list and
>> sliding over them:
>>
>> then we can do things like this:
>> for i, j, k in pack(range(10), 3, partialend=False):
>>     print i, j, k
>>
>> I propose this because i need a lot of times pack and slide function
>> over list and this one
>> combine the two in a generator way.
>
> See the Python documentation for zip():
>
> http://docs.python.org/library/functions.html#zip
>
> And this article in which somebody independently rediscovers the idea:
>
> http://drj11.wordpress.com/2009/01/28/my-python-dream-about-groups/
>
> Summary: except for the "partialend" parameter, this can already be done in
> a single line. It is not for me to say whether this nevertheless would be
> useful as a library routine (if only perhaps to make it easy to specify
> "partialend" explicitly).
>
> It seems to me that sometimes one would want izip instead of zip.
> And I
> think you could get the effect of partialend=True in 2.6 by using
> izip_longest (except with an iterator result rather than a list).
>
>> def pack(l, size=2, slide=2, partialend=True):
>>     lenght = len(l)
>>     for p in range(0,lenght-size,slide):
>>         def packet():
>>             for i in range(size):
>>                 yield l[p+i]
>>         yield packet()
>>     p = p + slide
>>     if partialend or lenght-p == size:
>>         def packet():
>>             for i in range(lenght-p):
>>                 yield l[p+i]
>>         yield packet()
>
> Isaac Morland                  CSCF Web Guru
> DC 2554C, x36650               WWW Software Specialist
>

From tom at vector-seven.com Fri Mar 20 15:38:39 2009 From: tom at vector-seven.com (Thomas Lee) Date: Sat, 21 Mar 2009 01:38:39 +1100 Subject: [Python-ideas] A read-only, dict-like optparse.Value Message-ID: <49C3AA6F.3080908@vector-seven.com>

Hi folks,

Would anybody support the idea of read-only dict-like behaviour of "options" for the following code:

====

from optparse import OptionParser
parser = OptionParser()
parser.add_option("--host", dest="host", default="localhost")
parser.add_option("--port", dest="port", default=1234)
parser.add_option("--path", dest="path", default="/tmp")
options, args = parser.parse_args()

====

As it is, you have to "know" what possible attributes are present on the options (effectively the set of "dest" attributes) -- I often implement something like the following because recently I've had to use command line options in a bunch of format strings:

def make_options_dict(options):
    known_options = ("host", "port", "path")
    return dict(zip(known_options, [getattr(options, attr) for attr in known_options]))

I don't mind having to do this, but having to hard code the options in there feels a bit nasty. Just as useful for my particular use case (i.e. passing the options "dict" to a format string) would be something along the lines of options.todict() or dict(options).
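(Editorial note: the attribute set is already discoverable at runtime, because optparse stores each "dest" as a plain instance attribute on the Values object; a short sketch using the builtin vars(), based on the options defined above:)

```python
from optparse import OptionParser

parser = OptionParser()
parser.add_option("--host", dest="host", default="localhost")
parser.add_option("--port", dest="port", default=1234)
parser.add_option("--path", dest="path", default="/tmp")
options, args = parser.parse_args([])   # empty argv so the defaults apply

# optparse.Values keeps every "dest" in its instance __dict__, so the
# generic vars() builtin already gives the dict view being asked for:
opts = vars(options)
formatted = "%(host)s:%(port)s%(path)s" % opts
print(formatted)    # localhost:1234/tmp
```

This covers the format-string use case without hard-coding the option names anywhere.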
Even a way to know the set of "dest" attributes that are defined on "options" would be cleaner. e.g. options_dict = dict(zip(options.all(), [getattr(options, attr) for attr in options.all()])) Where options.all() returns all the option "dest" attribute names. Or something to that effect. Any thoughts? Cheers, T From rdmurray at bitdance.com Fri Mar 20 17:55:04 2009 From: rdmurray at bitdance.com (R. David Murray) Date: Fri, 20 Mar 2009 16:55:04 +0000 (UTC) Subject: [Python-ideas] A read-only, dict-like optparse.Value References: <49C3AA6F.3080908@vector-seven.com> Message-ID: Thomas Lee wrote: > Hi folks, > > Would anybody support the idea of read-only dict-like behaviour of > "options" for the following code: > > ==== > > from optparse import OptionParser > parser = OptionParser() > parser.add_option("--host", dest="host" default="localhost") > parser.add_option("--port", dest="port", default=1234) > parser.add_option("--path", dest="path", default="/tmp") > options, args = parser.parse_args() > > ==== I presume you know that dest is redundant there? I ask because you wanted to avoid retyping the option names later :) > As it is, you have to "know" what possible attributes are present on the > options (effectively the set of "dest" attributes) -- I often implement > something like the following because recently I've had to use command > line options in a bunch of format strings: > > def make_options_dict(options): > known_options = ("host", "port", "path") > return dict(zip(known_options, [getattr(options, attr) for attr in > known_options])) > > I don't mind having to do this, but having to hard code the options in > there feels a bit nasty. Just as useful for my particular use case (i.e > passing the options "dict" to a format string) would be something along > the lines of options.todict() or dict(options). Even a way to know the > set of "dest" attributes that are defined on "options" would be cleaner. > e.g. 
Well, given the implementation of optparse, you could do: options.__dict__.items() But exposing the full dictionary interface on options strikes me as a reasonable idea. I don't see any particular reason to make it read-only, either. (NB: The Values class has some...interesting...methods that I wasn't aware of that look somewhat intriguing. And they aren't read-only.) > options_dict = dict(zip(options.all(), [getattr(options, attr) for attr > in options.all()])) > > Where options.all() returns all the option "dest" attribute names. > > Or something to that effect. > > Any thoughts? I don't see any reason not to just duck type the options object as dictionary-like and use the normal dictionary method names to access the information you want. Off the cuff it seems like a good idea to expose an interface to this information. Hmm. Then I could do globals().update(options.items()), which would simplify some of my code :) (Whether or not that is a good idea is a different question!) -- R. David Murray http://www.bitdance.com From python at rcn.com Fri Mar 20 18:01:15 2009 From: python at rcn.com (Raymond Hettinger) Date: Fri, 20 Mar 2009 10:01:15 -0700 Subject: [Python-ideas] [Python-Dev] Proposal: new list function: pack References: Message-ID: <9D6E7ADA1C5A475B8B9BBA01C3F94447@RaymondLaptop1> >> I propose a new function for list for pack values of a list and >> sliding over them: >> >> then we can do things like this: >> for i, j, k in pack(range(10), 3, partialend=False): >> print i, j, k . . . >> def pack(l, size=2, slide=2, partialend=True): >> lenght = len(l) >> for p in range(0,lenght-size,slide): >> def packet(): >> for i in range(size): >> yield l[p+i] >> yield packet() >> p = p + slide >> if partialend or lenght-p == size: >> def packet(): >> for i in range(lenght-p): >> yield l[p+i] >> yield packet() This has been discussed before and rejected. There were several considerations. 
The itertools recipes already include simple patterns for grouper() and pairwise() that are easy to use as primitives in your code or to serve as models for variants.

The design of pack() itself is questionable. It attempts to be a Swiss Army Knife by parameterizing all possible variations (length of window, length to slide, and how to handle end-cases). This design makes the tool harder to learn and use, and it makes the implementation more complex. That complexity isn't necessary. Use cases would typically fall into grouper cases where the window length equals the slide length or into cases that slide one element at a time. You don't win anything by combining the two cases except for making the tool harder to learn and use.

The pairwise() recipe could be generalized to larger windows, but seemed like less of a good idea after closely examining potential use cases. For cases that used a larger window, there always seemed to be a better solution than extending pairwise(). For instance, a twenty-day moving average is better implemented with a deque(maxlen=20) and a running sum than with an iterator returning tuples of length twenty -- that approach does a lot of unnecessary work shifting elements in the tuple, turning an O(n) process into an O(m*n) process.

For short windows, like pairwise() itself, the issue is not one of total running time; instead, the problem is that almost every proposed use case was better coded as a simple Python loop, saving previous values with a step like: oldvalue = value. Having pairwise() or tripletwise() tended to be a distraction away from better solutions. Also, the pure python approach was more general as it allowed accumulations: total += value.

While your proposed function has been re-invented a number of times, that doesn't mean it's a good idea. It is more an exercise in what can be done, not in what should be done.

Raymond

From rdmurray at bitdance.com Fri Mar 20 18:02:27 2009 From: rdmurray at bitdance.com (R.
David Murray) Date: Fri, 20 Mar 2009 17:02:27 +0000 (UTC) Subject: [Python-ideas] Interactive trace function References: <5d1a32000903200654y58670e8dja7340bb0757c276f@mail.gmail.com> <49C3A78F.3040707@improva.dk> Message-ID: Jacob Holm wrote: > Gerald Britton wrote: > > Years ago I worked for a while in REXX. It was a fairly advanced > > scripting language for its time and we used it on some substantial > > mainframe projects. One thing that I liked and miss in Python is the > > "trace" statement. > > > > Basically, you could insert a "trace" statement in a REXX program and > > then, when you ran it, it would step through the program a line at a > > time. This gave you the chance to follow your code through tricky > > sections and display variables after each step. I know it saved me > > hours of troubleshooting of tricky problems. > > > > I would like to know if anyone has ever proposed something similar for > > Python. It would work something like this: > > > > 1. In an interactive session, you could issue a trace command. > > Thereafter, whatever you ran would be done a step at a time, with a > > terminal prompt after every statement for you to print anything you > > like that would help you understand the state of the program at that > > point. > > > > 2. From the command line, you could add a --trace option, or something > > like it, to ask Python to launch the program interactively with trace > > enabled, which would work as described above. > > > > 3. If you have a problematic piece of code, you could insert a trace > > statement just before the troubled section. Then when you ran the > > program, when it came to the trace statement, it would begin an > > interactive trace at that point as described above. (You would have to > > start your program from the command line for this to make sense.) > > > > Has something like this ever come up before? Is there a way to do this today? 
>> >
>> >
>> >
>> How about:
>>
>> import pdb; pdb.set_trace()
>>
>> I use that for debugging all the time...
>
> I just learned about this one, which is also sometimes useful (when you
> _don't_ want the interactive prompt, you just want to see the sequence
> of execution):
>
>     python -m trace -t
>
> --
> R. David Murray           http://www.bitdance.com

From paul.bedaride at gmail.com Fri Mar 20 18:32:32 2009 From: paul.bedaride at gmail.com (paul bedaride) Date: Fri, 20 Mar 2009 18:32:32 +0100 Subject: [Python-ideas] [Python-Dev] Proposal: new list function: pack In-Reply-To: <9D6E7ADA1C5A475B8B9BBA01C3F94447@RaymondLaptop1> References: <9D6E7ADA1C5A475B8B9BBA01C3F94447@RaymondLaptop1> Message-ID:

Now that I have discovered itertools I think you are right, but maybe the pack function could be renamed iwinslice (which is, after all, its real name) and added to itertools?

paul bedaride

On Fri, Mar 20, 2009 at 6:01 PM, Raymond Hettinger wrote:
>
>>> I propose a new function for list for pack values of a list and
>>> sliding over them:
>>>
>>> then we can do things like this:
>>> for i, j, k in pack(range(10), 3, partialend=False):
>>>     print i, j, k
>
> . . .
>>>
>>> def pack(l, size=2, slide=2, partialend=True):
>>>     lenght = len(l)
>>>     for p in range(0,lenght-size,slide):
>>>         def packet():
>>>             for i in range(size):
>>>                 yield l[p+i]
>>>         yield packet()
>>>     p = p + slide
>>>     if partialend or lenght-p == size:
>>>         def packet():
>>>             for i in range(lenght-p):
>>>                 yield l[p+i]
>>>         yield packet()
>
> This has been discussed before and rejected.
>
> There were several considerations. The itertools recipes already include
> simple patterns for grouper() and pairwise() that are easy
> to use as primitives in your code or to serve as models for variants.
> The design of pack() itself is questionable. It attempts to be a Swiss Army
> Knife by parameterizing all possible variations
> (length of window, length to slide, and how to handle end-cases).
> This design makes the tool harder to learn and use, and it makes
> the implementation more complex.
> That complexity isn't necessary. Use cases would typically fall
> into grouper cases where the window length equals the slide
> length or into cases that slide one element at a time. You don't
> win anything by combining the two cases except for making
> the tool harder to learn and use.
>
> The pairwise() recipe could be generalized to larger windows,
> but seemed like less of a good idea after closely examining potential
> use cases. For cases that used a larger window, there always
> seemed to be a better solution than extending pairwise(). For
> instance, a twenty-day moving average is better implemented with
> a deque(maxlen=20) and a running sum than with an iterator
> returning tuples of length twenty -- that approach does a lot of
> unnecessary work shifting elements in the tuple, turning an
> O(n) process into an O(m*n) process.
>
> For short windows, like pairwise() itself, the issue is not one of
> total running time; instead, the problem is that almost every proposed use
> case was better coded as a simple Python loop,
> saving previous values with a step like: oldvalue = value.
> Having pairwise() or tripletwise() tended to be a distraction away
> from better solutions. Also, the pure python approach was more
> general as it allowed accumulations: total += value.
> While your proposed function has been re-invented a number of
> times, that doesn't mean it's a good idea. It is more an exercise
> in what can be done, not in what should be done.
> > > Raymond > From gerald.britton at gmail.com Fri Mar 20 18:53:36 2009 From: gerald.britton at gmail.com (Gerald Britton) Date: Fri, 20 Mar 2009 13:53:36 -0400 Subject: [Python-ideas] Interactive trace function In-Reply-To: References: <5d1a32000903200654y58670e8dja7340bb0757c276f@mail.gmail.com> <49C3A78F.3040707@improva.dk> Message-ID: <5d1a32000903201053y752849edw96397d9e74b7dba1@mail.gmail.com> Thanks for the tip! btw, did you get the --ignore-dir option to work? I'm trying to run this on a large system (hundreds of modules) and ignore most of the modules and packages to focus on just a few. However, it processes all of them anyway On Fri, Mar 20, 2009 at 1:02 PM, R. David Murray wrote: > Jacob Holm wrote: >> Gerald Britton wrote: >> > Years ago I worked for a while in REXX. ?It was a fairly advanced >> > scripting language for its time and we used it on some substantial >> > mainframe projects. ?One thing that I liked and miss in Python is the >> > "trace" statement. >> > >> > Basically, you could insert a "trace" statement in a REXX program and >> > then, when you ran it, it would step through the program a line at a >> > time. ?This gave you the chance to follow your code through tricky >> > sections and display variables after each step. ?I know it saved me >> > hours of troubleshooting of tricky problems. >> > >> > I would like to know if anyone has ever proposed something similar for >> > Python. ?It would work something like this: >> > >> > 1. In an interactive session, you could issue a trace command. >> > Thereafter, whatever you ran would be done a step at a time, with a >> > terminal prompt after every statement for you to print anything you >> > like that would help you understand the state of the program at that >> > point. >> > >> > 2. From the command line, you could add a --trace option, or something >> > like it, to ask Python to launch the program interactively with trace >> > enabled, which would work as described above. >> > >> > 3. 
If you have a problematic piece of code, you could insert a trace >> > statement just before the troubled section. Then when you ran the >> > program, when it came to the trace statement, it would begin an >> > interactive trace at that point as described above. (You would have to >> > start your program from the command line for this to make sense.) >> > >> > Has something like this ever come up before? Is there a way to do this today? >> > >> > >> > >> How about: >> >> import pdb; pdb.set_trace() >> >> I use that for debugging all the time... > > I just learned about this one, which is also sometimes useful (when you > _don't_ want the interactive prompt, you just want to see the sequence > of execution): > > python -m trace -t > > -- > R. David Murray           http://www.bitdance.com > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Gerald Britton From rdmurray at bitdance.com Fri Mar 20 19:02:50 2009 From: rdmurray at bitdance.com (R. David Murray) Date: Fri, 20 Mar 2009 14:02:50 -0400 (EDT) Subject: [Python-ideas] Interactive trace function In-Reply-To: <5d1a32000903201053y752849edw96397d9e74b7dba1@mail.gmail.com> References: <5d1a32000903200654y58670e8dja7340bb0757c276f@mail.gmail.com> <49C3A78F.3040707@improva.dk> <5d1a32000903201053y752849edw96397d9e74b7dba1@mail.gmail.com> Message-ID: On Fri, 20 Mar 2009 at 13:53, Gerald Britton wrote: > On Fri, Mar 20, 2009 at 1:02 PM, R. David Murray wrote: >> I just learned about this one, which is also sometimes useful (when you >> _don't_ want the interactive prompt, you just want to see the sequence >> of execution): >> >> python -m trace -t > > Thanks for the tip! btw, did you get the --ignore-dir option to work? > I'm trying to run this on a large system (hundreds of modules) and > ignore most of the modules and packages to focus on just a few.
> However, it processes all of them anyway [top posting fixed for clarity] Haven't tried that option, I've only used it for small programs so far. -- R. David Murray http://www.bitdance.com From josiah.carlson at gmail.com Fri Mar 20 20:09:11 2009 From: josiah.carlson at gmail.com (Josiah Carlson) Date: Fri, 20 Mar 2009 12:09:11 -0700 Subject: [Python-ideas] [Python-Dev] Proposal: new list function: pack In-Reply-To: References: <9D6E7ADA1C5A475B8B9BBA01C3F94447@RaymondLaptop1> Message-ID: iwinslice() is just as bad of a name as any of the others. I have seen the equivalent of window(iterator, size=2, step=1), which works as you would expect (both as the output, as well as the implementation), with size and step both limited to 5 (because if you are doing things with more than 5 items at a time...you probably really want something else, and in certain cases, you can use multiple window calls to compose larger groups). I'd be a -0 on the feature, because as Raymond says, it's trivial to implement with a deque. And as I've said before, not all x-line functions should be built-in. - Josiah On Fri, Mar 20, 2009 at 10:32 AM, paul bedaride wrote: > Now I discover itertools I think you are right, but maybe the pack > function could be > renamed iwinslice (at the end it's its real name), and add it to itertools ?? > > paul bedaride > > On Fri, Mar 20, 2009 at 6:01 PM, Raymond Hettinger wrote: >> >>>> I propose a new function for list for pack values of a list and >>>> sliding over them: >>>> >>>> then we can do things like this: >>>> for i, j, k in pack(range(10), 3, partialend=False): >>>> print i, j, k >> >> . . .
>>>> >>>> def pack(l, size=2, slide=2, partialend=True): >>>> lenght = len(l) >>>> for p in range(0,lenght-size,slide): >>>> def packet(): >>>> for i in range(size): >>>> yield l[p+i] >>>> yield packet() >>>> p = p + slide >>>> if partialend or lenght-p == size: >>>> def packet(): >>>> for i in range(lenght-p): >>>> yield l[p+i] >>>> yield packet() >> >> This has been discussed before and rejected. >> >> There were several considerations. The itertools recipes already include >> simple patterns for grouper() and pairwise() that are easy >> to use as primitives in your code or to serve as models for variants. >> The design of pack() itself is questionable. It attempts to be a Swiss Army >> Knife by parameterizing all possible variations >> (length of window, length to slide, and how to handle end-cases). >> This design makes the tool harder to learn and use, and it makes >> the implementation more complex. >> That complexity isn't necessary. Use cases would typically fall >> into grouper cases where the window length equals the slide >> length or into cases that slide one element at a time. You don't >> win anything by combining the two cases except for making >> the tool harder to learn and use. >> >> The pairwise() recipe could be generalized to larger windows, >> but seemed like less of a good idea after closely examining potential >> use cases. For cases that used a larger window, there always >> seemed to be a better solution than extending pairwise(). For >> instance, a twenty-day moving average is better implemented with >> a deque(maxlen=20) and a running sum than with an iterator >> returning tuples of length twenty -- that approach does a lot of >> unnecessary work shifting elements in the tuple, turning an >> O(n) process into an O(m*n) process.
>> >> For short windows, like pairwise() itself, the issue is not one of >> total running time; instead, the problem is that almost every proposed use >> case was better coded as a simple Python loop, >> saving the previous values with a step like: oldvalue = value. >> Having pairwise() or tripletwise() tended to be a distraction away >> from better solutions. Also, the pure python approach was more >> general as it allowed accumulations: total += value. >> While your proposed function has been re-invented a number of >> times, that doesn't mean it's a good idea. It is more an exercise >> in what can be done, not in what should be done. >> >> >> Raymond >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From g.brandl at gmx.net Fri Mar 20 20:14:43 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 20 Mar 2009 20:14:43 +0100 Subject: [Python-ideas] A read-only, dict-like optparse.Value In-Reply-To: <49C3AA6F.3080908@vector-seven.com> References: <49C3AA6F.3080908@vector-seven.com> Message-ID: Thomas Lee schrieb: > Hi folks, > > Would anybody support the idea of read-only dict-like behaviour of > "options" for the following code: > > ==== > > from optparse import OptionParser > parser = OptionParser() > parser.add_option("--host", dest="host", default="localhost") > parser.add_option("--port", dest="port", default=1234) > parser.add_option("--path", dest="path", default="/tmp") > options, args = parser.parse_args() > > ==== > > As it is, you have to "know" what possible attributes are present on the > options (effectively the set of "dest" attributes) -- I often implement > something like the following because recently I've had to use command > line options in a bunch of format strings: > > def make_options_dict(options): > known_options = ("host", "port", "path") > return dict(zip(known_options, [getattr(options, attr) for attr in >
known_options])) Perhaps options_dict = vars(options) already does what you want? optparse seems to set nonpresent attributes for options to None. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out. From jh at improva.dk Fri Mar 20 20:32:46 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 20 Mar 2009 20:32:46 +0100 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C36306.4040002@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> Message-ID: <49C3EF5E.1050807@improva.dk> Jacob Holm wrote: > The "GeneratorExit", I expect due to the description of close in PEP 342: > > def close(self): > try: > self.throw(GeneratorExit) > except (GeneratorExit, StopIteration): > pass > else: > raise RuntimeError("generator ignored GeneratorExit") > > When the generator is closed (due to the del g lines in the example), > this says to throw a GeneratorExit and handle the result. If we do > this manually, the throw will be delegated to the iterator, which will > print the "Throw: (,)" message. > It turns out I was wrong about the GeneratorExit. What I missed is that starting from 2.6, GeneratorExit no longer subclasses Exception, and so it wouldn't be thrown at the iterator. So move along, nothing to see here ... 
:) - Jacob From fredrik.johansson at gmail.com Fri Mar 20 21:03:55 2009 From: fredrik.johansson at gmail.com (Fredrik Johansson) Date: Fri, 20 Mar 2009 21:03:55 +0100 Subject: [Python-ideas] Builtin test function In-Reply-To: <200903192148.54461.steve@pearwood.info> References: <3d0cebfb0903190259jd02c918h31da14d4adb0b73b@mail.gmail.com> <200903192148.54461.steve@pearwood.info> Message-ID: <3d0cebfb0903201303q12b27e2atc5597a4b012286ee@mail.gmail.com> On Thu, Mar 19, 2009 at 11:48 AM, Steven D'Aprano wrote: > On Thu, 19 Mar 2009 08:59:08 pm Fredrik Johansson wrote: >> There's been some discussion about automatic test discovery lately. >> Here's a random (not in any way thought through) idea: add a builtin >> function test() that runs tests associated with a given function, >> class, module, or object. > > Improved testing is always welcome, but why a built-in? > > I know testing is important, but is it so common and important that we > need it at our fingertips, so to speak, and can't even import a module > first before running tests? What's the benefit to making it a built-in > instead of part of a test module? It would just be a convenience, and I'm just throwing the idea out. The advantage would be a uniform and very simple interface for testing any module, without having to know whether I should import doctest, unittest or something else (and having to remember the commands used by each framework). It would certainly not be a replacement for more advanced test frameworks. Fredrik From rdmurray at bitdance.com Fri Mar 20 22:02:33 2009 From: rdmurray at bitdance.com (R. David Murray) Date: Fri, 20 Mar 2009 21:02:33 +0000 (UTC) Subject: [Python-ideas] file.read() doesn't read the whole file References: <676ac298-0a44-4820-80dd-166a0363d45f@y38g2000prg.googlegroups.com> Message-ID: Sreejith K wrote: > I'm using the above code in a python-fuse's file class's read > function. The offset and length are 0 and 4096 respectively for my > test inputs.
When I open a file and read the 4096 bytes from offset, > only a few lines are printed, not the whole file. Actually the file is > only a few bytes. But when I tried reading from the Interactive mode > of python it gave the whole file. > > Is there any problem using read() method in fuse-python ? > > Also statements like break and continue behave weirdly in fuse > functions. Any help is appreciated.... If you think break and continue are behaving differently in a python program that is providing a fuse filesystem implementation, then your understanding of what your code is doing is faulty in some fashion. The fact that python is being called from fuse isn't going to change the semantics of the language. So I think you need to do some debugging to understand what's actually going on when your code gets called. As someone else suggested, if you are perceiving that the data read is short because of what you see at the os level when reading the data from the fuse-plus-your-application filesystem, that is after your python code returns the data to the fuse infrastructure, then that is probably where your problem is and not in the python read itself. (I'm assuming here that the read in question is taking place in a python method called from fuse and is reading real data from a real file...if that assumption is wrong and you are actually reading from a file _provided through_ fuse, then you need to look to your fuse file system implementation for answers.) -- R. David Murray http://www.bitdance.com From tom at vector-seven.com Fri Mar 20 23:21:42 2009 From: tom at vector-seven.com (Thomas Lee) Date: Sat, 21 Mar 2009 09:21:42 +1100 Subject: [Python-ideas] A read-only, dict-like optparse.Value In-Reply-To: References: <49C3AA6F.3080908@vector-seven.com> Message-ID: <49C416F6.3040009@vector-seven.com> Thanks Georg, this is pretty much exactly what I was looking for. Somehow I had never heard of the vars builtin before!
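(Editorial sketch: the vars() trick Georg describes works on any object with a __dict__, which is what optparse's option containers are. The stand-in class below is hypothetical, used so the snippet does not depend on optparse or command-line parsing; it is written in modern Python 3 syntax.)

```python
class Values:
    """Stand-in for optparse's option container: options are plain attributes."""
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

options = Values(host="localhost", port=1234, path="/tmp")

# vars(obj) returns obj.__dict__, so no hand-maintained list of
# "dest" names is needed to feed the options into format strings.
options_dict = vars(options)
print("{host}:{port}{path}".format(**options_dict))
```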
Regards, Tom Georg Brandl wrote: > Thomas Lee schrieb: > >> Hi folks, >> >> Would anybody support the idea of read-only dict-like behaviour of >> "options" for the following code: >> >> ==== >> >> from optparse import OptionParser >> parser = OptionParser() >> parser.add_option("--host", dest="host", default="localhost") >> parser.add_option("--port", dest="port", default=1234) >> parser.add_option("--path", dest="path", default="/tmp") >> options, args = parser.parse_args() >> >> ==== >> >> As it is, you have to "know" what possible attributes are present on the >> options (effectively the set of "dest" attributes) -- I often implement >> something like the following because recently I've had to use command >> line options in a bunch of format strings: >> >> def make_options_dict(options): >> known_options = ("host", "port", "path") >> return dict(zip(known_options, [getattr(options, attr) for attr in >> known_options])) >> > > Perhaps > > options_dict = vars(options) > > already does what you want? optparse seems to set nonpresent attributes > for options to None. > > Georg > > From greg.ewing at canterbury.ac.nz Sat Mar 21 01:35:02 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 21 Mar 2009 12:35:02 +1200 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C3EF5E.1050807@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> Message-ID: <49C43636.9080402@canterbury.ac.nz> > Jacob Holm wrote: > >> The "GeneratorExit", I expect due to the description of close in PEP 342: >> >> def close(self): >> try: >> self.throw(GeneratorExit) >> except (GeneratorExit, StopIteration): >> pass >> else: >> raise RuntimeError("generator ignored GeneratorExit") Hmmm...
well, my PEP kind of supersedes that when a yield-from is in effect, because it specifies that the subiterator is finalized first by attempting to call its 'close' method, not by throwing GeneratorExit into it. After that, GeneratorExit is used to finalize the delegating generator. The reasoning is that GeneratorExit is an implementation detail of generators, not something iterators in general should be expected to deal with. -- Greg From jh at improva.dk Sat Mar 21 02:04:05 2009 From: jh at improva.dk (Jacob Holm) Date: Sat, 21 Mar 2009 02:04:05 +0100 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C43636.9080402@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> Message-ID: <49C43D05.3010903@improva.dk> Greg Ewing wrote: >> Jacob Holm wrote: >> >>> The "GeneratorExit", I expect due to the description of close in PEP >>> 342: >>> >>> def close(self): >>> try: >>> self.throw(GeneratorExit) >>> except (GeneratorExit, StopIteration): >>> pass >>> else: >>> raise RuntimeError("generator ignored GeneratorExit") > > Hmmm... well, my PEP kind of supersedes that when a yield-from > is in effect, because it specifies that the subiterator is > finalized first by attempting to call its 'close' method, not > by throwing GeneratorExit into it. After that, GeneratorExit is > used to finalize the delegating generator. > > The reasoning is that GeneratorExit is an implementation > detail of generators, not something iterators in general should > be expected to deal with. > As already mentioned in another mail to this list (maybe you missed it?), the expansion in your PEP actually has the behaviour you expect for the GeneratorExit example because GeneratorExit doesn't inherit from Exception. 
No need to redefine anything here. Your patch is right, I was wrong, end of story... The other mismatch, concerning the missing "close" calls to the iterator, I still believe to be an issue. It is debatable whether the issue is mostly with the PEP or the implementation, but they don't match up as it is... - Jacob From greg.ewing at canterbury.ac.nz Sat Mar 21 02:44:20 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 21 Mar 2009 13:44:20 +1200 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C43D05.3010903@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> Message-ID: <49C44674.5030107@canterbury.ac.nz> Jacob Holm wrote: > the expansion in your PEP actually has the behaviour you expect > for the GeneratorExit example because GeneratorExit doesn't inherit from > Exception. That's an accident, though, and it's possible I should have specified BaseException there. I still consider the explanation I gave to be the true one. > The other mismatch, concerning the missing "close" calls to the > iterator, I still believe to be an issue. Can you elaborate on that? I thought at first you were expecting the implicit close of the generator that happens before it's deallocated to be passed on to the subiterator, but some of your examples seem to have the close happening *before* the del gen, so I'm confused.
-- Greg From ncoghlan at gmail.com Sat Mar 21 05:35:12 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 21 Mar 2009 14:35:12 +1000 Subject: [Python-ideas] [Python-Dev] Proposal: new list function: pack In-Reply-To: References: <9D6E7ADA1C5A475B8B9BBA01C3F94447@RaymondLaptop1> Message-ID: <49C46E80.3060808@gmail.com> Josiah Carlson wrote: > iwinslice() is just as bad of a name as any of the others. > > I have seen the equivalent of window(iterator, size=2, step=1), which > works as you would expect (both as the output, as well as the > implementation), with size and step both limited to 5 (because if you > are doing things with more than 5 items at a time...you probably > really want something else, and in certain cases, you can use multiple > window calls to compose larger groups). Oops, I didn't realise this thread had moved over here, so I just repeated what yourself and Raymond said over on python-dev. Oh well... > I'd be a -0 on the feature, because as Raymond says, it's trivial to > implement with a deque. And as I've said before, not all x line > functions should be built-in. That does raise the possibility of adding "iterator windowing done right" by including a deque based implementation in itertools (or at least in the itertools recipes page). For example, the following continuously yields the same deque, but each time the contents represent a new window onto the underlying data:

from collections import deque
from itertools import islice

def window(iterable, size=2, step=1, overlap=0):
    itr = iter(iterable)
    new_per_window = size - overlap
    contents = deque(islice(itr, 0, size*step, step), size)
    while True:
        yield contents
        new_data = list(islice(itr, 0, new_per_window*step, step))
        if len(new_data) < new_per_window:
            break
        contents.extend(new_data)

(There are other ways of doing it that involve less data copying, but the above way seems to be the most straightforward) Cheers, Nick.
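(Editorial sketch: one subtlety of Nick's generator is that it yields the *same* deque object on every step, so a caller that wants to keep windows around must copy them. The check below restates the function from the message so the snippet is self-contained, adds the itertools import it relies on, and is written in Python 3 syntax.)

```python
from collections import deque
from itertools import islice

def window(iterable, size=2, step=1, overlap=0):
    # Same algorithm as in the message above: one bounded deque is
    # reused as the window; (size - overlap) new items arrive per step.
    itr = iter(iterable)
    new_per_window = size - overlap
    contents = deque(islice(itr, 0, size*step, step), size)
    while True:
        yield contents
        new_data = list(islice(itr, 0, new_per_window*step, step))
        if len(new_data) < new_per_window:
            break
        contents.extend(new_data)

# Copy each window with tuple(w) as it is produced:
pairs = [tuple(w) for w in window(range(5), size=2, overlap=1)]   # pairwise
groups = [tuple(w) for w in window(range(6), size=2, overlap=0)]  # grouper
```

With overlap=1 this reproduces the pairwise() behaviour, and with overlap=0 the grouper() behaviour, which is exactly the unification Raymond argued against earlier in the thread.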
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From jh at improva.dk Sat Mar 21 12:58:51 2009 From: jh at improva.dk (Jacob Holm) Date: Sat, 21 Mar 2009 12:58:51 +0100 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C44674.5030107@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> Message-ID: <49C4D67B.4010109@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: >> the expansion in your PEP actually has the behaviour you expect for >> the GeneratorExit example because GeneratorExit doesn't inherit from >> Exception. > > That's an accident, though, and it's possible I should have > specified BaseException there. I still consider the explanation > I gave to be the true one. In that case, I think a clarification in the PEP would be in order. I like the fact that the PEP-342 description of close does the right thing though. If you want BaseException instead of Exception in the PEP, maybe you could replace the:

except Exception, _e:

line with:

except GeneratorExit:
    raise
except BaseException, _e:

This would make it clearer that the behavior of close is intentional, and would still allow delegating the throw of any exception not inheriting from GeneratorExit to the subiterator. > >> The other mismatch, concerning the missing "close" calls to the >> iterator, I still believe to be an issue. > > Can you elaborate on that? I thought at first you were expecting > the implicit close of the generator that happens before it's > deallocated to be passed on to the subiterator, but some of your > examples seem to have the close happening *before* the del gen, > so I'm confused.
>
Yes, I can see that the use of implicit close in that example was a mistake, and that I should have added a few more output lines to clarify the intent. The close is definitely intended to happen before the del in the examples. I have a better example here, with inline comments explaining what I think should happen at critical points (and why):

class iterator(object):
    """Simple iterator that counts to n while writing what is done to it"""
    def __init__(self, n):
        self.ctr = iter(xrange(n))
    def __iter__(self):
        return self
    def close(self):
        print "Close"
    def next(self):
        print "Next"
        return self.ctr.next()
    # no send method!
    # no throw method!

def generator(n):
    try:
        print "Getting first value from iterator"
        result = yield from iterator(n)
        print "Iterator returned", result
    finally:
        print "Generator closing"

g = generator(1)
g.next()
try:
    print "Calling g.next()"
    # This causes a StopIteration in iterator.next(). After grabbing
    # the value in the "except StopIteration" clause of the PEP
    # expansion, the "finally" clause calls iterator.close(). Any
    # other exception raised by next (or by send or throw if the
    # iterator had those) would also be handled by the finally
    # clause. For well-behaved iterators, these calls to close would
    # be no-ops, but they are still required by the PEP as written.
    g.next()
except Exception, e:
    print type(e)
else:
    print 'No exception'
# This close should be a no-op. The exception we just caught should
# have already closed the generator.
g.close()
print '--'

g = generator(1)
g.next()
try:
    print "Calling g.send(42)"
    # This causes an AttributeError when looking up the "send" method.
    # The finally clause from the PEP expansion makes sure
    # iterator.close() is called. This call is *not* expected to be a
    # no-op.
    g.send(42)
except Exception, e:
    print type(e)
else:
    print 'No exception'
# This close should be a no-op. The exception we just caught should
# have already closed the generator.
g.close()
print '--'

g = generator(1)
g.next()
try:
    print "Calling g.throw(ValueError)"
    # Since iterator does not have a "throw" method, the ValueError is
    # raised directly in the yield-from expansion in the generator.
    # The finally clause ensures that iterator.close() is called.
    # This call is *not* expected to be a no-op.
    g.throw(ValueError)
except Exception, e:
    print type(e)
else:
    print 'No exception'
# This close should be a no-op. The exception we just caught should
# have already closed the generator.
g.close()
print '--'

g = generator(1)
g.next()
try:
    print "Calling g.throw(StopIteration(42))"
    # The iterator still does not have a "throw" method, so the
    # StopIteration is raised directly in the yield-from expansion.
    # Then the exception is caught and converted to a value for the
    # yield-from expression. Before the generator sees the value, the
    # finally clause makes sure that iterator.close() is called. This
    # call is *not* expected to be a no-op.
    g.throw(StopIteration(42))
except Exception, e:
    print type(e)
else:
    print 'No exception'
# This close should be a no-op. The exception we just caught should
# have already closed the generator.
g.close()
print '--'

There are really four examples here. The first one is essentially the same as last time, I just expanded the output a bit. The next two examples are corner cases where the missing close makes a real difference, even for well-behaved iterators (this is not the case in the first example). The fourth example catches a bug in the current version of my patch, and shows a potentially interesting use of an iterator without a send method in a yield-from expression. The issue I have with your patch is that iterator.close() is not called in any of the four examples, even though my reading of the PEP suggests it should be. (I have confirmed that my reading matches the PEP by manually replacing the yield-from in the generator with the expansion from the PEP, just to be sure...)
The expected output is:

Getting first value from iterator
Next
Calling g.next()
Next
Close
Iterator returned None
Generator closing
--
Getting first value from iterator
Next
Calling g.send(42)
Close
Generator closing
--
Getting first value from iterator
Next
Calling g.throw(ValueError)
Close
Generator closing
--
Getting first value from iterator
Next
Calling g.throw(StopIteration(42))
Close
Iterator returned 42
Generator closing
--

From greg.ewing at canterbury.ac.nz Sat Mar 21 22:43:54 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 22 Mar 2009 09:43:54 +1200 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C4D67B.4010109@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> Message-ID: <49C55F9A.6070305@canterbury.ac.nz> Jacob Holm wrote: > # This causes a StopIteration in iterator.next(). After grabbing > # the value in the "except StopIteration" clause of the PEP > # expansion, the "finally" clause calls iterator.close(). Okay, I see what you mean now. That's a bug in the expansion. Once an iterator has raised StopIteration, it has presumably already finalized itself, so calling its close() method shouldn't be necessary, and I hadn't intended that it should be called in that case. I'll update the PEP accordingly.
-- Greg From greg.ewing at canterbury.ac.nz Sun Mar 22 00:15:56 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 22 Mar 2009 11:15:56 +1200 Subject: [Python-ideas] Yield-From: Revamped expansion In-Reply-To: <49C55F9A.6070305@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> Message-ID: <49C5752C.2080704@canterbury.ac.nz> I'm thinking about replacing the expansion with the following, which hopefully fixes a couple of concerns that were raised recently without breaking anything else. Can anyone see any remaining ways in which it doesn't match the textual description in the Proposal section? (It still isn't *quite* right, because it doesn't distinguish between a GeneratorExit explicitly thrown in and one resulting from calling close() on the delegating generator. I may need to revise the text and/or my implementation on that point, because I want the inline-expansion interpretation to hold.) 
_i = iter(EXPR)
try:
    _u = _i.next()
except StopIteration, _e:
    _r = _e.value
else:
    while 1:
        try:
            _v = yield _u
        except GeneratorExit:
            _m = getattr(_i, 'close', None)
            if _m is not None:
                _m()
            raise
        except BaseException, _e:
            _m = getattr(_i, 'throw', None)
            if _m is not None:
                _u = _m(_e)
            else:
                raise
        else:
            try:
                if _v is None:
                    _u = _i.next()
                else:
                    _u = _i.send(_v)
            except StopIteration, _e:
                _r = _e.value
                break
RESULT = _r

-- Greg From ncoghlan at gmail.com Sun Mar 22 00:22:00 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 22 Mar 2009 09:22:00 +1000 Subject: [Python-ideas] Revised**7 PEP on Yield-From In-Reply-To: <49C55F9A.6070305@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> Message-ID: <49C57698.7030808@gmail.com> Greg Ewing wrote: > Jacob Holm wrote: > >> # This causes a StopIteration in iterator.next(). After grabbing >> # the value in the "except StopIteration" clause of the PEP >> # expansion, the "finally" clause calls iterator.close(). > > Okay, I see what you mean now. That's a bug in the expansion. > Once an iterator has raised StopIteration, it has presumably > already finalized itself, so calling its close() method > shouldn't be necessary, and I hadn't intended that it should > be called in that case. close() *should* still be called in that case - the current expansion in the PEP is correct. It is the *iterator's* job to make sure that multiple calls to close() (or calling close() on a finished iterator) don't cause problems. The syntax shouldn't be trying to second guess whether or not calling close() is necessary or not - it should just be calling it, period.
>>> def gen():
...     yield 1
...
>>> g = gen()
>>> g.next()
1
>>> g.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> g.close()
>>> g.close()
>>> g2 = gen()
>>> g.close()
>>> g.close()
>>> g3 = gen()
>>> g3.next()
1
>>> g.close()
>>> g.close()

Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Sun Mar 22 00:28:27 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 22 Mar 2009 09:28:27 +1000 Subject: [Python-ideas] Yield-From: Revamped expansion In-Reply-To: <49C5752C.2080704@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C5752C.2080704@canterbury.ac.nz> Message-ID: <49C5781B.30102@gmail.com> Greg Ewing wrote: > I'm thinking about replacing the expansion with the > following, which hopefully fixes a couple of concerns > that were raised recently without breaking anything else. > > Can anyone see any remaining ways in which it doesn't > match the textual description in the Proposal section? > > (It still isn't *quite* right, because it doesn't > distinguish between a GeneratorExit explicitly thrown > in and one resulting from calling close() on the > delegating generator. I may need to revise the text > and/or my implementation on that point, because I want > the inline-expansion interpretation to hold.)
>
> _i = iter(EXPR)
> try:
>     _u = _i.next()
> except StopIteration, _e:
>     _r = _e.value
> else:
>     while 1:
>         try:
>             _v = yield _u
>         except GeneratorExit:
>             _m = getattr(_i, 'close', None)
>             if _m is not None:
>                 _m()
>             raise
>         except BaseException, _e:
>             _m = getattr(_i, 'throw', None)
>             if _m is not None:
>                 _u = _m(_e)
>             else:
>                 raise
>         else:
>             try:
>                 if _v is None:
>                     _u = _i.next()
>                 else:
>                     _u = _i.send(_v)
>             except StopIteration, _e:
>                 _r = _e.value
>                 break
> RESULT = _r
>

I'd adjust the inner exception handlers to exploit the fact that
SystemExit and GeneratorExit don't inherit from Exception:

_i = iter(EXPR)
try:
    _u = _i.next()
except StopIteration, _e:
    _r = _e.value
else:
    while 1:
        try:
            _v = yield _u
        except Exception, _e:
            _m = getattr(_i, 'throw', None)
            if _m is not None:
                _u = _m(_e)
            else:
                raise
        except:
            # Covers SystemExit, GeneratorExit and
            # anything else that doesn't inherit
            # from Exception
            _m = getattr(_i, 'close', None)
            if _m is not None:
                _m()
            raise
        else:
            try:
                if _v is None:
                    _u = _i.next()
                else:
                    _u = _i.send(_v)
            except StopIteration, _e:
                _r = _e.value
                break
RESULT = _r

I think Antoine and PJE are right that the PEP needs some more actual
use cases though.

Cheers,
Nick.
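The split Nick proposes hinges on which handler an exception falls into. That routing can be checked in isolation (Python 3 syntax; the helper name is invented for illustration, and `except BaseException` stands in for the bare `except:` clause above):

```python
def handler_for(exc):
    """Report which branch of the proposed expansion would see exc."""
    try:
        raise exc
    except Exception:
        return "forwarded to subiterator via throw()"
    except BaseException:
        # stands in for the bare 'except:' clause in the expansion
        return "subiterator close()d, exception re-raised"

# Ordinary exceptions go to the subiterator; terminal ones do not.
assert handler_for(ValueError("boom")) == "forwarded to subiterator via throw()"
assert handler_for(GeneratorExit()) == "subiterator close()d, exception re-raised"
assert handler_for(KeyboardInterrupt()) == "subiterator close()d, exception re-raised"
assert handler_for(SystemExit()) == "subiterator close()d, exception re-raised"
```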
-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------

From greg.ewing at canterbury.ac.nz  Sun Mar 22 09:15:42 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 22 Mar 2009 20:15:42 +1200
Subject: [Python-ideas] Revised**7 PEP on Yield-From
In-Reply-To: <49C57698.7030808@gmail.com>
Message-ID: <49C5F3AE.4060402@canterbury.ac.nz>

Nick Coghlan wrote:
> The syntax shouldn't be trying to second-guess
> whether or not calling close() is necessary - it should just be
> calling it, period.

But *why* should it be called? Just as calling close() after the
iterator has finished shouldn't do any harm, *not* doing so shouldn't
do any harm either, and some implementation strategies (my current one
included) would have to go out of their way to call close() in that
case.
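Greg's premise - that an iterator which has raised StopIteration has already finalized itself - is easy to see with a generator (a sketch in modern Python 3 syntax):

```python
log = []

def gen():
    try:
        yield 1
    finally:
        log.append("cleaned up")   # runs as the generator finishes

g = gen()
assert list(g) == [1]              # exhausting it raises StopIteration...
assert log == ["cleaned up"]       # ...and it has already finalized itself
g.close()                          # a later close() is then a harmless no-op
assert log == ["cleaned up"]
```

So for generators specifically, both positions hold: the extra close() neither helps nor hurts, which is why the argument turns on non-generator iterators and implementation convenience.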
-- 
Greg

From greg.ewing at canterbury.ac.nz  Sun Mar 22 09:23:11 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 22 Mar 2009 20:23:11 +1200
Subject: [Python-ideas] Yield-From: Revamped expansion
In-Reply-To: <49C5781B.30102@gmail.com>
Message-ID: <49C5F56F.7060001@canterbury.ac.nz>

Nick Coghlan wrote:
> I'd adjust the inner exception handlers to exploit the fact that
> SystemExit and GeneratorExit don't inherit from Exception:

But then anything thrown in that didn't inherit from Exception would
bypass giving the subiterator a chance to handle it, which doesn't seem
right.

The more I think about this, the more I'm wondering whether I shouldn't
ever try to call close() on the subiterator at all, and just rely on it
to finalize itself when it's deallocated. That would solve all problems
concerning when and if close() calls should be made (the answer would
be "never"). It would also avoid the problem of a partially exhausted
iterator that's still in use by something else getting prematurely
finalized, which is another thing that's been bothering me.

Here's another expansion based on that idea. When we've finished with
the subiterator for whatever reason -- it raised StopIteration,
something got thrown in, we got closed ourselves, etc. -- we simply
drop our reference to it. If that causes it to be deallocated, it's
responsible for cleaning itself up however it sees fit.
_i = iter(EXPR)
try:
    try:
        _u = _i.next()
    except StopIteration, _e:
        _r = _e.value
    else:
        while 1:
            try:
                _v = yield _u
            except BaseException, _e:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _u = _m(_e)
                else:
                    raise
            else:
                try:
                    if _v is None:
                        _u = _i.next()
                    else:
                        _u = _i.send(_v)
                except StopIteration, _e:
                    _r = _e.value
                    break
finally:
    del _i
RESULT = _r

> I think Antoine and PJE are right that the PEP needs some more actual
> use cases though.

The examples I have are a bit big to put in the PEP itself, but I can
include some links.

-- 
Greg

From greg.ewing at canterbury.ac.nz  Sun Mar 22 10:26:08 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 22 Mar 2009 21:26:08 +1200
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C5F3AE.4060402@canterbury.ac.nz>
Message-ID: <49C60430.7030108@canterbury.ac.nz>

I'm having trouble making up my mind how GeneratorExit
should be handled.

My feeling is that GeneratorExit is a peculiarity of
generators that other kinds of iterators shouldn't have
to know about. So, if you close() a generator, that
shouldn't imply throwing GeneratorExit into the
subiterator -- rather, the subiterator should simply
be dropped and then the delegating generator finalized
as usual.

If the subiterator happens to be another generator,
dropping the last reference to it will cause it to
be closed, in which case it will raise its own
GeneratorExit.
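The behaviour Greg is relying on here is CPython's reference counting: dropping the last reference to a suspended generator closes it, while an extra live reference (as with a shared subiterator) keeps it alive. A sketch (Python 3 syntax; deterministic only on CPython - other implementations may finalize later):

```python
log = []

def sub():
    try:
        yield 1
        yield 2
    finally:
        log.append("finalized")

g = sub()
alias = g          # a second reference, as with a shared subiterator
next(g)            # suspend the generator mid-body
del g              # CPython: another reference survives, nothing happens
assert log == []
del alias          # last reference dropped -> close() -> GeneratorExit
assert log == ["finalized"]
```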
Other kinds of iterators can finalize
themselves however they see fit, and don't need to
pretend they're generators and understand
GeneratorExit.

For consistency, this implies that a GeneratorExit
explicitly thrown in using throw() shouldn't be
forwarded to the subiterator either, even if it has
a throw() method.

To do otherwise would require making a distinction that
can't be expressed in the Python expansion. Also, it
seems elegant to preserve the property that if g is a
generator then g.close() and g.throw(GeneratorExit) are
exactly equivalent.

What do people think about this?

-- 
Greg

From ncoghlan at gmail.com  Sun Mar 22 12:55:50 2009
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 22 Mar 2009 21:55:50 +1000
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C60430.7030108@canterbury.ac.nz>
Message-ID: <49C62746.7080200@gmail.com>

Greg Ewing wrote:
> To do otherwise would require making a distinction that
> can't be expressed in the Python expansion. Also, it
> seems elegant to preserve the property that if g is a
> generator then g.close() and g.throw(GeneratorExit) are
> exactly equivalent.
>
> What do people think about this?

That whole question is why I suggested rephrasing the question of which
exceptions are passed to the subiterator in Exception vs BaseException
terms. The only acknowledged direct subclasses of BaseException are
KeyboardInterrupt, SystemExit and GeneratorExit.
The purpose of those exceptions is to say "drop what you're doing and
bail out any which way you can". Terminating the outermost generator in
those cases and letting the subiterators clean up as best they can
sounds like a perfectly reasonable option to me. The alternative is to
catch BaseException and throw absolutely everything (including
GeneratorExit) into the subiterator. The in-between options that you're
describing would appear to just complicate the semantics to no great
purpose.

Note that you may also be pursuing a false consistency here, since
g.close() has never been equivalent to g.throw(GeneratorExit), as the
latter propagates the exception back into the current scope while the
former suppresses it (example was run using 2.5.2):

>>> def gen(): yield
...
>>> g = gen()
>>> g.next()
>>> g.close()
>>> g2 = gen()
>>> g2.next()
>>> g2.throw(GeneratorExit)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in gen
GeneratorExit

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------

From jh at improva.dk  Sun Mar 22 13:42:36 2009
From: jh at improva.dk (Jacob Holm)
Date: Sun, 22 Mar 2009 13:42:36 +0100
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C62746.7080200@gmail.com>
Message-ID: <49C6323C.1020400@improva.dk>

Nick Coghlan wrote:
> Greg Ewing wrote:
>
>> To do otherwise would require making a distinction that
>> can't be expressed in the Python expansion. Also, it
>> seems elegant to preserve the property that if g is a
>> generator then g.close() and g.throw(GeneratorExit) are
>> exactly equivalent.
>>
>> What do people think about this?
>
> That whole question is why I suggested rephrasing the question of which
> exceptions are passed to the subiterator in Exception vs BaseException
> terms. The only acknowledged direct subclasses of BaseException are
> KeyboardInterrupt, SystemExit and GeneratorExit. The purpose of those
> exceptions is to say "drop what you're doing and bail out any which way
> you can". Terminating the outermost generator in those cases and letting
> the subiterators clean up as best they can sounds like a perfectly
> reasonable option to me. The alternative is to catch BaseException and
> throw absolutely everything (including GeneratorExit) into the
> subiterator. The in-between options that you're describing would appear
> to just complicate the semantics to no great purpose.

Well, since GeneratorExit is specifically about generators, I don't see
a problem in special-casing that one and just let everything else be
thrown at the subgenerator.
I would also be OK with just throwing everything (including
GeneratorExit) there, as that makes the implementation of throw a bit
simpler.

> Note that you may also be pursuing a false consistency here, since
> g.close() has never been equivalent to g.throw(GeneratorExit), as the
> latter propagates the exception back into the current scope while the
> former suppresses it (example was run using 2.5.2):

I believe that the "exact equivalence" Greg was talking about is the
description of close from PEP 342. It is nice that the semantics of
close can be described so easily in terms of throw.

I like the idea of not having an explicit close in the expansion at
all. In most cases the refcounting will take care of it anyway (at
least in CPython), and when there are multiple references you might
actually want to not close. Code that needs it can add the explicit
close themselves by putting the yield-from in a try...finally or a
with... block.

- Jacob

From jh at improva.dk  Sun Mar 22 15:35:46 2009
From: jh at improva.dk (Jacob Holm)
Date: Sun, 22 Mar 2009 15:35:46 +0100
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C60430.7030108@canterbury.ac.nz>
Message-ID: <49C64CC2.1050608@improva.dk>

Greg Ewing wrote:
> I'm having trouble making up my mind how GeneratorExit
> should be handled.
>
> My feeling is that GeneratorExit is a peculiarity of
> generators that other kinds of iterators shouldn't have
> to know about.

They don't, see below.
> So, if you close() a generator, that
> shouldn't imply throwing GeneratorExit into the
> subiterator -- rather, the subiterator should simply
> be dropped and then the delegating generator finalized
> as usual.
>
> If the subiterator happens to be another generator,
> dropping the last reference to it will cause it to
> be closed, in which case it will raise its own
> GeneratorExit.

This is only true in CPython, but that shouldn't be a problem. If you
really need the subiterator to be closed at that point, wrapping the
yield-from in the appropriate try...finally... or with... block will do
the trick.

> Other kinds of iterators can finalize
> themselves however they see fit, and don't need to
> pretend they're generators and understand
> GeneratorExit.

They don't have to understand GeneratorExit at all. As long as they
know how to clean up after themselves when thrown an exception they
cannot handle, things will just work. GeneratorExit is no different
from SystemExit or KeyboardInterrupt in that regard.

> For consistency, this implies that a GeneratorExit
> explicitly thrown in using throw() shouldn't be
> forwarded to the subiterator either, even if it has
> a throw() method.

I agree that if close() doesn't throw the GeneratorExit to the
subiterator, then throw() shouldn't either.

> To do otherwise would require making a distinction that
> can't be expressed in the Python expansion. Also, it
> seems elegant to preserve the property that if g is a
> generator then g.close() and g.throw(GeneratorExit) are
> exactly equivalent.

Not exactly equivalent, but related in the simple way described in
PEP 342.

> What do people think about this?
>

If I understand you correctly, what you want can be described by the
following expansion:

_i = iter(EXPR)
try:
    _u = _i.next()
    while 1:
        try:
            _v = yield _u
        except GeneratorExit:
            raise
        except BaseException, _e:
            _m = getattr(_i, 'throw', None)
            if _m is not None:
                _u = _m(_e)
            else:
                raise
        else:
            if _v is None:
                _u = _i.next()
            else:
                _u = _i.send(_v)
except StopIteration, _e:
    RESULT = _e.value
finally:
    _i = _u = _v = _e = _m = None
    del _i, _u, _v, _e, _m

(except for minor details like the possible method caching).

I like this version because it makes it easier to share subiterators if
you need to. The explicit close in the earlier proposals meant that as
soon as one generator delegating to the shared iterator was closed, the
shared one would be as well. No, I don't have a concrete use case for
this, but I think it is the least surprising behavior we could choose
for closing shared subiterators. As mentioned above, you can still
explicitly request that the subiterator be closed with the delegating
generator by wrapping the yield-from in a try...finally... or with...
block.

If I understand Nick correctly, he would like to drop the "except
GeneratorExit: raise" part, and possibly change BaseException to
Exception. I don't like the idea of just dropping the "except
GeneratorExit: raise", as that brings us back to the situation where
shared subiterators are less useful. If we also change BaseException to
Exception, the only difference is that it will no longer be possible to
throw exceptions like SystemExit and KeyboardInterrupt that don't
inherit from Exception to a subiterator. Again, I don't have a concrete
use case, but I think putting an arbitrary restriction like that in a
language construct is a bad idea. One example where this would cause
surprises is if you split part of a generator function (that for one
reason or another needs to handle these exceptions) into a separate
generator and call it using yield from.
Throwing an exception to the refactored generator could then have a
different meaning than before the refactoring, and there would be no
easy way to fix this.

Just my 2 cents...

- Jacob

From ncoghlan at gmail.com  Sun Mar 22 22:08:57 2009
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 23 Mar 2009 07:08:57 +1000
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C64CC2.1050608@improva.dk>
Message-ID: <49C6A8E9.4010705@gmail.com>

Jacob Holm wrote:
> If I understand Nick correctly, he would like to drop the "except
> GeneratorExit: raise" part, and possibly change BaseException to
> Exception. I don't like the idea of just dropping the "except
> GeneratorExit: raise", as that brings us back to the situation where
> shared subiterators are less useful. If we also change BaseException to
> Exception, the only difference is that it will no longer be possible to
> throw exceptions like SystemExit and KeyboardInterrupt that don't
> inherit from Exception to a subiterator.

Note that as of 2.6, GeneratorExit doesn't inherit from Exception either
- it now inherits directly from BaseException, just like the other two
terminal exceptions:

Python 2.6+ (trunk:66863M, Oct 9 2008, 21:32:59)
>>> BaseException.__subclasses__()
[<type 'exceptions.Exception'>, <type 'exceptions.GeneratorExit'>,
<type 'exceptions.SystemExit'>, <type 'exceptions.KeyboardInterrupt'>]

All I'm saying is that if GeneratorExit doesn't get passed down then
neither should SystemExit nor KeyboardInterrupt, while if the latter
two *do* get passed down, then so should GeneratorExit.
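The hierarchy Nick describes for 2.6 still holds in modern Python 3 and can be checked directly:

```python
# All three "terminal" exceptions sit beside Exception directly under
# BaseException, so an 'except Exception' clause lets them all through.
for exc in (KeyboardInterrupt, SystemExit, GeneratorExit):
    assert issubclass(exc, BaseException)
    assert not issubclass(exc, Exception)
```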
Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------

From guido at python.org  Sun Mar 22 23:31:15 2009
From: guido at python.org (Guido van Rossum)
Date: Sun, 22 Mar 2009 15:31:15 -0700
Subject: [Python-ideas] CapPython's use of unbound methods
In-Reply-To: <20090319.231249.343185657.mrs@localhost.localdomain>
References: <20090312.202410.846948621.mrs@localhost.localdomain> <20090319.231249.343185657.mrs@localhost.localdomain>
Message-ID:

On Thu, Mar 19, 2009 at 4:12 PM, Mark Seaborn wrote:
> Guido van Rossum wrote:
>
>> On Thu, Mar 12, 2009 at 1:24 PM, Mark Seaborn wrote:
>> > Suppose we have an object x with a private attribute, "_field",
>> > defined by a class Foo:
>> >
>> > class Foo(object):
>> >
>> >     def __init__(self):
>> >         self._field = "secret"
>> >
>> > x = Foo()
>>
>> Can you add some principals to this example? Who wrote the Foo class
>> definition? Does CapPython have access to the source code for Foo? To
>> the class object?
>
> OK, suppose we have two principals, Alice and Bob. Alice receives a
> string from Bob. Alice instantiates the string using CapPython's
> safe_eval() function, getting back a module object that contains a
> function object. Alice passes the function an object x. Alice's
> intention is that the function should not be able to get hold of the
> contents of x._field, no matter what string Bob supplies.
>
> To make this more concrete, this is what Alice executes, with
> source_from_bob defined in a string literal for the sake of example:
>
> source_from_bob = """
> class C:
>     def f(self):
>         return self._field
> def entry_point(x):
>     C.f(x) # potentially gets the secret object in Python 3.0
> """
>
> import safeeval
>
> secret = object()
>
> class Foo(object):
>     def __init__(self):
>         self._field = secret
>
> x = Foo()
> module = safeeval.safe_eval(source_from_bob, safeeval.Environment())
> module.entry_point(x)
>
> In this example, Bob's code is not given access to the class object
> Foo. Furthermore, Bob should not be able to get access to the class
> Foo from the instance x. The type() builtin is not considered to be
> safe in CapPython so it is not included in the default environment.
>
> Bob's code is not given access to the source code for class Foo. But
> even if Bob is aware of Alice's source code, it should not affect
> whether Bob can get hold of the secret object.

OK, I think I understand all this, except I don't have much of an idea
of what subset of the language Bob is allowed to use.

> By the way, you can try out the example by getting the code from the
> Bazaar repository:
> bzr branch http://bazaar.launchpad.net/%7Emrs/cappython/trunk cappython

If you don't mind I will try to avoid downloading your source a little
longer.

>> > However, in Python 3.0, the CapPython code can do this:
>> >
>> > class C(object):
>> >
>> >     def f(self):
>> >         return self._field
>> >
>> > C.f(x) # returns "secret"
>> >
>> > Whereas in Python 2.x, C.f(x) would raise a TypeError, because C.f is
>> > not being called on an instance of C.
>>
>> In Python 2.x I could write
>>
>> class C(Foo):
>>     def f(self):
>>         return self._field
>
> In the example above, Bob's code is not given access to Foo, so Bob
> cannot do this. But you are right, if Bob's code were passed Foo as
> well as x, Bob could do this.
>
> Suppose Alice wanted to give Bob access to class Foo, perhaps so that
> Bob could create derived classes. It is still possible for Alice to
> do that safely, if Alice defines Foo differently. Alice can pass the
> secret object to Foo's constructor instead of having the class
> definition get its reference to the secret object from an enclosing
> scope:
>
> class Foo(object):
>
>     def __init__(self, arg):
>         self._field = arg
>
> secret = object()
> x = Foo(secret)
> module = safeeval.safe_eval(source_from_bob, safeeval.Environment())
> module.entry_point(x, Foo)
>
> Bob can create his own objects derived from Foo, but cannot use his
> access to Foo to break encapsulation of instance x. Foo is now
> authorityless, in the sense that it does not capture "secret" from its
> enclosing environment, unlike the previous definition.
>
>> or alternatively
>>
>> class C(x.__class__):
>>
>
> The verifier would reject x.__class__, so this is not possible.
>
>> > Guido said, "I don't understand where the function object f gets its
>> > magic powers".
>> >
>> > The answer is that function definitions directly inside class
>> > statements are treated specially by the verifier.
>>
>> Hm, this sounds like a major change in language semantics, and if I
>> were Sun I'd sue you for using the name "Python" in your product. :-)
>
> Damn, the makers of Typed Lambda Calculus had better watch out for
> legal action from the makers of Lambda Calculus(tm) too... :-) Is it
> really a major change in semantics if it's just a subset? ;-)

Well yes. The empty subset is also a subset. :-)

More seriously, IIUC you are disallowing all use of attribute names
starting with underscores, which not only invalidates most Python code
in practical use (though you might not care about that) but also
disallows the use of many features that are considered part of the
language, such as access to __dict__ and many other introspective
attributes.

> To some extent the verifier's check of only accessing private
> attributes through self is just checking a coding style that I already
> follow when writing Python code (except sometimes for writing test
> cases).

You might wish this to be true, but for most Python programmers, it
isn't. Introspection is a commonly-used part of the language (probably
more so than in Java).
So is the use of attribute names starting with a single underscore
outside the class tree, e.g. by "friend" functions.

> Of course some of the verifier's checks, such as only allowing
> attribute assignments through self, are a lot more draconian than
> coding style checks.

That also sounds like a rather serious hindrance to writing Python as
most people think of it.

>> > If you wrote the same function definition at the top level:
>> >
>> > def f(var):
>> >     return var._field # rejected
>> >
>> > the attribute access would be rejected by the verifier, because "var"
>> > is not a self variable, and private attributes may only be accessed
>> > through self variables.
>> >
>> > I renamed the variable in the example,
>>
>> What do you mean by this?
>
> I just mean that I applied alpha conversion.

BTW that's a new term for me. :-)

> def f(self):
>     return self._field
>
> is equivalent to
>
> def f(var):
>     return var._field

This equivalence is good.

> Whether these function definitions are accepted by the verifier
> depends on their context.

But this isn't. Are you saying that the verifier accepts the use of
self._foo in a method? That would make the scenario of potentially
passing a class defined by Alice into Bob's code much harder to verify
-- now suddenly Alice has to know about a lot of things before she can
be sure that she doesn't leave open a backdoor for Bob.

>> Do you also catch things like
>>
>> g = getattr
>> s = 'field'.replace('f', '_f')
>>
>> print g(x, s)
>>
>> ?
>
> The default environment doesn't provide the real getattr() function.
> It provides a wrapped version that rejects private attribute names.

Do you have a web page describing the precise list of limitations you
apply in your "subset" of Python? Does it support import of some form?
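A wrapped getattr() of the kind Mark describes might look like the following. This is a hedged sketch, not CapPython's actual code - the function name and error message are invented here:

```python
def safe_getattr(obj, name, *default):
    # Hypothetical stand-in for CapPython's wrapped getattr: reject any
    # attribute name that is private by convention before delegating.
    if name.startswith('_'):
        raise AttributeError("private attribute rejected: %r" % name)
    return getattr(obj, name, *default)

class Foo(object):
    def __init__(self, arg):
        self._field = arg
        self.label = "public"

x = Foo(object())
assert safe_getattr(x, "label") == "public"

# The 'field'.replace('f', '_f') trick from the quoted session is caught
# too, because the check runs on the runtime string value.
blocked = False
try:
    safe_getattr(x, "field".replace("f", "_f"))
except AttributeError:
    blocked = True
assert blocked
```

Because the check happens at call time on the actual string, it does not matter how the name was constructed, which is why wrapping getattr() (rather than only static verification) closes that particular hole.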
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

From jh at improva.dk  Mon Mar 23 01:07:12 2009
From: jh at improva.dk (Jacob Holm)
Date: Mon, 23 Mar 2009 01:07:12 +0100
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C6A8E9.4010705@gmail.com>
Message-ID: <49C6D2B0.9060405@improva.dk>

Hi Nick

Nick Coghlan wrote:
> Jacob Holm wrote:
>
>> If I understand Nick correctly, he would like to drop the "except
>> GeneratorExit: raise" part, and possibly change BaseException to
>> Exception. I don't like the idea of just dropping the "except
>> GeneratorExit: raise", as that brings us back to the situation where
>> shared subiterators are less useful. If we also change BaseException to
>> Exception, the only difference is that it will no longer be possible to
>> throw exceptions like SystemExit and KeyboardInterrupt that don't
>> inherit from Exception to a subiterator.
>
> Note that as of 2.6, GeneratorExit doesn't inherit from Exception either
> - it now inherits directly from BaseException, just like the other two
> terminal exceptions:

I know this.

> All I'm saying is that if GeneratorExit doesn't get passed down then
> neither should SystemExit nor KeyboardInterrupt, while if the latter two
> *do* get passed down, then so should GeneratorExit.

I also know this, and I disagree.
You are saying that because they have the thing in common that they do
*not* inherit from Exception, we should treat them the same. This is
like saying that anything that is not a shade of green should be
treated as red, completely ignoring the possibility of other colors.

I like to see GeneratorExit handled as a special case by yield-from,
because:

1. It already has a special meaning in generators as the exception
   raised in the generator when close is called.

2. It *enables* certain uses of yield-from that would require much more
   work to handle otherwise. I am thinking of the ability to have
   multiple generators yield from the same iterator. Being able to
   close one generator without closing the shared iterator seems like a
   good thing.

3. While the GeneratorExit is not propagated directly, its expected
   effect of finalizing the subiterator *is*. At least in CPython, and
   assuming the subiterator does its finalization in a __del__ method,
   and that the generator holds the only reference. If the subiterator
   is actually a generator, it will even look like the GeneratorExit
   was propagated, due to the PEP 342 definition of close.

I don't like the idea of only throwing exceptions that inherit from
Exception to the subiterator, because it makes the following two
generators behave differently when thrown a non-Exception exception.

def generatorA():
    try:
        x = yield
    except BaseException, e:
        print type(e)
        raise

def generatorB():
    return (yield from generatorA())

The PEP is clearly intended to make them act identically. Quoting from
the PEP: "When the iterator is another generator, the effect is the
same as if the body of the subgenerator were inlined at the point of
the yield from expression". Treating only GeneratorExit special allows
them to behave exactly the same (in CPython).
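For reference, the semantics Jacob argues for here are what PEP 380 eventually specified in Python 3.3: exceptions other than GeneratorExit are forwarded to the subgenerator's throw(). A modernized sketch of the generatorA/generatorB pair (Python 3 names and syntax) shows a KeyboardInterrupt reaching the inner generator and propagating back out:

```python
log = []

def generator_a():
    try:
        yield "ready"
    except KeyboardInterrupt:
        log.append("inner saw KeyboardInterrupt")
        raise

def generator_b():
    return (yield from generator_a())

g = generator_b()
next(g)                          # suspend inside generator_a
try:
    g.throw(KeyboardInterrupt)   # forwarded through the yield from
except KeyboardInterrupt:
    log.append("propagated back out")

assert log == ["inner saw KeyboardInterrupt", "propagated back out"]
```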
If you only propagate exceptions that inherit from Exception, you would
have to write something like:

def generatorC():
    g = generatorA()
    while 1:
        try:
            return (yield from g)
        except Exception:
            # This exception comes from g, so just reraise
            raise
        except BaseException, e:
            # this exception was not propagated by yield-from,
            # do it manually to get the same effect
            yield g.throw(e)

I don't mind that the expansion as written in the PEP becomes very
slightly more complicated, as long as it makes the code using it
simpler to reason about.

- Jacob

From benjamin at python.org  Mon Mar 23 01:13:26 2009
From: benjamin at python.org (Benjamin Peterson)
Date: Mon, 23 Mar 2009 00:13:26 +0000 (UTC)
Subject: [Python-ideas] a identity function?
Message-ID:

I've found as I write more and more decorators I need an identity
function often. For example I might write:

def replace_maybe(reason):
    if reason == "good reason":
        return lambda x: x
    def decorator(func):
        # do fancy stuff here
    return decorator

I hate lambdas, so usually I write

def _id(x):
    return x

It'd be nice to have a shortcut in the stdlib, though. Would this go
well in the operator or functools modules?

From python at rcn.com  Mon Mar 23 01:27:26 2009
From: python at rcn.com (Raymond Hettinger)
Date: Sun, 22 Mar 2009 17:27:26 -0700
Subject: [Python-ideas] a identity function?
References:
Message-ID:

[Benjamin Peterson]
> I've found as I write more and more decorators I need an identity
> function often. For example I might write:
>
> def replace_maybe(reason):
>     if reason == "good reason":
>         return lambda x: x
>     def decorator(func):
>         # do fancy stuff here
>     return decorator
>
> I hate lambdas, so usually I write
>
> def _id(x):
>     return x
>
> It'd be nice to have a shortcut in the stdlib, though. Would this go
> well in the operator or functools modules?

-1

I and Paul Rubin considered this a long time ago. It stayed on the todo
list for a while and then fell by the wayside as its downsides became
apparent.
One problem is that in many of the places where it is tempting to use an identity function, it is just a slower way to do something that we should have used a simple if-statement for. In your example there is no cost, but it is terrible to end up with variants of map(func, iterable) where func defaults to lambda x: x.

The other issue is that different signatures were needed for different tasks:

    identity = lambda *args: args
    identity = lambda *args: args[0] if args else None
    identity = lambda x: x

Better to let people write their own trivial pass-throughs and think about the signature and time costs.

Raymond

From greg.ewing at canterbury.ac.nz  Mon Mar 23 01:49:49 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 23 Mar 2009 13:49:49 +1300
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C6A8E9.4010705@gmail.com>
References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C64CC2.1050608@improva.dk> <49C6A8E9.4010705@gmail.com>
Message-ID: <49C6DCAD.3060701@canterbury.ac.nz>

Nick Coghlan wrote:

> All I'm saying is that if GeneratorExit doesn't get passed down then
> neither should SystemExit nor KeyboardInterrupt

That would violate the inlining principle, though. An inlined generator is going to get all exceptions regardless of what they inherit from.

> , while if the latter two
> *do* get passed down, then so should GeneratorExit.

Whereas that would mean a shared subiterator would get prematurely finalized when closing the delegating generator.
So there seems to be no choice about this -- we must pass on all exceptions except GeneratorExit, and we must *not* pass on GeneratorExit itself.

--
Greg

From dangyogi at gmail.com  Mon Mar 23 02:23:19 2009
From: dangyogi at gmail.com (Bruce Frederiksen)
Date: Sun, 22 Mar 2009 21:23:19 -0400
Subject: [Python-ideas] Yield-From: Revamped expansion
In-Reply-To: <49C5781B.30102@gmail.com>
References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C5752C.2080704@canterbury.ac.nz> <49C5781B.30102@gmail.com>
Message-ID: <49C6E487.7040002@gmail.com>

Nick Coghlan wrote:
>
> I'd adjust the inner exception handlers to exploit the fact that
> SystemExit and GeneratorExit don't inherit from Exception:
>
> [...]
>     except:
>         # Covers SystemExit, GeneratorExit and
>         # anything else that doesn't inherit
>         # from Exception
>         _m = getattr(_i, 'close', None)
>         if _m is not None:
>             _m()
>         raise

This feels better to me too. Though it seems that _i.throw would be more appropriate than _i.close (except call _i.close if there is no _i.throw -- is it possible to have a close and not a throw?).

I like the idea that "finally" (in try/finally) means finally and not "maybe finally" (which boils down to finally in CPython due to the reference counting collector, but maybe finally in Jython, IronPython or PyPy).

-bruce frederiksen

From dangyogi at gmail.com  Mon Mar 23 03:40:21 2009
From: dangyogi at gmail.com (Bruce Frederiksen)
Date: Sun, 22 Mar 2009 22:40:21 -0400
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C60430.7030108@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> Message-ID: <49C6F695.1050100@gmail.com> Greg Ewing wrote: > > My feeling is that GeneratorExit is a peculiarity of > generators that other kinds of iterators shouldn't have > to know about. So, if you close() a generator, that > shouldn't imply throwing GeneratorExit into the > subiterator -- [...] It can only be "thrown into the subiterator" if the subiterator is a generator (i.e., has a throw method) -- in which case, it knows about GeneratorExit. So the hasattr(_i, 'throw') test already covers this case. > > > If the subiterator happens to be another generator, > dropping the last reference to it will cause it to > be closed, [...] NO, NO, NO. Unless you are prepared to say that programs written to this spec are *not* expected to run on any other version of Python other than CPython. CPython is the *only* version with a reference counting collector. And encouraging Python programmers to rely on this invites trouble when they try to port to any other version of Python. I know. I've been there, and have the T-shirt. And it's not pretty. The errors that you get when your finally clauses and context managers aren't run can be quite mysterious. And God help that person if they haven't slept with PEP 342 under their pillow! > Other kinds of iterators can finalize > themselves however they see fit, and don't need to > pretend they're generators and understand > GeneratorExit. 
Your PEP currently does not demand that other iterators "pretend they're generators and understand GeneratorExit". Non-generator iterators don't have throw or close methods and will remain blissfully ignorant of these finer points as the PEP stands now. So this is not a problem.

> For consistency, this implies that a GeneratorExit
> explicitly thrown in using throw() shouldn't be
> forwarded to the subiterator either, even if it has
> a throw() method.
>
> To do otherwise would require making a distinction that
> can't be expressed in the Python expansion. Also, it
> seems elegant to preserve the property that if g is a
> generator then g.close() and g.throw(GeneratorExit) are
> exactly equivalent.

Yes, g.close and g.throw(GeneratorExit) are equivalent. So you should be able to translate a close into a thrown GeneratorExit or vice versa. But if the subiterator doesn't have the first method that you look for (let's say you pick throw), then you should call the other method (if it has that one instead).

Finally, on your previous post, you say:

> It would also avoid the problem of a partially exhausted
> iterator that's still in use by something else getting
> prematurely finalized, which is another thing that's been
> bothering me.

This is a valid point. But consider:

1. The delegating generator has no way to stop the subgenerator prematurely when it uses the yield from. So the yield from can only be stopped prematurely by the delegating generator's caller. And then the subgenerator would have to be communicated between the caller and the delegating generator somehow (e.g., passed in as a parameter) so that the caller could continue to use it. (And the subgenerator has to be a generator, not a plain iterator.) Though possible, this kind of a use case would be used very rarely compared to the use case of the yield from being the final place the subgenerator is used.

2.
If finalization of the subgenerator needs to be prevented, it can be wrapped in a plain iterator wrapper that doesn't define throw or close:

    class no_finalize:
        def __init__(self, gen):
            self.gen = gen
        def __iter__(self):
            return self
        def __next__(self):
            return next(self.gen)
        def send(self, x):
            return self.gen.send(x)

    g = subgen(...)
    yield from no_finalize(g)
    ... use g

As I see it, you are faced with two options:

1. Define "yield from" in a way that it will work the same in all implementations of Python and will work for the 98% use case without any extra boilerplate code, and only require extra boilerplate (as above) for the 2% use case. or

2. Define "yield from" in a way that will have quite different behavior (for reasons very obscure to most programmers) on the different implementations of Python (due to the different implementations of garbage collectors), require boilerplate code to be portable for the 98% use case (e.g., adding a "with closing(subgen())" around the yield from); but not require any boilerplate code for portability in the 2% use case.

The only argument I can think of in favor of option 2 is that that's what the "for" statement ended up with. But that was only because changing the "for" statement to option 1 would break the legacy 2% use cases...

IMHO option 1 is the better choice.

-bruce frederiksen

From denis.spir at free.fr  Mon Mar 23 13:19:06 2009
From: denis.spir at free.fr (spir)
Date: Mon, 23 Mar 2009 13:19:06 +0100
Subject: [Python-ideas] CapPython's use of unbound methods
In-Reply-To: References: <20090312.202410.846948621.mrs@localhost.localdomain> <20090319.231249.343185657.mrs@localhost.localdomain>
Message-ID: <20090323131906.6da1e6ad@o>

Le Sun, 22 Mar 2009 15:31:15 -0700,
Guido van Rossum s'exprima ainsi:

> > To some extent the verifier's check of only accessing private
> > attributes through self is just checking a coding style that I already
> > follow when writing Python code (except sometimes for writing test
> > cases).
>
> You might wish this to be true, but for most Python programmers, it
> isn't. Introspection is a commonly-used part of the language (probably
> more so than in Java). So is the use of attribute names starting with
> a single underscore outside the class tree, e.g. by "friend"
> functions.

Just a side note. In a language that does not hold a notion of private attribute as a core feature, a "morphologic" (name-forming) convention is a great help.

I have long thought a more formal way of introducing a public interface -- if only a simple declarative line at the top of a class def -- would be better, but I recently changed my mind. I think now the privacy vs "publicity" opposition is rather relative, vague, changing. Let's take the case of any toolset/library code with several classes communicating with each other. In most cases, some attributes will be both hidden from client code and exposed to other objects of the toolset. So there are already 3 levels of privacy. If we now introduce tools of the toolset and pure client interface classes, we add two levels...

Privacy is relative, conventional so to say; in addition to relative levels, there are also qualitative differences in privacy. Some languages (esp. Java) invent hardcoded language features in a hopeless attempt to formalize all of these distinctions. The Python way of just saying "leave this alone, unless you really know what you intend to do" is probably better to cope with such unclear and variable notions.

Denis
------
la vita e estrany

From jh at improva.dk  Mon Mar 23 14:09:07 2009
From: jh at improva.dk (Jacob Holm)
Date: Mon, 23 Mar 2009 14:09:07 +0100
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C6F695.1050100@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> Message-ID: <49C789F3.30301@improva.dk> Bruce Frederiksen wrote: > Greg Ewing wrote: > [...] >> If the subiterator happens to be another generator, >> dropping the last reference to it will cause it to >> be closed, [...] > NO, NO, NO. Unless you are prepared to say that programs written to > this spec are *not* expected to run on any other version of Python > other than CPython. CPython is the *only* version with a reference > counting collector. And encouraging Python programmers to rely on this > invites trouble when they try to port to any other version of Python. > I know. I've been there, and have the T-shirt. And it's not pretty. > The errors that you get when your finally clauses and context managers > aren't run can be quite mysterious. And God help that person if they > haven't slept with PEP 342 under their pillow! Ok, got it. Relying on refcounting is bad. > [...] >> It would also avoid the problem of a partially exhausted >> iterator that's still in use by something else getting >> prematurely finalized, which is another thing that's been >> bothering me. > This is a valid point. But consider: > > 1. The delegating generator has no way to stop the subgenerator > prematurely when it uses the yield from. So the yield from can only be > stopped prematurely by the delegating generator's caller. 
And then the > subgenerator would have to be communicated between the caller to the > delegating generator somehow (e.g, passed in as a parameter) so that > the caller could continue to use it. (And the subgenerator has to be a > generator, not a plain iterator). "...subgenerator has to be a generator" is not entirely true. For example, if the subiterator doesn't have send, you can send a non-None value to the generator and that will raise an AttributeError at the yield from. If it doesn't have throw, you can even throw a StopIteration with a value to get that value as the result of the yield-from expression, which might be useful in a twisted sort of way. In both cases, the subiterator will only be closed if the yield-from expression actually closes it. So it is definitely possible to get a non-generator prematurely finalized. > Though possible, this kind of a use case would be used very rarely > compared to the use case of the yield from being the final place the > subgenerator is used. That I agree with. > > 2. If finalization of the subgenerator needs to be prevented, it can > be wrapped in a plain iterator wrapper that doesn't define throw or > close. > > class no_finalize: > def __init__(self, gen): > self.gen = gen > def __iter__(self): > return self > def __next__(self): > return next(self.gen) > def send(self, x): > return self.gen.send(x) > > g = subgen(...) > yield from no_finalize(g) > ... use g Well, if the subiterator is a generator that itself uses yield-from, the need to wrap it would destroy all possible speed benefits of using yield-from. So if there *is* a valid use case for yielding from a shared generator, this is not really a solution unless you don't care about speed. > > As I see it, you are faced with two options: > > 1. 
Define "yield from" in a way that it will work the same in all > implementations of Python and will work for the 98% use case without > any extra boilerplate code, and only require extra boilerplate (as > above) for the 2% use case. or I can live with that. This essentially means using the expansion in the PEP (with "except Exception, _e" replaced by "except BaseException, _e", to get the inlining property we all want). The decision to use explicit close will make what could have been a 2% use case much less attractive. Note that with explicit close, my argument for special-casing GeneratorExit by adding "except GeneratorExit: raise" weakens. The GeneratorExit will be delegated to the deepest generator/iterator with a throw method. As long as the iterators don't swallow the exception, they will be closed from the finally clause in the expansion. If one of them *does* swallow the exception, the outermost generator will raise a RuntimeError. The only difference that special-casing GeneratorExit would make is that 1) if the final iterator is not a generator, it won't see a GeneratorExit, and 2) if one of the iterators swallow the exception, the rest would still be closed and you might get a better traceback for the RuntimeError. > > 2. Define "yield from" in a way that will have quite different > behavior (for reasons very obscure to most programmers) on the > different implementations of Python (due to the different > implementation of garbage collectors), require boilerplate code to be > portable for the 98% use case (e.g., adding a "with closing(subgen())" > around the yield from); but not require any boilerplate code for > portability in the 2% use case. > > The only argument I can think in favor of option 2, is that's what the > "for" statement ended up with. But that was only because changing the > "for" statement to option 1 would break the legacy 2% use cases... There is also the question of speed as mentioned above, but that argument is not all that strong... 
> > IMHO option 1 is the better choice. If relying on refcounting is as bad as you say, then I agree. - Jacob From dangyogi at gmail.com Mon Mar 23 21:27:47 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Mon, 23 Mar 2009 16:27:47 -0400 Subject: [Python-ideas] Yield-From: GeneratorExit? In-Reply-To: <49C789F3.30301@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> Message-ID: <49C7F0C3.10904@gmail.com> Jacob Holm wrote: > Bruce Frederiksen wrote: >> This is a valid point. But consider: >> >> 1. The delegating generator has no way to stop the subgenerator >> prematurely when it uses the yield from. So the yield from can only >> be stopped prematurely by the delegating generator's caller. And then >> the subgenerator would have to be communicated between the caller to >> the delegating generator somehow (e.g, passed in as a parameter) so >> that the caller could continue to use it. (And the subgenerator has >> to be a generator, not a plain iterator). > "...subgenerator has to be a generator" is not entirely true. For > example, if the subiterator doesn't have send, you can send a non-None > value to the generator and that will raise an AttributeError at the > yield from. If it doesn't have throw, you can even throw a > StopIteration with a value to get that value as the result of the > yield-from expression, which might be useful in a twisted sort of way. 
> In both cases, the subiterator will only be closed if the yield-from
> expression actually closes it. So it is definitely possible to get a
> non-generator prematurely finalized.

But non-generators don't have a close (or throw) method. They lack the concept of "finalization". Only generators have these extra methods. So using a subiterator in yield from isn't an issue here. (Or am I missing something?)

> Well, if the subiterator is a generator that itself uses yield-from,
> the need to wrap it would destroy all possible speed benefits of using
> yield-from. So if there *is* a valid use case for yielding from a
> shared generator, this is not really a solution unless you don't care
> about speed.

Yes, there is a performance penalty in this case. If the wrapper were written in C, then I would think that the penalty would be negligible. Perhaps offer a C wrapper in the standard library?

> Note that with explicit close, my argument for special-casing
> GeneratorExit by adding "except GeneratorExit: raise" weakens. The
> GeneratorExit will be delegated to the deepest generator/iterator with
> a throw method. As long as the iterators don't swallow the exception,
> they will be closed from the finally clause in the expansion. If one
> of them *does* swallow the exception, the outermost generator will
> raise a RuntimeError.

Another case where close differs from throw(GeneratorExit). Close is defined in PEP 342 to raise RuntimeError if GeneratorExit is swallowed. Should the delegating generator, then, be calling close rather than throw for GeneratorExit, so that the RuntimeError is raised closer to the cause of the exception? Or does this violate the "inlining" goal of the current PEP?

-bruce frederiksen

From greg.ewing at canterbury.ac.nz  Tue Mar 24 00:07:13 2009
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 24 Mar 2009 11:07:13 +1200
Subject: [Python-ideas] Yield-From: GeneratorExit?
In-Reply-To: <49C7F0C3.10904@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> Message-ID: <49C81621.9040600@canterbury.ac.nz> Bruce Frederiksen wrote: > But non-generators don't have a close (or throw) method. They lack the > concept of "finalization". Any object could require explicit finalization in the absence of refcounting, so "close" isn't peculiar to generators. > Should the delegating generator, then, be calling close rather throw for > GeneratorExit so that the RuntimeError is raised closer to cause of the > exception? Or does this violate the "inlining" goal of the current PEP? Yes, it would violate the inlining principle. 
-- Greg From greg.ewing at canterbury.ac.nz Tue Mar 24 00:24:53 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 24 Mar 2009 11:24:53 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49C81621.9040600@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> Message-ID: <49C81A45.1070803@canterbury.ac.nz> We have a decision to make. It appears we can have *one* of the following, but not both: (1) In non-refcounting implementations, subiterators are finalized promptly when the delegating generator is explicitly closed. (2) Subiterators are not prematurely finalized when other references to them exist. Since in the majority of intended use cases the subiterator won't be shared, (1) seems like the more important guarantee to uphold. Does anyone disagree with that? Guido, what do you think? -- Greg From python at rcn.com Tue Mar 24 17:03:31 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 24 Mar 2009 09:03:31 -0700 Subject: [Python-ideas] [Python-Dev] About adding a new iterator methodcalled "shuffled" References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com> Message-ID: > On Tue, Mar 24, 2009, Roy Hyunjin Han wrote: >> >> I know that Python has iterator methods called "sorted" and "reversed" and >> these are handy shortcuts. >> >> Why not add a new iterator method called "shuffled"? 
You can already write: sorted(s, key=lambda x: random()) But nobody does that. So you have a good indication that the proposed method isn't need. Raymond From python at rcn.com Tue Mar 24 17:25:14 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 24 Mar 2009 09:25:14 -0700 Subject: [Python-ideas] [Python-Dev] About adding a new iteratormethodcalled "shuffled" References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com><20090324155828.GA15670@panix.com> Message-ID: <7D86CAF9592C415EBA74EAD5320439CC@RaymondLaptop1> > You can already write: > > sorted(s, key=lambda x: random()) > > But nobody does that. So you have a good > indication that the proposed method isn't need. s/need/needed From solipsis at pitrou.net Tue Mar 24 19:29:56 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 24 Mar 2009 18:29:56 +0000 (UTC) Subject: [Python-ideas] =?utf-8?q?=5BPython-Dev=5D_About_adding_a_new_iter?= =?utf-8?q?ator=09methodcalled=09=22shuffled=22?= References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com> Message-ID: Raymond Hettinger writes: > > You can already write: > > sorted(s, key=lambda x: random()) > > But nobody does that. So you have a good > indication that the proposed method isn't need. On the other hand, sorting is O(n.log(n)), which is probably sub-optimal for shuffling (I don't know how shuffle() is internally implemented, but ISTM that it shouldn't take more than O(n)). Note that I'm not supporting the original proposal: shuffle() is not used enough to warrant such a shortcut. Regards Antoine. 
From bmintern at gmail.com  Tue Mar 24 19:34:35 2009
From: bmintern at gmail.com (Brandon Mintern)
Date: Tue, 24 Mar 2009 14:34:35 -0400
Subject: [Python-ideas] [Python-Dev] About adding a new iterator methodcalled "shuffled"
In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com>
Message-ID: <4c0fccce0903241134i43b7a77ds2375eaf4b485d323@mail.gmail.com>

    from random import shuffle
    shuffle(s)

I think it's convenient enough as is.

Brandon

From dickinsm at gmail.com  Tue Mar 24 19:44:51 2009
From: dickinsm at gmail.com (Mark Dickinson)
Date: Tue, 24 Mar 2009 18:44:51 +0000
Subject: [Python-ideas] [Python-Dev] About adding a new iterator methodcalled "shuffled"
In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com>
Message-ID: <5c6f2a5d0903241144v68baa9dcp16275b3c88fc6d53@mail.gmail.com>

On Tue, Mar 24, 2009 at 6:29 PM, Antoine Pitrou wrote:
> On the other hand, sorting is O(n.log(n)), which is probably sub-optimal for
> shuffling (I don't know how shuffle() is internally implemented, but ISTM that
> it shouldn't take more than O(n)).

I assumed that the OP was suggesting something of the form:

    def shuffled(L):
        while L:
            i = random.randrange(len(L))
            yield L[i]
            L.pop(i)

fixed up somehow so that it's only O(1) to yield each element; in effect, an itertools version of random.sample. I could see uses for this in cases where you only want a few randomly chosen elements from a large list, but don't necessarily know in advance how many elements you need.
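One way to do that fix-up (a rough sketch, not a polished proposal): swap the chosen element to the end of the list so each removal is an O(1) pop instead of an O(n) middle delete.

```python
import random

def shuffled(L):
    # Lazily yield the elements of L in random order, O(1)
    # amortized work per element (an incremental Fisher-Yates).
    # Note: this consumes L in place; pass list(L) to keep the
    # original intact.
    while L:
        i = random.randrange(len(L))
        L[i], L[-1] = L[-1], L[i]   # move the chosen item to the end...
        yield L.pop()               # ...so removing it is O(1)
```

Stopping after k elements then costs O(k), like random.sample, but without having to pick k up front.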
Mark From steve at pearwood.info Tue Mar 24 21:20:00 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 25 Mar 2009 07:20:00 +1100 Subject: [Python-ideas] =?iso-8859-1?q?About_adding_a_new_iterator_methodc?= =?iso-8859-1?q?alled_=22shuffled=22?= In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com> Message-ID: <200903250720.00433.steve@pearwood.info> On Wed, 25 Mar 2009 03:03:31 am Raymond Hettinger wrote: > > On Tue, Mar 24, 2009, Roy Hyunjin Han wrote: > >> I know that Python has iterator methods called "sorted" and > >> "reversed" and these are handy shortcuts. > >> > >> Why not add a new iterator method called "shuffled"? > > You can already write: > > sorted(s, key=lambda x: random()) > > But nobody does that. So you have a good > indication that the proposed method isn't needed. That's nice -- not as readable as random.shuffle(s) but still nice. And fast too: on my PC, it is about twice as fast as random.shuffle() for "reasonable" sized lists (tested up to one million items). I don't think randomly shuffling a list is anywhere near common enough a task that it should be a built-in, so -1 on the OP's request, but since we're on the topic, I wonder whether the random.shuffle() implementation should use Raymond's idiom rather than the current Fisher-Yates shuffle? The advantage of F-Y is that it is O(N) instead of O(N*log N) for sorting, but the constant factor makes it actually significantly slower in practice. In addition, the F-Y shuffle is limited by the period of the random number generator: given a period P, it can randomize lists of length n where n! < P. For lists larger than n items, some permutations are unreachable. In the current implementation of random, n equals 2080. I *think* Raymond's idiom suffers from the same limitation, it's hard to imagine that it doesn't, but can anyone confirm this? 
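For what it's worth, the n = 2080 figure is easy to check directly: find the largest n whose factorial stays below the Mersenne Twister's period of 2**19937 - 1. A quick sketch using exact long arithmetic:

```python
# Fisher-Yates driven by the Mersenne Twister can only reach every
# permutation of an n-item list if n! is below the generator's
# period of 2**19937 - 1.  Python's arbitrary-precision integers
# make the exact boundary cheap to compute.
bound = 2 ** 19937
fact, n = 1, 0
while fact < bound:
    n += 1
    fact *= n          # fact == n! after this line
largest = n - 1        # largest n with n! < 2**19937
print(largest)         # -> 2080
```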
(In any case, if you're shuffling lists with more than 2080 items, and you care about the statistical properties of the result (as opposed to just "make it somewhat mixed up"), then the current implementation isn't good enough and you'll need to use your own shuffle routine.) Are there any advantages of the current F-Y implementation? It seems to me that Raymond's idiom is no worse statistically, and significantly faster in practice, so it should be the preferred implementation. Thoughts? -- Steven D'Aprano From guido at python.org Tue Mar 24 22:05:13 2009 From: guido at python.org (Guido van Rossum) Date: Tue, 24 Mar 2009 14:05:13 -0700 Subject: [Python-ideas] Builtin test function In-Reply-To: <3d0cebfb0903201303q12b27e2atc5597a4b012286ee@mail.gmail.com> References: <3d0cebfb0903190259jd02c918h31da14d4adb0b73b@mail.gmail.com> <200903192148.54461.steve@pearwood.info> <3d0cebfb0903201303q12b27e2atc5597a4b012286ee@mail.gmail.com> Message-ID: I think what you are really looking for is a standard API for finding the tests associated with a module, given the module object (or perhaps its full name), perhaps combined with a standard API for running the tests found. I don't think running tests is of such all-importance to warrant adding a built-in function that wraps both the test finding and test running APIs. But whatever you do, don't call it 'test' -- that name is overloaded too much as it is. --Guido On Fri, Mar 20, 2009 at 1:03 PM, Fredrik Johansson wrote: > On Thu, Mar 19, 2009 at 11:48 AM, Steven D'Aprano wrote: >> On Thu, 19 Mar 2009 08:59:08 pm Fredrik Johansson wrote: >>> There's been some discussion about automatic test discovery lately. >>> Here's a random (not in any way thought through) idea: add a builtin >>> function test() that runs tests associated with a given function, >>> class, module, or object. >> >> Improved testing is always welcome, but why a built-in? 
>> >> I know testing is important, but is it so common and important that we >> need it at our fingertips, so to speak, and can't even import a module >> first before running tests? What's the benefit to making it a built-in >> instead of part of a test module? > > It would just be a convenience, and I'm just throwing the idea out. > > The advantage would be a uniform and very simple interface for testing any > module, without having to know whether I should import doctest, > unittest or something else (and having to remember the commands > used by each framework). It would certainly not be a replacement for more > advanced test frameworks. > > Fredrik > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Tue Mar 24 22:13:26 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 Mar 2009 07:13:26 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49C81A45.1070803@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> Message-ID: <49C94CF6.5070301@gmail.com> Greg Ewing wrote: > We have a decision to make. 
It appears we can have > *one* of the following, but not both: > > (1) In non-refcounting implementations, subiterators > are finalized promptly when the delegating generator > is explicitly closed. > > (2) Subiterators are not prematurely finalized when > other references to them exist. > > Since in the majority of intended use cases the > subiterator won't be shared, (1) seems like the more > important guarantee to uphold. Does anyone disagree > with that? If you choose (2), then (1) is trivial to implement in code that uses the new expression in combination with existing support for deterministic finalisation. For example: with contextlib.closing(make_subiter()) as subiter: yield from subiter On the other hand, if you choose (1), then it is impossible to use that construct in combination with any other existing constructs to avoid finalisation - you have to write out the equivalent code from the PEP by hand, leaving out the finalisation parts. So I think dropping the implicit finalisation is the better option - it simplifies the new construct, and plays well with explicit finalisation when that is what people want. However, I would also recommend *not* special casing GeneratorExit in that case: just pass it down using throw. Note that non-generator iterators that want "throw" to mean the same thing as "close" can do that easily enough: def throw(self, *args): self.close() reraise(*args) (reraise itself would just do the dance to check how many arguments there were and use the appropriate form of "raise" to reraise the exception) Hmm, that does suggest another issue with the PEP however: it only calls the subiterator's throw with the value of the thrown in exception. It should be using the 3 argument form to avoid losing any passed in traceback information. Cheers, Nick. 
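The `reraise` helper in that sketch is never spelled out. A minimal version of the "dance" it describes — written here in Python 3 terms, which is an assumption since the thread is 2.x-era (in 3.x the traceback rides on the exception object instead of being a third argument to `raise`) — might look like:

```python
import sys

def reraise(tp, value=None, tb=None):
    # Re-raise an exception from a (type, value, traceback) triple, as
    # returned by sys.exc_info().  This is the Python 3 spelling of the
    # old three-argument raise statement.
    if value is None:
        value = tp()
    if tb is not None and value.__traceback__ is not tb:
        raise value.with_traceback(tb)
    raise value

# Quick self-check: the traceback from the original raise survives.
try:
    try:
        raise ValueError("boom")
    except ValueError:
        reraise(*sys.exc_info())
except ValueError as exc:
    caught = exc

assert str(caught) == "boom" and caught.__traceback__ is not None
```

With that helper, a non-generator iterator gets the close-on-throw behaviour from the `def throw(self, *args)` snippet above.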
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From steve at pearwood.info Tue Mar 24 22:40:11 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 25 Mar 2009 08:40:11 +1100 Subject: [Python-ideas] About adding a new iterator methodcalled "shuffled" In-Reply-To: <200903250720.00433.steve@pearwood.info> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903250720.00433.steve@pearwood.info> Message-ID: <200903250840.11674.steve@pearwood.info> On Wed, 25 Mar 2009 07:20:00 am Steven D'Aprano wrote: > On Wed, 25 Mar 2009 03:03:31 am Raymond Hettinger wrote: > > > On Tue, Mar 24, 2009, Roy Hyunjin Han wrote: > > >> I know that Python has iterator methods called "sorted" and > > >> "reversed" and these are handy shortcuts. > > >> > > >> Why not add a new iterator method called "shuffled"? > > > > You can already write: > > > > sorted(s, key=lambda x: random()) > > > > But nobody does that. So you have a good > > indication that the proposed method isn't needed. > > That's nice -- not as readable as random.shuffle(s) but still nice. > And fast too: on my PC, it is about twice as fast as random.shuffle() > for "reasonable" sized lists (tested up to one million items). Ah crap. Ignore the above: I made an embarrassing error in my test (neglected to actually call random inside the lambda) and so my timings were completely wrong. The current random.shuffle() is marginally faster even for small lists (500 items) so I withdraw my suggestion that it be replaced. 
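For anyone repeating the measurement (with the lambda actually *calling* random this time), a sketch using `timeit`; the absolute numbers depend on the machine, so no winner is hard-coded here:

```python
import random
import timeit

s = list(range(500))

# Raymond's one-liner: sort by a fresh random key per element.
keyed = timeit.timeit(lambda: sorted(s, key=lambda x: random.random()),
                      number=200)

# The stdlib shuffle for comparison (shuffles in place).
fy = timeit.timeit(lambda: random.shuffle(s), number=200)

print(f"sorted w/ random key: {keyed:.4f}s   random.shuffle: {fy:.4f}s")
```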
-- Steven D'Aprano From jh at improva.dk Tue Mar 24 22:44:56 2009 From: jh at improva.dk (Jacob Holm) Date: Tue, 24 Mar 2009 22:44:56 +0100 Subject: [Python-ideas] [Python-Dev] About adding a new iterator methodcalled "shuffled" In-Reply-To: <5c6f2a5d0903241144v68baa9dcp16275b3c88fc6d53@mail.gmail.com> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com> <5c6f2a5d0903241144v68baa9dcp16275b3c88fc6d53@mail.gmail.com> Message-ID: <49C95458.1040403@improva.dk> Mark Dickinson wrote: > On Tue, Mar 24, 2009 at 6:29 PM, Antoine Pitrou wrote: > >> On the other hand, sorting is O(n.log(n)), which is probably sub-optimal for >> shuffling (I don't know how shuffle() is internally implemented, but ISTM that >> it shouldn't take more than O(n)). >> It doesn't. > > I assumed that the OP was suggesting something of the form: > > def shuffled(L): > while L: > i = random.randrange(len(L)) > yield L[i] > L.pop(i) > > fixed up somehow so that it's only O(1) to yield each element; in effect, > an itertools version of random.sample. Like this, for example: def shuffled(L): L = list(L) # make a copy, so we don't mutate the argument while L: i = random.randrange(len(L)) yield L[i] L[i] = L[-1] L.pop() Or this: def shuffled(L): D = {} # use a dict to store modified values so we don't have to mutate the argument for j in xrange(len(L)-1, -1, -1): i = random.randrange(j+1) yield D.get(i, L[i]) D[i] = D.get(j, L[j]) The second is a bit slower but avoids copying the whole list up front, which should be better for the kind of uses you mention. And yes, I think it is necessary that it doesn't modify its argument. > I could see uses for this > in cases where you only want a few randomly chosen elements > from a large list, but don't necessarily know in advance how many > elements you need So could I, but I don't mind too much having to write it myself when I need it. 
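A quick sanity check of the second (dict-based) version, restated for Python 3 with `range` in place of `xrange` — an assumption about the intended semantics: it yields a permutation and never touches its argument.

```python
import random

def shuffled(L):
    # Lazy Fisher-Yates: displaced values are parked in a dict so the
    # input sequence itself is never mutated.
    D = {}
    for j in range(len(L) - 1, -1, -1):
        i = random.randrange(j + 1)
        yield D.get(i, L[i])
        D[i] = D.get(j, L[j])

original = list(range(100))
result = list(shuffled(original))
assert sorted(result) == original       # every element appears exactly once
assert original == list(range(100))     # the argument was not mutated
```

Stopping early — e.g. `list(itertools.islice(shuffled(L), k))` — gives k random elements without paying for the rest, the `random.sample`-like use mentioned above.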
- Jacob From terry at jon.es Tue Mar 24 22:55:31 2009 From: terry at jon.es (Terry Jones) Date: Tue, 24 Mar 2009 22:55:31 +0100 Subject: [Python-ideas] About adding a new iterator methodcalled "shuffled" In-Reply-To: Your message at 08:40:11 on Wednesday, 25 March 2009 References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903250720.00433.steve@pearwood.info> <200903250840.11674.steve@pearwood.info> Message-ID: <18889.22227.482403.145267@jon.es> >>>>> "Steven" == Steven D'Aprano writes: Steven> On Wed, 25 Mar 2009 07:20:00 am Steven D'Aprano wrote: >> On Wed, 25 Mar 2009 03:03:31 am Raymond Hettinger wrote: >> > > On Tue, Mar 24, 2009, Roy Hyunjin Han wrote: >> > >> I know that Python has iterator methods called "sorted" and >> > >> "reversed" and these are handy shortcuts. >> > >> >> > >> Why not add a new iterator method called "shuffled"? >> > >> > You can already write: >> > >> > sorted(s, key=lambda x: random()) >> > >> > But nobody does that. So you have a good >> > indication that the proposed method isn't needed. >> >> That's nice -- not as readable as random.shuffle(s) but still nice. And >> fast too: on my PC, it is about twice as fast as random.shuffle() for >> "reasonable" sized lists (tested up to one million items). Note that using sorting to shuffle is likely very inefficient. The sort takes O(n lg n) comparisons whereas you can do a perfect Fischer-Yates (aka Knuth) shuffle with <= n swaps. The model of computation here is different (comparisons vs swaps), but there is a vast literature on number of swaps done by sorting algorithms. In any case there's almost certainly no reason to use anything other than the standard Knuth shuffle, which is presumably what random.shuffle implements. 
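For concreteness, a textbook sketch of that shuffle — one pass from the back, at most n-1 swaps, every permutation equally likely given a perfect random source:

```python
import random

def fisher_yates(seq):
    # In-place Fisher-Yates (Knuth) shuffle: swap each position with a
    # uniformly chosen position at or before it.  O(n) swaps total.
    for j in range(len(seq) - 1, 0, -1):
        i = random.randrange(j + 1)   # 0 <= i <= j
        seq[i], seq[j] = seq[j], seq[i]
    return seq

deck = fisher_yates(list(range(52)))
assert sorted(deck) == list(range(52))
```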
Terry From python at rcn.com Tue Mar 24 23:20:39 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 24 Mar 2009 15:20:39 -0700 Subject: [Python-ideas] About adding a new iterator methodcalled"shuffled" References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com><200903250720.00433.steve@pearwood.info><200903250840.11674.steve@pearwood.info> <18889.22227.482403.145267@jon.es> Message-ID: > Note that using sorting to shuffle is likely very inefficient. Who cares? The OP's goal was to save a few programmer clock cycles so he could in-line what we already get from random.shuffle(). His request is use case challenged (very few programs would benefit and those would only save a line or two). If he actually cares about O(n) time then it's trivial to write: s = list(iterable) random.shuffle(s) for elem in s: . . . But if he wants to mush it onto one line, I gave a workable alternative. Raymond From jh at improva.dk Tue Mar 24 23:22:11 2009 From: jh at improva.dk (Jacob Holm) Date: Tue, 24 Mar 2009 23:22:11 +0100 Subject: [Python-ideas] About adding a new iterator methodcalled "shuffled" In-Reply-To: <18889.22227.482403.145267@jon.es> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903250720.00433.steve@pearwood.info> <200903250840.11674.steve@pearwood.info> <18889.22227.482403.145267@jon.es> Message-ID: <49C95D13.5030201@improva.dk> Terry Jones wrote: > Steven> On Wed, 25 Mar 2009 07:20:00 am Steven D'Aprano wrote: > >>> On Wed, 25 Mar 2009 03:03:31 am Raymond Hettinger wrote: >>> >>>>> On Tue, Mar 24, 2009, Roy Hyunjin Han wrote: >>>>> >>>>>> I know that Python has iterator methods called "sorted" and >>>>>> "reversed" and these are handy shortcuts. >>>>>> >>>>>> Why not add a new iterator method called "shuffled"? >>>>>> >>>> You can already write: >>>> >>>> sorted(s, key=lambda x: random()) >>>> >>>> But nobody does that. 
So you have a good >>>> indication that the proposed method isn't needed. >>>> >>> That's nice -- not as readable as random.shuffle(s) but still nice. And >>> fast too: on my PC, it is about twice as fast as random.shuffle() for >>> "reasonable" sized lists (tested up to one million items). >>> > > Note that using sorting to shuffle is likely very inefficient. > > The sort takes O(n lg n) comparisons whereas you can do a perfect > Fischer-Yates (aka Knuth) shuffle with <= n swaps. The model of > computation here is different (comparisons vs swaps), but there is a vast > literature on number of swaps done by sorting algorithms. In any case > there's almost certainly no reason to use anything other than the standard > Knuth shuffle, which is presumably what random.shuffle implements. > > It is, I just checked. Other than implementing it in C, I don't see any way of significantly speeding this up. - Jacob From terry at jon.es Tue Mar 24 23:46:22 2009 From: terry at jon.es (Terry Jones) Date: Tue, 24 Mar 2009 23:46:22 +0100 Subject: [Python-ideas] About adding a new iterator methodcalled"shuffled" In-Reply-To: Your message at 15:20:39 on Tuesday, 24 March 2009 References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903250720.00433.steve@pearwood.info> <200903250840.11674.steve@pearwood.info> <18889.22227.482403.145267@jon.es> Message-ID: <18889.25278.144509.586820@jon.es> >>>>> "Raymond" == Raymond Hettinger writes: >> Note that using sorting to shuffle is likely very inefficient. Raymond> Who cares? The OP's goal was to save a few programmer clock Raymond> cycles so he could in-line what we already get from Raymond> random.shuffle(). Who cares? Jeez... did I say something to get your hackles up? I'm not sure if I see the original posting, but the one you first reference in the mailing list archives doesn't say anything about saving clock cycles. 
Supposing that is what he was after, posting a cute but O(n lg n) alternative without saying it's highly inefficient is directly counter to what you say he was looking for. The reason I even said anything was because someone (Roy?) then said "that's nice". That's like someone saying oh, you could do it like this with bubblesort, someone else saying "that's nice", and there the record stands, awaiting future generations of uneducated programmers. Anyway, apologies if you don't care or for commenting out loud on something that was perhaps obvious to everyone. BTW, I hadn't noticed Antoine's earlier message amounting to the same thing. He seems to care too :-) Terry From greg.ewing at canterbury.ac.nz Wed Mar 25 01:48:14 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 25 Mar 2009 12:48:14 +1200 Subject: [Python-ideas] About adding a new iterator methodcalled "shuffled" In-Reply-To: <200903250720.00433.steve@pearwood.info> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com> <200903250720.00433.steve@pearwood.info> Message-ID: <49C97F4E.8010200@canterbury.ac.nz> Steven D'Aprano wrote: > In addition, the F-Y shuffle is limited by the period of the random > number generator: *All* shuffling algorithms are limited by that. Think about it: A shuffling algorithm is a function from a random number to a permutation. There's no way you can get more permutations out than there are random numbers to put in. 
-- Greg From terry at jon.es Wed Mar 25 02:06:10 2009 From: terry at jon.es (Terry Jones) Date: Wed, 25 Mar 2009 02:06:10 +0100 Subject: [Python-ideas] About adding a new iterator methodcalled "shuffled" In-Reply-To: Your message at 12:48:14 on Wednesday, 25 March 2009 References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com> <200903250720.00433.steve@pearwood.info> <49C97F4E.8010200@canterbury.ac.nz> Message-ID: <18889.33666.764025.595818@jon.es> >>>>> "Greg" == Greg Ewing writes: Greg> *All* shuffling algorithms are limited by that. Greg> Think about it: A shuffling algorithm is a function from a random Greg> number to a permutation. There's no way you can get more permutations Greg> out than there are random numbers to put in. Hi Greg Maybe we should put a note to that effect in random.shuffle.__doc__ :-) http://mail.python.org/pipermail/python-dev/2006-June/065815.html Regards, Terry From python at rcn.com Wed Mar 25 02:59:27 2009 From: python at rcn.com (Raymond Hettinger) Date: Tue, 24 Mar 2009 18:59:27 -0700 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com><20090324155828.GA15670@panix.com><200903250720.00433.steve@pearwood.info><49C97F4E.8010200@canterbury.ac.nz> <18889.33666.764025.595818@jon.es> Message-ID: <87537F9E9CDC422ABA43F7A588083309@RaymondLaptop1> > Greg> Think about it: A shuffling algorithm is a function from a random > Greg> number to a permutation. There's no way you can get more permutations > Greg> out than there are random numbers to put in. If our random number generator can produce more possible shuffles than there are atoms in the universe, I say you don't worry about it. 
Raymond From greg.ewing at canterbury.ac.nz Wed Mar 25 07:38:26 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 25 Mar 2009 18:38:26 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49C94CF6.5070301@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> Message-ID: <49C9D162.5040907@canterbury.ac.nz> Nick Coghlan wrote: > Greg Ewing wrote: > >>(1) In non-refcounting implementations, subiterators >>are finalized promptly when the delegating generator >>is explicitly closed. >> >>(2) Subiterators are not prematurely finalized when >>other references to them exist. > > If you choose (2), then (1) is trivial to implement > > with contextlib.closing(make_subiter()) as subiter: > yield from subiter That's a fairly horrendous thing to expect people to write around all their yield-froms, though. It also means we would have to say that the inlining principle only holds for refcounting implementations. Maybe we should just give up trying to accommodate shared subiterators. Is it worth complicating everything for the sake of something that's not really part of the intended set of use cases? > Hmm, that does suggest another issue with the PEP however: it only calls > the subiterator's throw with the value of the thrown in exception. 
It > should be using the 3 argument form to avoid losing any passed in > traceback information. Good point, I'll update the expansion accordingly. -- Greg From solipsis at pitrou.net Wed Mar 25 12:06:06 2009 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 25 Mar 2009 11:06:06 +0000 (UTC) Subject: [Python-ideas] =?utf-8?q?About_adding_a_new_iterator_methodcalled?= =?utf-8?b?CSJzaHVmZmxlZCI=?= References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com> <200903250720.00433.steve@pearwood.info> <49C97F4E.8010200@canterbury.ac.nz> Message-ID: Greg Ewing writes: > > > In addition, the F-Y shuffle is limited by the period of the random > > number generator: > > *All* shuffling algorithms are limited by that. > > Think about it: A shuffling algorithm is a function > from a random number to a permutation. There's no > way you can get more permutations out than there are > random numbers to put in. The period of the generator should be (much) larger than the number of possible random numbers, because of the generator's internal state. 
(I'm not sure I understood your sentence as you meant it though) From ncoghlan at gmail.com Wed Mar 25 13:17:54 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 Mar 2009 22:17:54 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49C9D162.5040907@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> Message-ID: <49CA20F2.7040207@gmail.com> Greg Ewing wrote: > Nick Coghlan wrote: >> Greg Ewing wrote: >> >>> (1) In non-refcounting implementations, subiterators >>> are finalized promptly when the delegating generator >>> is explicitly closed. >>> >>> (2) Subiterators are not prematurely finalized when >>> other references to them exist. >> >> If you choose (2), then (1) is trivial to implement >> >> with contextlib.closing(make_subiter()) as subiter: >> yield from subiter > > That's a fairly horrendous thing to expect people to > write around all their yield-froms, though. It also > means we would have to say that the inlining principle > only holds for refcounting implementations. > > Maybe we should just give up trying to accommodate > shared subiterators. Is it worth complicating > everything for the sake of something that's not > really part of the intended set of use cases? 
Consider what happens if you replace the 'yield from' with the basic form of iterator delegation that exists now: for x in make_subiter(): yield x Is such code wrong in any way? No it isn't. Failing to finalise the object of iteration is the *normal* case. If for some reason it is important in a given application to finalise it properly (e.g. the subiter opens a database connection or file and we want to ensure they are closed promptly no matter what else happens), only *then* does deterministic finalisation come into play: with closing(make_subiter()) as subiter: for x in subiter: yield x That is, I now believe the 'normal' case for 'yield from' should be modelled on basic iteration, which means no implicit finalisation. Now, keep in mind that in parallel with this I am now saying that *all* exceptions, *including GeneratorExit* should be passed down to the subiterator if it has a throw() method. So even without implicit finalisation you can use "yield from" to nest generators to your heart's content and an explicit close on the outermost generator will be passed down to the innermost generator and unwind the generator stack from there. 
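That unwinding behaviour can be sketched with the proposed syntax itself — runnable only on a Python that implements the PEP's "yield from", so this illustrates the intended semantics rather than any current patch:

```python
events = []

def inner():
    try:
        yield 1
        yield 2
    except GeneratorExit:
        events.append('inner finalised')
        raise   # let the close complete normally

def outer():
    yield from inner()
    events.append('never reached')

g = outer()
assert next(g) == 1   # suspended inside inner()
g.close()             # GeneratorExit travels down to inner()
assert events == ['inner finalised']
```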
Using your "no finally clause" version from earlier in this thread as the base for the exact semantic description: _i = iter(EXPR) try: _u = _i.next() except StopIteration, _e: _r = _e.value else: while 1: try: _v = yield _u except BaseException, _e: _m = getattr(_i, 'throw', None) if _m is not None: _u = _m(_e) else: raise else: try: if _v is None: _u = _i.next() else: _u = _i.send(_v) except StopIteration, _e: _r = _e.value break RESULT = _r With an expansion of that form, you can easily make arbitrary iterators (including generators) shareable by wrapping them in an iterator with no throw or send methods: class ShareableIterator(object): def __init__(self, itr): self.itr = itr def __iter__(self): return self def __next__(self): return self.itr.next() next = __next__ # Be 2.x friendly def close(self): # Still support explicit finalisation of the # shared iterator, just not throw() or send() try: close_itr = self.itr.close except AttributeError: pass else: close_itr() # Decorator to use the above on a generator function def shareable(g): @functools.wraps(g) def wrapper(*args, **kwds): return ShareableIterator(g(*args, **kwds)) return wrapper Iterators that need finalisation can either make themselves implicitly closable in yield from expressions by defining a throw() method that delegates to close() and then reraises the exception appropriately, or else they can recommend explicit closure regardless of the means of iteration (be it a for loop, a generator expression or container comprehension, manual iteration or the new yield from expression). Cheers, Nick. 
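Restating the wrapper in Python 3 terms (an assumption — the sketch above is 2.x-flavoured), it can be exercised by sharing one generator between two consumers:

```python
import functools

class ShareableIterator:
    # Exposes only __next__ and close(); with no throw()/send(), a
    # delegating construct cannot finalise the iterator prematurely.
    def __init__(self, itr):
        self.itr = itr
    def __iter__(self):
        return self
    def __next__(self):
        return next(self.itr)
    def close(self):
        closer = getattr(self.itr, 'close', None)
        if closer is not None:
            closer()

def shareable(g):
    @functools.wraps(g)
    def wrapper(*args, **kwds):
        return ShareableIterator(g(*args, **kwds))
    return wrapper

@shareable
def numbers():
    for i in range(6):
        yield i

shared = numbers()
first = [next(shared) for _ in range(3)]   # one consumer takes 0, 1, 2
rest = list(shared)                        # another drains 3, 4, 5
assert (first, rest) == ([0, 1, 2], [3, 4, 5])
shared.close()                             # explicit finalisation still works
```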
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From steve at pearwood.info Wed Mar 25 13:28:48 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 25 Mar 2009 23:28:48 +1100 Subject: [Python-ideas] =?iso-8859-1?q?About_adding_a_new_iteratormethodca?= =?iso-8859-1?q?lled_=22shuffled=22?= In-Reply-To: <87537F9E9CDC422ABA43F7A588083309@RaymondLaptop1> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <18889.33666.764025.595818@jon.es> <87537F9E9CDC422ABA43F7A588083309@RaymondLaptop1> Message-ID: <200903252328.49177.steve@pearwood.info> On Wed, 25 Mar 2009 12:59:27 pm Raymond Hettinger wrote: > > Greg> Think about it: A shuffling algorithm is a function from a > > random Greg> number to a permutation. There's no way you can get > > more permutations Greg> out than there are random numbers to put > > in. > > If our random number generator can produce more possible shuffles > than there are atoms in the universe, I say you don't worry about it. No, I'm afraid that is a fallacy, because what is important is the number of permutations in the list, and that grows as the factorial of the number of items. The Mersenne Twister has a period of 2**19937-1, which sounds huge, but it takes a list of only 2081 items for the number of permutations to exceed that. To spell it out in tedious detail: that means that random.shuffle() can produce every permutation of a list of 2080 items, but for 2081 items approximately 98% of the possibilities can't be reached. For 2082 items, approx 99.999% will never be reached. And so on. Don't get me wrong, random.shuffle() is perfectly adequate for any use-case I can think of. But beyond 2080 items in the list, it becomes greatly biased, and I think that's important to note in the docs. Those who need to know about it will be told, and those who don't care can continue to not care. 
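The 2080/2081 threshold is easy to verify with exact integer arithmetic — 2081! is the first factorial past the Mersenne Twister period:

```python
import math

period = 2 ** 19937 - 1   # Mersenne Twister period

assert math.factorial(2080) < period   # 2080! permutations fit in the period
assert math.factorial(2081) > period   # 2081! permutations no longer do

# Upper bound on the reachable fraction of 2081-item orderings.
# (CPython's int/int true division is correctly rounded even when the
# operands themselves are far too large for a float.)
reachable = period / math.factorial(2081)
print(f"at most {reachable:.2%} of 2081! orderings can occur")
```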
-- Steven D'Aprano From ncoghlan at gmail.com Wed Mar 25 13:34:49 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 25 Mar 2009 22:34:49 +1000 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <87537F9E9CDC422ABA43F7A588083309@RaymondLaptop1> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com><20090324155828.GA15670@panix.com><200903250720.00433.steve@pearwood.info><49C97F4E.8010200@canterbury.ac.nz> <18889.33666.764025.595818@jon.es> <87537F9E9CDC422ABA43F7A588083309@RaymondLaptop1> Message-ID: <49CA24E9.3020706@gmail.com> Raymond Hettinger wrote: > >> Greg> Think about it: A shuffling algorithm is a function from a random >> Greg> number to a permutation. There's no way you can get more >> permutations >> Greg> out than there are random numbers to put in. > > If our random number generator can produce more possible shuffles than > there are atoms in the universe, I say you don't worry about it. The "long int too large to convert to float" error that I got on my first attempt at printing 2080! in scientific notation is also something of a hint :) The decimal module came to my rescue though: >>> +Decimal(math.factorial(2080)) Decimal('1.983139957541900373849131897E+6000') That's one heck of a big number! Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From jh at improva.dk Wed Mar 25 15:31:05 2009 From: jh at improva.dk (Jacob Holm) Date: Wed, 25 Mar 2009 15:31:05 +0100 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CA20F2.7040207@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> Message-ID: <49CA4029.6050703@improva.dk> Nick Coghlan wrote: > [snip arguments for modelling on basic iteration] > That is, I now believe the 'normal' case for 'yield from' should be > modelled on basic iteration, which means no implicit finalisation. > > Now, keep in mind that in parallel with this I am now saying that *all* > exceptions, *including GeneratorExit* should be passed down to the > subiterator if it has a throw() method. > I still think that is less useful than catching it and just dropping the reference, see below. > So even without implicit finalisation you can use "yield from" to nest > generators to your heart's content and an explicit close on the > outermost generator will be passed down to the innermost generator and > unwind the generator stack from there. 
> The same would happen with the *implicit* close caused by the last reference to the outermost generator going away. Delegating the GeneratorExit is a sure way to premature finalization when using shared generators, but only in a refcounting implementation like C-Python. That makes this the only feature I know of that would be *more* useful in a non-refcounting implementation. > Using your "no finally clause" version from earlier in this thread as > the base for the exact semantic description: > > _i = iter(EXPR) > try: > _u = _i.next() > except StopIteration, _e: > _r = _e.value > else: > while 1: > try: > _v = yield _u > except BaseException, _e: > _m = getattr(_i, 'throw', None) > if _m is not None: > _u = _m(_e) > else: > raise > else: > try: > if _v is None: > _u = _i.next() > else: > _u = _i.send(_v) > except StopIteration, _e: > _r = _e.value > break > RESULT = _r > > I know I didn't comment on that expansion earlier, but should have. It fails to handle the case where the throw raises a StopIteration (or there is no throw method and the thrown exception is a StopIteration). You need something like: _i = iter(EXPR) try: _u = _i.next() while 1: try: _v = yield _u # except GeneratorExit: # raise except BaseException: _m = getattr(_i, 'throw', None) if _m is not None: _u = _m(*sys.exc_info()) else: raise else: if _v is None: _u = _i.next() else: _u = _i.send(_v) except StopIteration, _e: RESULT = _e.value finally: _i = _u = _v = _e = _m = None del _i, _u, _v, _e, _m This is independent of the GeneratorExit issue, but I put it in there as a comment just to make it clear what *I* think it should be if we are not putting a close in the finally clause. If we *do* put a call to close in the finally clause, the premature finalization of shared generators is guaranteed anyway, so there is not much point in specialcasing GeneratorExit. 
> With an expansion of that form, you can easily make arbitrary iterators > (including generators) shareable by wrapping them in an iterator with no > throw or send methods: > > class ShareableIterator(object): > def __init__(self, itr): > self.itr = itr > def __iter__(self): > return self > def __next__(self): > return self.itr.next() > next = __next__ # Be 2.x friendly > def close(self): > # Still support explicit finalisation of the > # shared iterator, just not throw() or send() > try: > close_itr = self.itr.close > except AttributeError: > pass > else: > close_itr() > > # Decorator to use the above on a generator function > def shareable(g): > @functools.wraps(g) > def wrapper(*args, **kwds): > return ShareableIterator(g(*args, **kwds)) > return wrapper > With this wrapper, you will not be able to throw *any* exceptions to the shared iterator. Even if you fix the wrapper to pass through all other exceptions than GeneratorExit, you will still completely lose the speed benefits of yield-from when doing so. (For next, send, and throw it is possible to completely bypass all the intervening generators, so the call overhead becomes independent of the number of generators in the yield-from chain. I have a patch that does exactly this, working except for details related to this discussion). It is not possible to write such a wrapper efficiently without making it a builtin and special-casing it in the yield-from implementation, and I don't think that is a good idea. > Iterators that need finalisation can either make themselves implicitly > closable in yield from expressions by defining a throw() method that > delegates to close() and then reraises the exception appropriately, or > else they can recommend explicit closure regardless of the means of > iteration (be it a for loop, a generator expression or container > comprehension, manual iteration or the new yield from expression). 
> A generator or iterator that needs closing should recommend explicit closing *anyway* to work correctly in other contexts on platforms other than C-Python. Not delegating GeneratorExit just happens to make it much simpler and faster to use shared generators/iterators that *don't* need immediate finalization. In C-Python you even get the finalization for free due to the refcounting, but of course relying on that is generally considered a bad idea. - Jacob From qrczak at knm.org.pl Wed Mar 25 22:41:27 2009 From: qrczak at knm.org.pl (Marcin 'Qrczak' Kowalczyk) Date: Wed, 25 Mar 2009 22:41:27 +0100 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <200903252328.49177.steve@pearwood.info> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <18889.33666.764025.595818@jon.es> <87537F9E9CDC422ABA43F7A588083309@RaymondLaptop1> <200903252328.49177.steve@pearwood.info> Message-ID: <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> On Wed, Mar 25, 2009 at 13:28, Steven D'Aprano wrote: > Don't get me wrong, random.shuffle() is perfectly adequate for any > use-case I can think of. But beyond 2080 items in the list, it becomes > greatly biased, and I think that's important to note in the docs. Those > who need to know about it will be told, and those who don't care can > continue to not care. Why anyone would care? Orderings possible to obtain from a given good random number generator are quite uniformly distributed among all orderings. I bet you can't even predict any particular ordering which is impossible to obtain. There is no time to generate all orderings. The factorial of large numbers is just huge. 
-- Marcin Kowalczyk qrczak at knm.org.pl http://qrnik.knm.org.pl/~qrczak/ From greg.ewing at canterbury.ac.nz Wed Mar 25 22:47:18 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Mar 2009 09:47:18 +1200 Subject: [Python-ideas] About adding a new iterator methodcalled "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <20090324155828.GA15670@panix.com> <200903250720.00433.steve@pearwood.info> <49C97F4E.8010200@canterbury.ac.nz> Message-ID: <49CAA666.2020606@canterbury.ac.nz> Antoine Pitrou wrote: > The period of the generator should be (much) larger than the number of possible > random numbers, because of the generator's internal state. Hm, yes, I should have said a function from an RNG state to a permutation. The initial state of the RNG completely determines the permutation generated, so there can't be more permutations than states. -- Greg From greg.ewing at canterbury.ac.nz Wed Mar 25 23:02:55 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Mar 2009 10:02:55 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CA20F2.7040207@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> Message-ID: <49CAAA0F.7090507@canterbury.ac.nz> Nick Coghlan wrote: 
> That is, I now believe the 'normal' case for 'yield from' should be > modelled on basic iteration, which means no implicit finalisation. > > Now, keep in mind that in parallel with this I am now saying that *all* > exceptions, *including GeneratorExit* should be passed down to the > subiterator if it has a throw() method. But those two things are contradictory. In a refcounting Python implementation, dropping the last reference to the delegating generator will cause it to close() itself, thus throwing a GeneratorExit into the subiterator. If other references to the subiterator still exist, this means it gets prematurely finalized. > With an expansion of that form, you can easily make arbitrary iterators > (including generators) shareable by wrapping them in an iterator with no > throw or send methods: But if you need explicit wrappers to prevent finalization, then you hardly have "no implicit finalization". So I'm a bit confused about what behaviour you're really asking for. -- Greg From greg.ewing at canterbury.ac.nz Thu Mar 26 00:35:34 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Mar 2009 11:35:34 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CA4029.6050703@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> 
<49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> Message-ID: <49CABFC6.1080207@canterbury.ac.nz> Jacob Holm wrote: > It > fails to handle the case where the throw raises a StopIteration (or > there is no throw method and the thrown exception is a StopIteration). No, I think it does the right thing in that case. By the inlining principle, the StopIteration should be thrown in like anything else, and if it propagates back out, it should stop the delegating generator, *not* the subiterator. -- Greg From jh at improva.dk Thu Mar 26 00:40:46 2009 From: jh at improva.dk (Jacob Holm) Date: Thu, 26 Mar 2009 00:40:46 +0100 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CABFC6.1080207@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> Message-ID: <49CAC0FE.5010305@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: >> It fails to handle the case where the throw raises a StopIteration >> (or there is no throw method and the thrown exception is a >> StopIteration). > > No, I think it does the right thing in that case. 
By the > inlining principle, the StopIteration should be thrown > in like anything else, and if it propagates back out, > it should stop the delegating generator, *not* the > subiterator. > But if you throw another exception and it is converted to a StopIteration by the subiterator, this should definitely stop the subiterator and get a return value. Or? - Jacob From steve at pearwood.info Thu Mar 26 00:58:58 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 26 Mar 2009 10:58:58 +1100 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> Message-ID: <200903261058.59164.steve@pearwood.info> On Thu, 26 Mar 2009 08:41:27 am Marcin 'Qrczak' Kowalczyk wrote: > On Wed, Mar 25, 2009 at 13:28, Steven D'Aprano wrote: > > Don't get me wrong, random.shuffle() is perfectly adequate for any > > use-case I can think of. But beyond 2080 items in the list, it > > becomes greatly biased, and I think that's important to note in the > > docs. Those who need to know about it will be told, and those who > > don't care can continue to not care. > > Why anyone would care? Orderings possible to obtain from a given good > random number generator are quite uniformly distributed among all > orderings. Yes, that holds true for n <= 2080, since Fisher-Yates is an unbiased shuffler. But I don't think it remains true for n > 2080 since the vast majority of possible permutations have probability zero. I'm not saying that this absolutely *will* introduce statistical bias into the shuffled lists, but it could, and those who care about that risk shouldn't have to read the source code to learn this. > I bet you can't even predict any particular ordering which > is impossible to obtain. A moral dilemma... 
should I take advantage of your innumeracy by taking you up on that bet, or should I explain why that bet is a sure thing for me? *wink* Since the chances of me collecting on the bet is essentially near zero, I'll explain. For a list with 2082 items, shuffle() chooses from a subset of approximately 0.001% of all possible permutations. This means that if I give you a list of 2082 items and tell you to shuffle it, and then guess that such-and-such a permutation of it will never be reached, I can only lose if by chance I guessed on the 1 in 100,000 permutations that shuffle() can reach. I have 99,999 chances to win versus 1 to lose: that's essentially a sure thing. In practical terms, beyond (say) 2085 or so, it would be a bona fide miracle if I didn't win such a bet. -- Steven D'Aprano From greg.ewing at canterbury.ac.nz Thu Mar 26 01:24:25 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Mar 2009 12:24:25 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CAC0FE.5010305@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> Message-ID: <49CACB39.3020708@canterbury.ac.nz> Jacob Holm wrote: > But if you throw another 
exception and it is converted to a > StopIteration by the subiterator, this should definitely stop the > subiterator and get a return value. Not if it simply raises a StopIteration from the throw call. It would have to mark itself as completed, return normally from the throw and then raise StopIteration on the next call to next() or send(). -- Greg From jh at improva.dk Thu Mar 26 01:50:37 2009 From: jh at improva.dk (Jacob Holm) Date: Thu, 26 Mar 2009 01:50:37 +0100 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CACB39.3020708@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> Message-ID: <49CAD15D.2090008@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: > >> But if you throw another exception and it is converted to a >> StopIteration by the subiterator, this should definitely stop the >> subiterator and get a return value. > > Not if it simply raises a StopIteration from the > throw call. It would have to mark itself as > completed, return normally from the throw and > then raise StopIteration on the next call to > next() or send(). > One of us must be missing something... 
If the subiterator is exhausted before the throw, there won't *be* a value to return from the call, so the only options for the throw method are to raise StopIteration, or to raise some other exception. Example:

    def inner():
        try:
            yield 1
        except ValueError:
            pass
        return 2

    def outer():
        v = yield from inner()
        yield v

    g = outer()
    print g.next()             # prints 1
    print g.throw(ValueError)  # prints 2

In your expansion, the StopIteration raised by inner escapes the outer generator as well, so we get a StopIteration instead of the second print that I would expect. Can you explain in a little more detail how the inlining argument makes you want to not catch a StopIteration escaping from throw?

- Jacob

From ncoghlan at gmail.com Thu Mar 26 02:56:43 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 26 Mar 2009 11:56:43 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CAAA0F.7090507@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> Message-ID: <49CAE0DB.3090104@gmail.com>

Greg Ewing wrote:
> But if you need explicit wrappers to prevent finalization,
> then you hardly have "no implicit finalization". So I'm a
> bit confused about what behaviour you're really asking for.
I should have said no *new* mechanism for implicit finalisation. Deletion of the outer generator would, as you say, still call close() and throw GeneratorExit in. I like it because the rules are simple: either an exception is thrown in and passed down to the subiterator (which may have the effect of finalising it), or else the subiterator is left alone (to be finalised either explicitly or implicitly when it is deleted). There's then no special case along the lines of "if GeneratorExit is passed in we just drop our reference to the subiterator instead of passing the exception down", or "if you iterate over a subiterator using 'yield from' instead of a for loop then the subiterator will automatically be closed at the end of the expression". No matter what you do with regards to finalisation, you're going to demand extra work from somebody. The simple rule means that subiterators will see all exceptions (even GeneratorExit), allowing them to handle their own finalisation needs, while shareable subiterators are also possible so long as they don't have throw() methods. The idea of a shareable iterator that *does* support send() or throw() just doesn't make any sense to me. Splitting up a data feed amongst multiple peer consumers, OK, that's fairly straightforward and I can easily imagine uses for it in a generator based coding style (e.g. having multiple clients pulling requests from a job queue). But having multiple peer writers attempting to feed values or exceptions back into that single iterator that can neither tell which writer a particular value or exception came from, nor direct results to particular consumers? That sounds like utter insanity. If you want to create a shareable iterator, preventing use of send() and throw() strikes me as a *very* good idea. Cheers, Nick. 
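As a toy sketch of the "multiple clients pulling requests from a job queue" pattern mentioned above (hypothetical names; plain iteration only, with no send() or throw() involved):

```python
jobs = iter(range(6))  # a single shared source of work

def worker(name, source):
    # Each worker pulls lazily from the *same* underlying iterator,
    # so every job is handed out exactly once.
    for job in source:
        yield "%s:%s" % (name, job)

a = worker("a", jobs)
b = worker("b", jobs)

out = []
for _ in range(3):
    out.append(next(a))
    out.append(next(b))
print(out)  # ['a:0', 'b:1', 'a:2', 'b:3', 'a:4', 'b:5']
```

Since neither consumer ever throws into the shared source, the question of which one "owns" its finalization never arises until both are done with it.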
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Thu Mar 26 03:17:24 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 26 Mar 2009 12:17:24 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CAD15D.2090008@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> Message-ID: <49CAE5B4.4080005@gmail.com> Jacob Holm wrote: > Greg Ewing wrote: >> Jacob Holm wrote: >> >>> But if you throw another exception and it is converted to a >>> StopIteration by the subiterator, this should definitely stop the >>> subiterator and get a return value. >> >> Not if it simply raises a StopIteration from the >> throw call. It would have to mark itself as >> completed, return normally from the throw and >> then raise StopIteration on the next call to >> next() or send(). >> > One of us must be missing something... 
If the subiterator is exhausted > before the throw, there won't *be* a value to return from the call so > the only options for the throw method are to raise StopIteraton, or to > raise some other exception. I agree with Jacob here - contextlib.contextmanager contains a similar check in its __exit__ method. The thing to check for is the throw method call raising StopIteration and that StopIteration instance being a *different* exception from the one that was thrown in. (This matters more in the contextmanager case, since it is quite legitimate for a generator to finish and raise StopIteration from inside a with statement, so the contextmanager needs to avoid accidentally suppressing that exception). Avoiding the problem of suppressing thrown in StopIteration instances means we still need multiple inner try/except blocks rather than a large outer one. There is also another special case to consider: since a permitted response to "throw(GeneratorExit)" is for the iterator to just terminate instead of reraising GeneratorExit, the thrown in exception should be reraised unconditionally in that situation. 
So the semantics would then become:

    _i = iter(EXPR)
    try:
        _u = _i.next()
    except StopIteration as _e:
        _r = _e.value
    else:
        while 1:
            try:
                _v = yield _u
            except:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _et, _ev, _tb = sys.exc_info()
                    try:
                        _u = _m(_et, _ev, _tb)
                    except StopIteration as _e:
                        if _e is _ev or _et is GeneratorExit:
                            # Don't suppress a thrown in
                            # StopIteration and handle the
                            # case where a subiterator
                            # handles GeneratorExit by
                            # terminating rather than
                            # reraising the exception
                            raise
                        # The thrown in exception
                        # terminated the iterator
                        # gracefully
                        _r = _e.value
                        break
                else:
                    raise
            else:
                try:
                    if _v is None:
                        _u = _i.next()
                    else:
                        _u = _i.send(_v)
                except StopIteration as _e:
                    _r = _e.value
                    break
    RESULT = _r

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia ---------------------------------------------------------------

From guido at python.org Thu Mar 26 04:56:46 2009 From: guido at python.org (Guido van Rossum) Date: Wed, 25 Mar 2009 20:56:46 -0700 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49C81A45.1070803@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> Message-ID:

On Mon, Mar 23, 2009 at 4:24 PM, Greg Ewing wrote:
> We have a decision to make. It appears we can have
> *one* of the following, but not both:
>
> (1) In non-refcounting implementations, subiterators
> are finalized promptly when the delegating generator
> is explicitly closed.
>
> (2) Subiterators are not prematurely finalized when
> other references to them exist.
>
> Since in the majority of intended use cases the
> subiterator won't be shared, (1) seems like the more
> important guarantee to uphold. Does anyone disagree
> with that?
> > Guido, what do you think? Gee, I'm actually glad I waited a while, because the following discussion shows that this is a really hairy issue... I think (1) means propagating GeneratorExit into the subgenerator (and recursively if that's also waiting in a yield-from), while (2) would mean not propagating it, right? I agree that (1) seems to make more sense unless you can think of a use case for (2) -- and it seems from Nick's last post that such a use case would have to be rather horrendously outrageous... -- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Thu Mar 26 06:40:46 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Mar 2009 17:40:46 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CAD15D.2090008@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> Message-ID: <49CB155E.4040504@canterbury.ac.nz> Jacob Holm wrote: > Can you explain in a little more detail how the > inlining argument makes you want to not catch a StopIteration escaping > from throw? 
It's easier to see if we use an example that doesn't involve a return value, since it's clearer what "inlining" means in that case.

    def inner():
        try:
            yield 1
        except ValueError:
            pass

    def outer():
        print "About to yield from inner"
        yield from inner()
        print "Finished yielding from inner"

Now if we inline that, we get:

    def outer_and_inner():
        print "About to yield from inner"
        try:
            yield 1
        except ValueError:
            pass
        print "Finished yielding from inner"

What would you expect that to do if you throw StopIteration into it while it's suspended at the yield?

However, thinking about the return value case has made me realize that it's not so obvious what "inlining" means then. To get the return value in your example, one way would be to perform the inlining like this:

    def outer():
        try:
            try:
                yield 1
            except ValueError:
                pass
            raise StopIteration(2)
        except StopIteration, e:
            v = e.value
        yield v

which results in the behaviour you are expecting. However, if you were inlining an ordinary function, that's not how you would handle a return value -- rather, you'd just replace the return by a statement that assigns the return value to wherever it needs to go. Using that strategy, we get

    def outer():
        try:
            yield 1
        except ValueError:
            pass
        v = 2
        yield v

That's closer to what I have in mind when I talk about "inlining" in the PEP. I realize that this is probably not exactly what the current expansion specifies. I'm working on a new one to fix issues like this.
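For reference, the semantics ultimately adopted in PEP 380 (and shipped in Python 3.3) resolve this case the way Jacob expected: the thrown-in ValueError is delegated to the subgenerator, which handles it and returns, and the return value appears at the delegating generator's next yield. His example, rendered in Python 3 syntax:

```python
def inner():
    try:
        yield 1
    except ValueError:
        pass
    return 2  # in Python 3 this becomes StopIteration(2)

def outer():
    v = yield from inner()  # v receives inner()'s return value
    yield v

g = outer()
print(next(g))              # 1
print(g.throw(ValueError))  # 2: the throw is delegated to inner(),
                            # which returns, and outer() yields v
```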
-- Greg From greg.ewing at canterbury.ac.nz Thu Mar 26 07:00:57 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Mar 2009 18:00:57 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CAE0DB.3090104@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> Message-ID: <49CB1A19.2000305@canterbury.ac.nz> Nick Coghlan wrote: > I like it because the rules are simple: either an exception is thrown in > and passed down to the subiterator (which may have the effect of > finalising it), or else the subiterator is left alone (to be finalised > either explicitly or implicitly when it is deleted). Okay, so you're in favour of accepting the risk of prematurely finalizing shared subiterators, on the grounds that it can be prevented using a wrapper in the rare cases where it matters. I can live with that, and in fact it's more or less where my most recent thinking has been leading me. > I like it because the rules are simple: either an exception is thrown in > and passed down to the subiterator (which may have the effect of > finalising it), or else the subiterator is left alone (to be finalised > either explicitly or implicitly when it is deleted). 
We might still want one special case. If GeneratorExit is thrown and the subiterator has no throw() or the GeneratorExit propagates back out of the throw(), I think an attempt should be made to close() it. Otherwise, explicitly closing the delegating generator wouldn't be guaranteed to finalize the subiterator unless it had a throw() method, whereas one would expect having close() to be sufficient for this. -- Greg From ncoghlan at gmail.com Thu Mar 26 07:32:28 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 26 Mar 2009 16:32:28 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CB1A19.2000305@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> Message-ID: <49CB217C.6000603@gmail.com> Greg Ewing wrote: > We might still want one special case. If GeneratorExit is thrown > and the subiterator has no throw() or the GeneratorExit propagates > back out of the throw(), I think an attempt should be made to > close() it. 
Otherwise, explicitly closing the delegating generator > wouldn't be guaranteed to finalize the subiterator unless it had > a throw() method, whereas one would expect having close() to be > sufficient for this. I'm not so sure about that - we don't do it for normal iteration, so why would we do it for the new expression? However, I've been pondering the shareable iterator case a bit more, and in trying to come up with even a toy example, I couldn't think of anything that wouldn't be better handled just by actually *iterating* over the shared iterator with a for loop. Since the main advantage that the new expression has over simple iteration is delegating send() and throw() correctly, and I'm suggesting that shared iterators and those two methods don't mix, perhaps this whole issue can be set aside? Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From rhamph at gmail.com Thu Mar 26 07:42:09 2009 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 26 Mar 2009 00:42:09 -0600 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <200903261058.59164.steve@pearwood.info> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> Message-ID: On Wed, Mar 25, 2009 at 5:58 PM, Steven D'Aprano wrote: > On Thu, 26 Mar 2009 08:41:27 am Marcin 'Qrczak' Kowalczyk wrote: >> Why anyone would care? Orderings possible to obtain from a given good >> random number generator are quite uniformly distributed among all >> orderings. > > Yes, that holds true for n <= 2080, since Fisher-Yates is an unbiased > shuffler. But I don't think it remains true for n > 2080 since the vast > majority of possible permutations have probability zero. 
> I'm not saying that this absolutely *will* introduce statistical bias
> into the shuffled lists, but it could, and those who care about that
> risk shouldn't have to read the source code to learn this.

If random.shuffle() is broken for lists of more than 2080 items then it should raise an error. Claiming it "might" be broken in the docs for moderately sized lists, without researching such a claim, is pointless fear-mongering.

>> I bet you can't even predict any particular ordering which
>> is impossible to obtain.
>
> A moral dilemma... should I take advantage of your innumeracy by taking
> you up on that bet, or should I explain why that bet is a sure thing
> for me? *wink*
>
> Since the chances of me collecting on the bet is essentially near zero,
> I'll explain.
>
> For a list with 2082 items, shuffle() chooses from a subset of
> approximately 0.001% of all possible permutations. This means that if I
> give you a list of 2082 items and tell you to shuffle it, and then
> guess that such-and-such a permutation of it will never be reached, I
> can only lose if by chance I guessed on the 1 in 100,000 permutations
> that shuffle() can reach. I have 99,999 chances to win versus 1 to
> lose: that's essentially a sure thing.
>
> In practical terms, beyond (say) 2085 or so, it would be a bona fide
> miracle if I didn't win such a bet.

Go ahead, pick a combination, then iterate through all 2**19937-1 permutations to prove you're correct. Don't worry, we can wait. Of course a stronger analysis technique can prove it much quicker than brute force, but it's not a cryptographically secure PRNG, there's LOTS of information that can be found through such techniques. So far the 2080 limit is random trivia, nothing more. It has no real significance, imposes no new threats, and does not change how correct code is written.
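For concreteness, the Fisher-Yates shuffle under discussion is only a few lines; it is unbiased in the sense that every permutation is equally likely *provided* the underlying RNG is ideal. This is a sketch mirroring the structure of random.shuffle(), not CPython's exact code:

```python
import random

def fisher_yates(seq, rng=random):
    # In-place shuffle: each position i swaps with a uniformly
    # chosen position j <= i, working from the end down.
    for i in range(len(seq) - 1, 0, -1):
        j = rng.randrange(i + 1)
        seq[i], seq[j] = seq[j], seq[i]
    return seq

print(fisher_yates(list(range(5)), random.Random(42)))
```

Given a fixed RNG state the result is deterministic, which is why the number of reachable permutations can never exceed the number of RNG states.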
-- Adam Olsen, aka Rhamphoryncus

From arnodel at googlemail.com Thu Mar 26 09:31:24 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Thu, 26 Mar 2009 08:31:24 +0000 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <200903261058.59164.steve@pearwood.info> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> Message-ID: <9bfc700a0903260131o4d209ca7m4845d12fff10079b@mail.gmail.com>

2009/3/25 Steven D'Aprano :
> On Thu, 26 Mar 2009 08:41:27 am Marcin 'Qrczak' Kowalczyk wrote:
>> On Wed, Mar 25, 2009 at 13:28, Steven D'Aprano wrote:
>> I bet you can't even predict any particular ordering which
>> is impossible to obtain.
>
> A moral dilemma... should I take advantage of your innumeracy by taking
> you up on that bet, or should I explain why that bet is a sure thing
> for me? *wink*

Your challenge was to exhibit a particular permutation which the algorithm will not generate. For good measure I think you should also attach a proof that it won't be generated (since there isn't enough time or, probably, space, to test it).
-- Arnaud From greg.ewing at canterbury.ac.nz Thu Mar 26 09:36:58 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 26 Mar 2009 20:36:58 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CB217C.6000603@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> Message-ID: <49CB3EAA.7090909@canterbury.ac.nz> Nick Coghlan wrote: > I'm not so sure about that - we don't do it for normal iteration, so why > would we do it for the new expression? Because of the inlining principle. If you inline a subgenerator, the result is just a single generator, and closing it finalizes the whole thing. > Since the main advantage that the new expression has over simple > iteration is delegating send() and throw() correctly, and I'm suggesting > that shared iterators and those two methods don't mix, perhaps this > whole issue can be set aside? Sounds good to me. -- Greg From stephen at xemacs.org Thu Mar 26 11:31:16 2009 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Thu, 26 Mar 2009 19:31:16 +0900 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> Message-ID: <87iqlwy3rf.fsf@xemacs.org> Adam Olsen writes: > If random.shuffle() is broken for lists more than 2080 then it should > raise an error. Not really. Assuming the initial state is drawn from a uniform distribution on all possible states, if all 2080-shuffles are equiprobable, then any statistic you care to calculate based on that will come out the same as if you had 2080 statistically independent draws without replacement. Another way to put it is "if you need a random shuffle, this one is good enough for *any* such purpose". However, once you exceed that limit you have to ask whether it's good enough for the purpose at hand. For some purposes, the distribution of (2**19937-1)-shuffles might be good enough, even though they make up only 1/(2**19937-2) of the possible shuffles. (Yeah, I know, you can wait....) > Claiming it "might" be broken in the docs for moderately sized > lists, without researching such a claim, is pointless fear > mongering. How about if it's phrased the way I did above? Ie, this is good enough for any N-shuffle for *any purpose whatsoever*, for N < 2081. 
From ncoghlan at gmail.com Thu Mar 26 11:38:29 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 26 Mar 2009 20:38:29 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CB3EAA.7090909@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> Message-ID: <49CB5B25.4070105@gmail.com> Greg Ewing wrote: > Nick Coghlan wrote: > >> I'm not so sure about that - we don't do it for normal iteration, so why >> would we do it for the new expression? > > Because of the inlining principle. If you inline a > subgenerator, the result is just a single generator, > and closing it finalizes the whole thing. That makes perfect sense to me as a justification for treating GeneratorExit the same as any other exception (i.e. delegating it to the subgenerator). It doesn't lead me to think that the semantics ever need to involve calling close(). Cheers, Nick. 
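For reference, the behaviour Greg argues for is what the `yield from` construct eventually standardized in Python 3.3 (PEP 380) does: closing the delegating generator does finalize the subgenerator. A minimal sketch in the final syntax, not the draft expansion being debated here:

```python
log = []

def sub():
    try:
        while True:
            yield
    finally:
        log.append("sub finalized")

def outer():
    yield from sub()

g = outer()
next(g)     # suspend inside sub()
g.close()   # GeneratorExit reaches the suspended sub(), running its finally
print(log)  # ['sub finalized']
```

The `finally` clause of the subgenerator runs even though only the outer generator was closed, which is the "inlining principle" in action.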
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From jh at improva.dk Thu Mar 26 15:16:42 2009 From: jh at improva.dk (Jacob Holm) Date: Thu, 26 Mar 2009 15:16:42 +0100 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CB155E.4040504@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> Message-ID: <49CB8E4A.3050108@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: > >> Can you explain in a little more detail how the inlining argument >> makes you want to not catch a StopIteration escaping from throw? > [snip explanation] Thank you very much for the clear explanation. It seems each of us were missing something. AFAICT your latest expansion (reproduced below) fixes this. I have a few (final, I hope) nits to pick about the finally clause. To start with there is no need for a separate "try". Just adding the finally clause to the next try..except..else has the exact same semantics. Then there is the contents of the finally clause. 
It is either too much or too little, depending on what it is you are trying to specify. If the intent is to show that the last reference from the expansion to _i disappears here, it fails because _m is likely to hold a reference as well. In any case I don't see a reason to single out _i for deletion. I suggest just dropping the finally clause altogether to make it clear that we are not promising any finalization beyond what is explicit in the rest of the code. - Jacob
------------------------------------------------------------------------
_i = iter(EXPR)
try:
    try:
        _y = _i.next()
    except StopIteration, _e:
        _r = _e.value
    else:
        while 1:
            try:
                _s = yield _y
            except:
                _m = getattr(_i, 'throw', None)
                if _m is not None:
                    _x = sys.exc_info()
                    try:
                        _y = _m(*_x)
                    except StopIteration, _e:
                        if _e is _x[1]:
                            raise
                        else:
                            _r = _e.value
                            break
                else:
                    _m = getattr(_i, 'close', None)
                    if _m is not None:
                        _m()
                    raise
            else:
                try:
                    if _s is None:
                        _y = _i.next()
                    else:
                        _y = _i.send(_s)
                except StopIteration, _e:
                    _r = _e.value
                    break
finally:
    del _i
RESULT = _r
From grosser.meister.morti at gmx.net Thu Mar 26 16:07:25 2009 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Thu, 26 Mar 2009 16:07:25 +0100 Subject: [Python-ideas] with statement: multiple context manager In-Reply-To: <49ABCF35.5030002@molden.no> References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com> <50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com> <49ABCF35.5030002@molden.no> Message-ID: <49CB9A2D.4000905@gmx.net> Sturla Molden wrote: > On 3/1/2009 9:57 PM, Christian Heimes wrote: > >> with a, b as x, d as y: > > I'd like to add that parentheses improve readability here: > > with a, (b as x), (d as y): > > I am worried the proposed syntax could be a source of confusion and > errors. E.g. when looking at > > with a,b as c,d: > > my eyes read > > with nested(a,b) as c,d: > > when Python would read > > with a,(b as c),d: > > Good point.
Maybe that would be better: with a,b as c,d: reads as: with nested(a,b) as c,d: This means there can only be one "as" in a with statement with the further implication that even unneeded values have to be assigned: with a,b,c as x,unused,y: Not as nice, but much more unambiguous. Unambiguity is what we need, I think. You can always assign to _, which is very commonly used for unneeded values (well, or for the l10n hook - so using that name would not be very unambiguous). -panzi From guido at python.org Thu Mar 26 16:14:44 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 26 Mar 2009 08:14:44 -0700 Subject: [Python-ideas] with statement: multiple context manager In-Reply-To: <49CB9A2D.4000905@gmx.net> References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com> <50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com> <49ABCF35.5030002@molden.no> <49CB9A2D.4000905@gmx.net> Message-ID: On Thu, Mar 26, 2009 at 8:07 AM, Mathias Panzenböck wrote: > Sturla Molden wrote: >> On 3/1/2009 9:57 PM, Christian Heimes wrote: >> >>> with a, b as x, d as y: >> >> I'd like to add that parentheses improve readability here: >> >> with a, (b as x), (d as y): >> >> I am worried the proposed syntax could be a source of confusion and >> errors. E.g. when looking at >> >> with a,b as c,d: >> >> my eyes read >> >> with nested(a,b) as c,d: >> >> when Python would read >> >> with a,(b as c),d: >> >> > > Good point. Maybe that would be better: > > with a,b as c,d: > > reads as: > > with nested(a,b) as c,d: > > This means there can only be one "as" in a with statement with the further > implication that even unneeded values have to be assigned: > > with a,b,c as x,unused,y: > > Not as nice, but much more unambiguous. Unambiguity is what we need, I > think. You can always assign to _, which is very commonly used for unneeded > values (well, or for the l10n hook - so using that name would not be very > unambiguous).
No, we should maintain the parallel with "import a, b as c, d". -- --Guido van Rossum (home page: http://www.python.org/~guido/) From sturla at molden.no Thu Mar 26 16:42:02 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 26 Mar 2009 16:42:02 +0100 Subject: [Python-ideas] with statement: multiple context manager In-Reply-To: <49CB9A2D.4000905@gmx.net> References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com> <50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com> <49ABCF35.5030002@molden.no> <49CB9A2D.4000905@gmx.net> Message-ID: <49CBA24A.1050502@molden.no> On 3/26/2009 4:07 PM, Mathias Panzenböck wrote: > Good point. Maybe that would be better: > > with a,b as c,d: No. See Guido's reply. I was just trying to say that with nested(a,b) as c,d: is more readable than with a as c, b as d: which would argue against new syntax and better documentation of contextlib.nested. However, as the tuple (a,b) is built prior to the call to nested, new syntax is needed. It still does not hurt to put in parentheses for readability here: with (a as c), (b as d): Perhaps parentheses should be recommended in the documentation, even though they are syntactically superfluous here? Sturla Molden From guido at python.org Thu Mar 26 17:33:45 2009 From: guido at python.org (Guido van Rossum) Date: Thu, 26 Mar 2009 09:33:45 -0700 Subject: [Python-ideas] with statement: multiple context manager In-Reply-To: <49CBA24A.1050502@molden.no> References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com> <50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com> <49ABCF35.5030002@molden.no> <49CB9A2D.4000905@gmx.net> <49CBA24A.1050502@molden.no> Message-ID: On Thu, Mar 26, 2009 at 8:42 AM, Sturla Molden wrote: > On 3/26/2009 4:07 PM, Mathias Panzenböck wrote: >> Good point. Maybe that would be better: >> >> with a,b as c,d: > > No. See Guido's reply.
> > I was just trying to say that > > with nested(a,b) as c,d: > > is more readable than > > with a as c, b as d: > > which would argue against new syntax and better documentation of > contextlib.nested. However, as the tuple (a,b) is built prior to the call to > nested, new syntax is needed. > > It still does not hurt to put in parentheses for readability here: > > with (a as c), (b as d): > > > Perhaps parentheses should be recommended in the documentation, even though > they are syntactically superfluous here? No, the parens will be syntactically *illegal*. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From rhamph at gmail.com Thu Mar 26 18:43:35 2009 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 26 Mar 2009 11:43:35 -0600 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <87iqlwy3rf.fsf@xemacs.org> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> Message-ID: On Thu, Mar 26, 2009 at 4:31 AM, Stephen J. Turnbull wrote: > Adam Olsen writes: > > > If random.shuffle() is broken for lists more than 2080 then it should > > raise an error. > > Not really. Assuming the initial state is drawn from a uniform > distribution on all possible states, if all 2080-shuffles are > equiprobable, then any statistic you care to calculate based on that > will come out the same as if you had 2080 statistically independent > draws without replacement. Another way to put it is "if you need a > random shuffle, this one is good enough for *any* such purpose". > > However, once you exceed that limit you have to ask whether it's good > enough for the purpose at hand. For some purposes, the distribution > of (2**19937-1)-shuffles might be good enough, even though they make > up only 1/(2**19937-2) of the possible shuffles.
(Yeah, I know, you > can wait....) Is it or is it not broken? That's all I want to know. "maybe" isn't good enough. "Not broken for small lists" implies it IS broken for large lists. Disabling it (raising an exception for large lists) is of course just a stopgap measure. Better would be a PRNG with a much larger period.. but of course that'd require more CPU time and more seed. -- Adam Olsen, aka Rhamphoryncus From tim.peters at gmail.com Thu Mar 26 18:55:22 2009 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 26 Mar 2009 13:55:22 -0400 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> Message-ID: <1f7befae0903261055l35a845acr17529d78efcf136e@mail.gmail.com> [Adam Olsen] > Is it or is it not broken? That's all I want to know. Then you first need to define what "broken" means to you. Anything short of a source of /true/ random numbers is "broken" for /some/ purposes. Python's current generator is not broken for any purposes I care about, so my answer to your question is "no" -- but only if I ask your question of myself ;-) > Better would be a PRNG with a much larger period.. Not really. A larger period is necessary but not sufficient /if/ you're concerned about generating all permutations of bigger lists with equal probability -- see the old thread someone else pointed to for more info on that. The Mersenne Twister's provably superb "high-dimensional equidistribution" properties are far more important than its long period in this respect (the former is sufficient; the latter is merely necessary).
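Tim's equidistribution point is easy to sanity-check empirically at small sizes, where every ordering should turn up with equal frequency (a quick experiment, not a substitute for the proofs):

```python
import random
from collections import Counter

# Count how often each ordering of a 3-element list appears.
counts = Counter()
for _ in range(60000):
    items = [0, 1, 2]
    random.shuffle(items)
    counts[tuple(items)] += 1

print(sorted(counts.values()))  # six counts, each close to 60000 / 6 = 10000
```

All 3! = 6 orderings appear with deviations consistent with chance; the dispute is only about list lengths where n! dwarfs the generator's state space.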
From phd at phd.pp.ru Thu Mar 26 19:01:33 2009 From: phd at phd.pp.ru (Oleg Broytmann) Date: Thu, 26 Mar 2009 21:01:33 +0300 Subject: [Python-ideas] About adding a new iterator method called "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> Message-ID: <20090326180133.GB7849@phd.pp.ru> On Thu, Mar 26, 2009 at 11:43:35AM -0600, Adam Olsen wrote: > Is it or is it not broken? That's all I want to know. "maybe" isn't > good enough. "Not broken for small lists" implies it IS broken for > large lists. Practicality beats purity, IMHO. If shuffle cannot process a list I cannot even fit into virtual memory - I don't care, really. Oleg. -- Oleg Broytmann http://phd.pp.ru/ phd at phd.pp.ru Programmers don't die, they just GOSUB without RETURN. From jh at improva.dk Thu Mar 26 19:18:46 2009 From: jh at improva.dk (Jacob Holm) Date: Thu, 26 Mar 2009 19:18:46 +0100 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> Message-ID: <49CBC706.3060504@improva.dk> Oleg Broytmann wrote: > On Thu, Mar 26, 2009 at 11:43:35AM -0600, Adam Olsen wrote: > >> Is it or is it not broken? That's all I want to know. "maybe" isn't >> good enough. "Not broken for small lists" implies it IS broken for >> large lists. >> > > Practicality beats purity, IMHO. If shuffle cannot process a list > I cannot even fit into virtual memory - I don't care, really. > > Oleg. > A list of 2081 items certainly fits into the memory of my machine. There is a very clear sense in which it is broken for lists > 2080 items. 
It *may* be broken in other ways and for certain use cases for shorter lists; I don't know enough about the properties of the PRNG to say anything about that. I *think* someone mentioned that it was equidistributed in 623 dimensions and that this should mean it is as good as possible for any PRNG for any list up to 623 items. If this is true, it would be nice to have a note about it in the docs. Documented limitations are always better than undocumented ones. (Explicit is better than implicit...) - Jacob From amcnabb at mcnabbs.org Thu Mar 26 18:54:17 2009 From: amcnabb at mcnabbs.org (Andrew McNabb) Date: Thu, 26 Mar 2009 11:54:17 -0600 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> Message-ID: <20090326175417.GT2948@mcnabbs.org> On Thu, Mar 26, 2009 at 11:43:35AM -0600, Adam Olsen wrote: > > Is it or is it not broken? That's all I want to know. "maybe" isn't > good enough. "Not broken for small lists" implies it IS broken for > large lists. > > Disabling it (raising an exception for large lists) is of course just > a stopgap measure. Better would be a PRNG with a much larger period.. > but of course that'd require more CPU time and more seed. It's only broken in a theoretical sense. It's fun to think about, but I wouldn't lose any sleep over it.
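The theoretical sense in question can be quantified: distinguishing all n! orderings of an n-item list requires log2(n!) bits of generator state or seed entropy. A rough sketch of the numbers:

```python
import math

def seed_bytes(n):
    # log2(n!) bits are needed to index n! permutations uniformly
    bits = math.lgamma(n + 1) / math.log(2)  # ln(n!) = lgamma(n + 1)
    return bits / 8

print(seed_bytes(2081))           # about 2.5 KB, just past the threshold
print(seed_bytes(10**6) / 2**20)  # about 2.2 MiB for a million-item list
```

So even a PRNG with a much larger period only helps if the seed itself carries that much entropy.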
-- Andrew McNabb http://www.mcnabbs.org/andrew/ PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868 From python at rcn.com Thu Mar 26 19:36:28 2009 From: python at rcn.com (Raymond Hettinger) Date: Thu, 26 Mar 2009 11:36:28 -0700 Subject: [Python-ideas] About adding a newiteratormethodcalled "shuffled" References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com><200903252328.49177.steve@pearwood.info><3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com><200903261058.59164.steve@pearwood.info><87iqlwy3rf.fsf@xemacs.org> <20090326175417.GT2948@mcnabbs.org> Message-ID: <09F20AEF3B3C4F6CA263B98FAC1C795F@RaymondLaptop1> >> Is it or is it not broken? That's all I want to know. "maybe" isn't >> good enough. "Not broken for small lists" implies it IS broken for >> large lists. >> >> Disabling it (raising an exception for large lists) is of course just >> a stopgap measure. Better would be a PRNG with a much larger period.. >> but of course that'd require more CPU time and more seed. > > It's only broken in a theoretical sense. It's fun to think about, but I > wouldn't lose any sleep over it. It's not even broken in a theoretical sense. It does exactly what it says it does. Besides, this whole conversation is somewhat senseless. You can't get any more randomness out of a generator than you put into the seed in the first place. If you're not putting thousands of digits in your seed, then no PRNG is going to give you an equal chance of producing every possible shuffle for a large list. Raymond "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin." 
-- John von Neumann From rhamph at gmail.com Thu Mar 26 19:48:20 2009 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 26 Mar 2009 12:48:20 -0600 Subject: [Python-ideas] About adding a newiteratormethodcalled "shuffled" In-Reply-To: <09F20AEF3B3C4F6CA263B98FAC1C795F@RaymondLaptop1> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <20090326175417.GT2948@mcnabbs.org> <09F20AEF3B3C4F6CA263B98FAC1C795F@RaymondLaptop1> Message-ID: On Thu, Mar 26, 2009 at 12:36 PM, Raymond Hettinger wrote: > It's not even broken in a theoretical sense. ?It does exactly what it says > it does. > > Besides, this whole conversation is somewhat senseless. ?You can't get any > more randomness out of a generator than you put into the seed in the > first place. ?If you're not putting thousands of digits in your seed, then > no PRNG is going to give you an equal chance of producing every possible > shuffle for a large list. Indeed, a million item list requires over 2 megabytes of entropy. -- Adam Olsen, aka Rhamphoryncus From amcnabb at mcnabbs.org Thu Mar 26 19:50:45 2009 From: amcnabb at mcnabbs.org (Andrew McNabb) Date: Thu, 26 Mar 2009 12:50:45 -0600 Subject: [Python-ideas] About adding a newiteratormethodcalled "shuffled" In-Reply-To: <09F20AEF3B3C4F6CA263B98FAC1C795F@RaymondLaptop1> References: <20090326175417.GT2948@mcnabbs.org> <09F20AEF3B3C4F6CA263B98FAC1C795F@RaymondLaptop1> Message-ID: <20090326185045.GU2948@mcnabbs.org> On Thu, Mar 26, 2009 at 11:36:28AM -0700, Raymond Hettinger wrote: > >> It's only broken in a theoretical sense. It's fun to think about, but I >> wouldn't lose any sleep over it. > > It's not even broken in a theoretical sense. It does exactly what it says it does. > > Besides, this whole conversation is somewhat senseless. 
You can't get any > more randomness out of a generator than you put into the seed in the > first place. If you're not putting thousands of digits in your seed, then > no PRNG is going to give you an equal chance of producing every possible > shuffle for a large list. I agree with you--I just didn't want to make too strong of a statement. I certainly believe that any comment in the docs about this issue would be distracting and unhelpful. -- Andrew McNabb http://www.mcnabbs.org/andrew/ PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868 From jan.kanis at phil.uu.nl Thu Mar 26 21:19:16 2009 From: jan.kanis at phil.uu.nl (Jan Kanis) Date: Thu, 26 Mar 2009 21:19:16 +0100 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <87iqlwy3rf.fsf@xemacs.org> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> Message-ID: <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> Just out of curiosity, would doing l = range(2082) random.shuffle(l) random.shuffle(l) give me (with a high probability) one of those permutations that is unreachable with a single shuffle? If so, I'd presume you could get any shuffle (in case you really cared) by calling random.shuffle repeatedly and reseeding the prng in between. 
From rrr at ronadam.com Thu Mar 26 21:26:13 2009 From: rrr at ronadam.com (Ron Adam) Date: Thu, 26 Mar 2009 15:26:13 -0500 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CB3EAA.7090909@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> Message-ID: <49CBE4E5.6010305@ronadam.com> Greg Ewing wrote: > Nick Coghlan wrote: > >> I'm not so sure about that - we don't do it for normal iteration, so why >> would we do it for the new expression? > > Because of the inlining principle. If you inline a > subgenerator, the result is just a single generator, > and closing it finalizes the whole thing. > >> Since the main advantage that the new expression has over simple >> iteration is delegating send() and throw() correctly, and I'm suggesting >> that shared iterators and those two methods don't mix, perhaps this >> whole issue can be set aside? > > Sounds good to me. Just a thought... If the subgenerator does not interact with the generator it is in after it is started, then wouldn't it be as if it replaces the calling generator for the life of the sub generator? 
So instead of in-lining, can it be thought of more like switching-to another generator? Ron From rhamph at gmail.com Thu Mar 26 22:25:58 2009 From: rhamph at gmail.com (Adam Olsen) Date: Thu, 26 Mar 2009 15:25:58 -0600 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> Message-ID: On Thu, Mar 26, 2009 at 2:19 PM, Jan Kanis wrote: > Just out of curiosity, would doing > > l = range(2082) > random.shuffle(l) > random.shuffle(l) > > give me (with a high probability) one of those permutations that is > unreachable with a single shuffle? If so, I'd presume you could get > any shuffle (in case you really cared) by calling random.shuffle > repeatedly and reseeding the prng in between. If you reseed, yes. That injects new entropy into the system. As I said though, you can end up needing megabytes of entropy. -- Adam Olsen, aka Rhamphoryncus From tjreedy at udel.edu Thu Mar 26 23:15:44 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 26 Mar 2009 18:15:44 -0400 Subject: [Python-ideas] with statement: multiple context manager In-Reply-To: References: <3f6c86f50903011153u25778ccehfb01f11c046cf882@mail.gmail.com> <50697b2c0903011222l56b19511w161b4f503ecba730@mail.gmail.com> <49ABCF35.5030002@molden.no> <49CB9A2D.4000905@gmx.net> Message-ID: Guido van Rossum wrote: > No, we should maintain the parallel with" import a, b as c, d". Which should then be mentioned in the doc. From stephen at xemacs.org Fri Mar 27 04:17:47 2009 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Fri, 27 Mar 2009 12:17:47 +0900 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> Message-ID: <87eiwjy7qc.fsf@xemacs.org> Adam Olsen writes: > Is it or is it not broken? What is so hard to understand about "depending on the statistical properties you demand, it may be broken and then again it may not?" > That's all I want to know. "maybe" isn't good enough. "If you have to ask, you can't afford it." Ie, you've defined your own answer: it's broken *for you*. The rest of us would like to be allowed to judge for ourselves, though. > "Not broken for small lists" implies it IS broken for large lists. You're being contentious. It logically implies no such thing, nor is it idiomatically an implication among consenting adults. And in any case, the phrasing I recommended is "guaranteed to have uniform distribution of shuffles up to N". The implication of "no guarantee" is "have a mechanic inspect it before you buy", not "this is a lemon". 
From greg.ewing at canterbury.ac.nz Fri Mar 27 05:28:30 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 27 Mar 2009 17:28:30 +1300 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> Message-ID: <49CC55EE.4070208@canterbury.ac.nz> Jan Kanis wrote: > I'd presume you could get > any shuffle (in case you really cared) by calling random.shuffle > repeatedly and reseeding the prng in between. But how are you going to reseed the prng? To get an equal likelihood of any shuffle, you need another prng with a big enough state to generate seeds for your first prng. But then you might just as well shuffle based on that larger prng in the first place. 
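One "larger prng" already in the standard library is `random.SystemRandom`, which feeds the same Fisher-Yates shuffle from the operating system's entropy pool and so has no fixed period at all (at the cost of speed and kernel entropy):

```python
import random

# SystemRandom draws from the OS entropy pool (os.urandom), so the shuffle
# is no longer constrained by any fixed PRNG state size.
sysrand = random.SystemRandom()

items = list(range(5000))
sysrand.shuffle(items)
print(items[:5])
```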
-- Greg From greg.ewing at canterbury.ac.nz Fri Mar 27 06:00:53 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 27 Mar 2009 18:00:53 +1300 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CB8E4A.3050108@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> Message-ID: <49CC5D85.30409@canterbury.ac.nz> Jacob Holm wrote: > Just adding the > finally clause to the next try..except..else has the exact same semantics. True -- I haven't quite got used to the idea that you can do that yet! > In any case I don't see a reason to single out _i for deletion. That part seems to be a hangover from an earlier version. You're probably right that it can go. 
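The construction Jacob points out is the unified try statement of PEP 341 (Python 2.5): `except`, `else`, and `finally` can all attach to a single `try`. A small illustration:

```python
def first_plus_one(it):
    try:
        x = next(it)
    except StopIteration:
        x = None
    else:
        x += 1
    finally:
        print("cleanup")  # runs on every path; no separate try needed
    return x

print(first_plus_one(iter([41])))  # cleanup, then 42
print(first_plus_one(iter([])))    # cleanup, then None
```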
-- Greg From greg.ewing at canterbury.ac.nz Fri Mar 27 06:01:08 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 27 Mar 2009 18:01:08 +1300 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CB5B25.4070105@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> <49CB5B25.4070105@gmail.com> Message-ID: <49CC5D94.7000609@canterbury.ac.nz> Nick Coghlan wrote: > That makes perfect sense to me as a justification for treating > GeneratorExit the same as any other exception (i.e. delegating it to the > subgenerator). It doesn't lead me to think that the semantics ever need > to involve calling close(). I'm also treating close() and throw(GeneratorExit) on the delegating generator as equivalent for finalization purposes. So if throw(GeneratorExit) doesn't fall back to close() on the subiterator, closing the delegating generator won't finalize the subiterator unless it pretends to be a generator by implementing throw(). Since the inlining principle strictly only applies to subgenerators, it doesn't *require* this behaviour, but to my mind it strongly suggests it. 
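In rough code, the behaviour being argued for is something like this (an added sketch for illustration only -- not the PEP's reference expansion, and the helper name is made up):

```python
def finalize_subiterator(it):
    # Deliver GeneratorExit to a subiterator: prefer throw() if the
    # subiterator has one, otherwise fall back to close() for
    # iterators (like files) that only know how to be closed.
    try:
        throw = it.throw
    except AttributeError:
        close = getattr(it, "close", None)
        if close is not None:
            close()          # plain close()-able iterator: just close it
        raise GeneratorExit  # then let the exception propagate
    throw(GeneratorExit)     # generator-like: delegate the exception

class CloseOnly:
    """An iterator that, like a file, has close() but no throw()."""
    def __init__(self):
        self.closed = False
    def __iter__(self):
        return self
    def __next__(self):
        return 1
    def close(self):
        self.closed = True

it = CloseOnly()
try:
    finalize_subiterator(it)
except GeneratorExit:
    pass
print(it.closed)  # -> True
```
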
-- Greg From lorgandon at gmail.com Fri Mar 27 07:59:34 2009 From: lorgandon at gmail.com (Imri Goldberg) Date: Fri, 27 Mar 2009 09:59:34 +0300 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> Message-ID: On Thu, Mar 26, 2009 at 11:19 PM, Jan Kanis wrote: > Just out of curiosity, would doing > > l = range(2082) > random.shuffle(l) > random.shuffle(l) > > give me (with a high probability) one of those permutations that is > unreachable with a single shuffle? If so, I'd presume you could get > any shuffle (in case you really cared) by calling random.shuffle > repeatedly and reseeding the prng in between. I'm a bit rusty on the math, but that doesn't have to be the case. If all the permutations produced by random.shuffle() form a subgroup, or lie in a subgroup, then what you'll get is just another permutation from that subgroup, regardless of the randomness you put inside. -- Imri Goldberg -------------------------------------- www.algorithm.co.il/blogs/ -------------------------------------- -- insert signature here ---- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From arnodel at googlemail.com Fri Mar 27 08:24:23 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Fri, 27 Mar 2009 07:24:23 +0000 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> Message-ID: On 27 Mar 2009, at 06:59, Imri Goldberg wrote: > > > On Thu, Mar 26, 2009 at 11:19 PM, Jan Kanis > wrote: > Just out of curiosity, would doing > > l = range(2082) > random.shuffle(l) > random.shuffle(l) > > give me (with a high probability) one of those permutations that is > unreachable with a single shuffle? If so, I'd presume you could get > any shuffle (in case you really cared) by calling random.shuffle > repeatedly and reseeding the prng in between. > > I'm a bit rusty on the math, but that doesn't have to be the case. > If all the permutations produced by random.shuffle() form a > subgroup, or lie in a subgroup, then what you'll get is just another > permutation from that subgroup, regardless of the randomness you put > inside. There is no reason that the set of shuffled permutations (S_n) will form a subgroup of the set of permutations (P_n) and it may well generate the whole of P_n. In fact you only need n transpositions to generate the whole of P_n. However, any function that generates a random permutation is a function from the set of possible states of the PRNG to the set of permutations. Whatever tricks you use, if there are fewer states in the PRNG than there are permutations, you won't be able to reach them all. 
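To put rough numbers on that counting argument (an added illustration; the helper name is made up, and the practical cut-off for random.shuffle depends on how the PRNG is actually seeded):

```python
def max_coverable_list_size(state_bits):
    """Largest n with n! <= 2**state_bits.

    Beyond this n, a PRNG with `state_bits` bits of state has fewer
    states than there are permutations, so some shuffles of an
    n-element list can never be produced in a single call.
    """
    states = 2 ** state_bits
    n, fact = 1, 1          # invariant: fact == n!
    while fact * (n + 1) <= states:
        n += 1
        fact *= n
    return n

# Mersenne Twister's full state is 19937 bits; a 128-bit seed
# covers far fewer states.
print(max_coverable_list_size(19937))  # -> 2080
print(max_coverable_list_size(128))    # -> 34
```

This is why the earlier example used a list of 2082 elements: it is just past the point where even the full 19937-bit state runs out of permutations.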
-- Arnaud From jh at improva.dk Fri Mar 27 11:44:18 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 27 Mar 2009 11:44:18 +0100 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> Message-ID: <49CCAE02.8@improva.dk> Arnaud Delobelle wrote: > > On 27 Mar 2009, at 06:59, Imri Goldberg wrote: > >> >> >> On Thu, Mar 26, 2009 at 11:19 PM, Jan Kanis >> wrote: >> Just out of curiosity, would doing >> >> l = range(2082) >> random.shuffle(l) >> random.shuffle(l) >> >> give me (with a high probability) one of those permutations that is >> unreachable with a single shuffle? If so, I'd presume you could get >> any shuffle (in case you really cared) by calling random.shuffle >> repeatedly and reseeding the prng in between. >> >> I'm a bit rusty on the math, but that doesn't have to be the case. If >> all the permutations produced by random.shuffle() form a subgroup, or >> lie in a subgroup, then what you'll get is just another permutation >> from that subgroup, regardless of the randomness you put inside. > > There is no reason that the set of shuffled permutations(S_n) will > form a subgroup of the set of permutations (P_n) and it may well > generate the whole of P_n. In fact you only need n transpositions to > generate the whole of P_n. True, it is extremely likely that the group G_n generated by S_n is P_n and not a subgroup. > > However, any function generates a random permutation is a function > from the set of possible states of the PRNG to the set of > permutations. Whatever tricks you use, if there are fewer states in > the PRNG than there are permutations, you won't be able to reach them > all. 
> You are right, you won't be able to reach them all in a single call to shuffle. However, by repeated shuffling and reseeding like the OP suggested, you can in theory get to all elements of G_n *if you keep shuffling long enough*. Unfortunately you will need at least |G_n|/|S_n| shuffles which means it is not even remotely practical. - Jacob From ncoghlan at gmail.com Fri Mar 27 11:51:37 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 27 Mar 2009 20:51:37 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CC5D94.7000609@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C258E2.8050505@improva.dk> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> <49CB5B25.4070105@gmail.com> <49CC5D94.7000609@canterbury.ac.nz> Message-ID: <49CCAFB9.1090800@gmail.com> Greg Ewing wrote: > Nick Coghlan wrote: > >> That makes perfect sense to me as a justification for treating >> GeneratorExit the same as any other exception (i.e. delegating it to the >> subgenerator). It doesn't lead me to think that the semantics ever need >> to involve calling close(). 
> > I'm also treating close() and throw(GeneratorExit) on > the delegating generator as equivalent for finalization > purposes. So if throw(GeneratorExit) doesn't fall back > to close() on the subiterator, closing the delegating > generator won't finalize the subiterator unless it > pretends to be a generator by implementing throw(). > > Since the inlining principle strictly only applies to > subgenerators, it doesn't *require* this behaviour, > but to my mind it strongly suggests it. I believe I already said this at some point, but after realising that shareable subiterators are almost always still going to be better handled by iterating over them rather than delegating to them, I'm actually not too worried one way or the other. While I do still have a slight preference for limiting the methods involved in generator delegation to just next(), send() and throw(), I won't object strenuously to accepting close() as an alternative spelling of throw(exc) that will always reraise the passed in exception. As you say, it does make it easier to write a non-generator delegation target, since implementing close() for finalisation means not having to deal with the vagaries of correctly reraising exceptions. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From greg.ewing at canterbury.ac.nz Fri Mar 27 12:05:01 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 27 Mar 2009 23:05:01 +1200 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <49CCAE02.8@improva.dk> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> <49CCAE02.8@improva.dk> Message-ID: <49CCB2DD.30706@canterbury.ac.nz> Jacob Holm wrote: > However, by repeated shuffling and reseeding like the OP > suggested, you can in theory get to all elements of G_n But then you need a sufficient number of distinct seed values, so you're back to the original problem. -- Greg From greg.ewing at canterbury.ac.nz Fri Mar 27 12:08:25 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 27 Mar 2009 23:08:25 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CCAFB9.1090800@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> 
<49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> <49CB5B25.4070105@gmail.com> <49CC5D94.7000609@canterbury.ac.nz> <49CCAFB9.1090800@gmail.com> Message-ID: <49CCB3A9.40300@canterbury.ac.nz> Nick Coghlan wrote: > As you say, it does make it easier to write a non-generator delegation > target, since implementing close() for finalisation means not having to > deal with the vagaries of correctly reraising exceptions. It also means that existing things with a close method, such as files, can be used without change. Having a close method is a fairly well-established way to make an iterator explicitly finalizable, whereas having a throw method isn't. -- Greg From jh at improva.dk Fri Mar 27 12:45:53 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 27 Mar 2009 12:45:53 +0100 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <49CCB2DD.30706@canterbury.ac.nz> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> <49CCAE02.8@improva.dk> <49CCB2DD.30706@canterbury.ac.nz> Message-ID: <49CCBC71.7070905@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: >> However, by repeated shuffling and reseeding like the OP suggested, >> you can in theory get to all elements of G_n > > But then you need a sufficient number of distinct seed > values, so you're back to the original problem. > Ehr, no. Suppose my PRNG only has period two and the shuffle based on it can only generate the permutations [1, 0, 2] and [2, 1, 0] from [0, 1, 2]. Each time I reseed from a truly random source, the next shuffle will use one of those permutations at random. 
By shuffling and reseeding enough times I can get all combinations of those two permutations. This happens to be all 6 possible permutations of 3 elements. - Jacob From jh at improva.dk Fri Mar 27 12:49:39 2009 From: jh at improva.dk (Jacob Holm) Date: Fri, 27 Mar 2009 12:49:39 +0100 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <49CCBC71.7070905@improva.dk> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> <49CCAE02.8@improva.dk> <49CCB2DD.30706@canterbury.ac.nz> <49CCBC71.7070905@improva.dk> Message-ID: <49CCBD53.8000903@improva.dk> Jacob Holm wrote: > Greg Ewing wrote: >> Jacob Holm wrote: >>> However, by repeated shuffling and reseeding like the OP suggested, >>> you can in theory get to all elements of G_n >> >> But then you need a sufficient number of distinct seed >> values, so you're back to the original problem. >> > Ehr, no. Suppose my PRNG only has period two and the shuffle based on > it can only generate the permutations [1, 0, 2] and [2, 1, 0] from [0, > 1, 2]. Each time I reseed from a truly random source, the next shuffle > will use one of those permutations at random. By shuffling and > reseeding enough times I can get all combinations of those two > permutations. This happens to be all 6 possible permutations of 3 > elements. > Ok, I may have misinterpreted your statement. Yes, you need to reseed a lot. You just don't need the seeds to be different. 
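This is easy to check by brute force (an added illustration; the two index lists are exactly the permutations from the example above):

```python
from itertools import permutations

def apply_perm(perm, seq):
    # result[i] = seq[perm[i]], i.e. rearrange seq according to perm
    return tuple(seq[i] for i in perm)

shuffles = [(1, 0, 2), (2, 1, 0)]   # the two permutations in the example
reachable = {(0, 1, 2)}             # start from the identity arrangement
frontier = [(0, 1, 2)]
while frontier:                     # closure under repeated shuffling
    seq = frontier.pop()
    for s in shuffles:
        nxt = apply_perm(s, seq)
        if nxt not in reachable:
            reachable.add(nxt)
            frontier.append(nxt)

print(len(reachable))  # -> 6, i.e. all of the permutations of 3 elements
assert reachable == set(permutations((0, 1, 2)))
```
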
- Jacob From ncoghlan at gmail.com Fri Mar 27 13:17:07 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 27 Mar 2009 22:17:07 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CCB3A9.40300@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> <49CB5B25.4070105@gmail.com> <49CC5D94.7000609@canterbury.ac.nz> <49CCAFB9.1090800@gmail.com> <49CCB3A9.40300@canterbury.ac.nz> Message-ID: <49CCC3C3.4050100@gmail.com> Greg Ewing wrote: > Nick Coghlan wrote: > >> As you say, it does make it easier to write a non-generator delegation >> target, since implementing close() for finalisation means not having to >> deal with the vagaries of correctly reraising exceptions. > > It also means that existing things with a close > method, such as files, can be used without change. > > Having a close method is a fairly well-established > way to make an iterator explicitly finalizable, > whereas having a throw method isn't. But then we're back to the point that if someone *wants* deterministic finalisation, then that's why the with statement exists. 
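For instance (an added sketch, not from the original message), deterministic finalisation of a file-backed iterator needs nothing beyond the existing tools:

```python
import io
from contextlib import closing

def read_all(f):
    # Explicit, deterministic finalisation with the construct designed
    # for it, instead of relying on the delegation expression to
    # close the file implicitly.
    with closing(f):
        for line in f:
            yield line

f = io.StringIO("spam\neggs\n")
print(list(read_all(f)))  # -> ['spam\n', 'eggs\n']
print(f.closed)           # -> True
```
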
The part that isn't clicking for me is that I still don't understand *why* 'yield from' should include implicit finalisation as part of its definition. The full delegation of next(), send() and throw() I get completely (since that's the whole point of the new expression). The fact that that *also* ends up delegating the close() method of generators in particular also makes sense (as it's a natural consequence of delegating the first three methods). It's the generalisation of that to all other iterators that happen to offer a close() method that seems somewhat arbitrary. Other than the fact that generators happen to provide a close() method that invokes throw(), it appears to have nothing to do with generator delegation and hence seems like a fairly random addition to the PEP. Using a file as the subiterator is an interesting case in point (and perhaps an interesting exploration as to when a shareable subiterator may make sense: if a subiterator offers separate reading and writing APIs, then those can be exposed as separate generators):

class YieldingFile:
    # Mixing reads and writes with this strawman
    # version would be a rather bad idea :)
    EOF = object()

    def __init__(self, f):
        self.f = f

    def read_all(self):
        self.f.seek(0)
        yield from self.f

    def append_lines(self):
        self.f.seek(0, 2)
        lines_written = 0
        while 1:
            line = yield
            if line == self.EOF:
                break
            self.f.write(line)
            lines_written += 1
        return lines_written

The problem I see with the above is that with the current specification in the PEP, the read_all() implementation is outright broken rather than merely redundant (it is obviously wasteful, since it could just return self.f instead of yielding from it - but it is far from clear that it should be broken rather than just pointlessly slow). The first use of read_all() will implicitly close the file when it is finished - that seems totally nonobvious to me. 
It strikes me as simpler all round to leave the deterministic finalisation to the tool that was designed for the task, and let the new expression focus solely on correct delegation to subgenerators without worrying too much about other iterators. Sure, there are plenty of ways to avoid the implicit finalisation if you want to, but I'm still not convinced the "oh, you don't support throw() so I will fall back to close() instead" fallback behaviour is a particularly good idea. (It isn't a dealbreaker for me though - I still support the PEP overall, even though I'm -0 on this particular aspect of it). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From mrts.pydev at gmail.com Fri Mar 27 16:26:52 2009 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Fri, 27 Mar 2009 17:26:52 +0200 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: Message-ID: Appending query parameters to a URL is a very common need. However, there's nothing in urllib.parse (and older urlparse) that caters for that need. Therefore, I propose adding the following to 2.7 and 3.1 in the respective libs: def add_query_params(url, **params): """ Adds additional query parameters to the given url, preserving original parameters. Usage: >>> add_query_params('http://foo.com', a='b') 'http://foo.com?a=b' >>> add_query_params('http://foo.com?a=b', b='c', d='q') 'http://foo.com?a=b&b=c&d=q' The real implementation should be more strict, e.g. 
raise on the following: >>> add_query_params('http://foo.com?a=b', a='b') 'http://foo.com?a=b&a=b' """ if not params: return url encoded = urllib.urlencode(params) url = urlparse.urlparse(url) return urlparse.urlunparse((url.scheme, url.netloc, url.path, url.params, (encoded if not url.query else url.query + '&' + encoded), url.fragment)) -------------- next part -------------- An HTML attachment was scrubbed... URL: From venkat83 at gmail.com Fri Mar 27 16:55:02 2009 From: venkat83 at gmail.com (Venkatraman S) Date: Fri, 27 Mar 2009 21:25:02 +0530 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: Message-ID: On Fri, Mar 27, 2009 at 8:56 PM, Mart S?mermaa wrote: > > > Usage: > >>> add_query_params('http://foo.com', a='b') > 'http://foo.com?a=b' > >>> add_query_params('http://foo.com?a=b', b='c', d='q') > 'http://foo.com?a=b&b=c&d=q' > > The real implementation should be more strict, e.g. raise on the > following: > >>> add_query_params('http://foo.com?a=b', a='b') > 'http://foo.com?a=b&a=b' > Well, this is not 'generic' - for eg. in Django sites the above would not be applicable. -V- http://twitter.com/venkasub -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrts.pydev at gmail.com Fri Mar 27 17:00:43 2009 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Fri, 27 Mar 2009 18:00:43 +0200 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: Message-ID: Why not? 2009/3/27 Venkatraman S > > On Fri, Mar 27, 2009 at 8:56 PM, Mart S?mermaa wrote: > >> >> >> Usage: >> >>> add_query_params('http://foo.com', a='b') >> 'http://foo.com?a=b' >> >>> add_query_params('http://foo.com?a=b', b='c', d='q') >> 'http://foo.com?a=b&b=c&d=q' >> >> The real implementation should be more strict, e.g. 
raise on the >> following: >> >>> add_query_params('http://foo.com?a=b', a='b') >> 'http://foo.com?a=b&a=b' >> > > Well, this is not 'generic' - for eg. in Django sites the above would not > be applicable. > > -V- > http://twitter.com/venkasub > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From venkat83 at gmail.com Fri Mar 27 17:05:51 2009 From: venkat83 at gmail.com (Venkatraman S) Date: Fri, 27 Mar 2009 21:35:51 +0530 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: Message-ID: On Fri, Mar 27, 2009 at 9:30 PM, Mart S?mermaa wrote: > Why not? > > 2009/3/27 Venkatraman S > >> >> On Fri, Mar 27, 2009 at 8:56 PM, Mart S?mermaa wrote: >> >>> >>> >>> Usage: >>> >>> add_query_params('http://foo.com', a='b') >>> 'http://foo.com?a=b' >>> >>> add_query_params('http://foo.com?a=b', b='c', d='q') >>> 'http://foo.com?a=b&b=c&d=q' >>> >>> The real implementation should be more strict, e.g. raise on the >>> following: >>> >>> add_query_params('http://foo.com?a=b', a='b') >>> 'http://foo.com?a=b&a=b' >>> >> >> Well, this is not 'generic' - for eg. in Django sites the above would not >> be applicable. >> > http://foo.com?a=b != http://foo.com/a/b . Semantically , both are same,but the framework rules are different. Not sure how you would this - by telling urllib that it is a 'pretty' django URL? (or am i missing out something?) -V- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From janssen at parc.com Fri Mar 27 17:13:57 2009 From: janssen at parc.com (Bill Janssen) Date: Fri, 27 Mar 2009 09:13:57 PDT Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: Message-ID: <19919.1238170437@parc.com> Mart S?mermaa wrote: > Appending query parameters to a URL is a very common need. However, there's > nothing in urllib.parse (and older urlparse) that caters for that need. > > Therefore, I propose adding the following to 2.7 and 3.1 in the respective > libs: > >>> add_query_params('http://foo.com?a=b', b='c', d='q') To begin with, I wouldn't use keyword params. They're syntactically more restrictive than the rules for application/x-www-form-urlencoded allow, so you start by ruling out whole classes of URLs. Bill From mrts.pydev at gmail.com Fri Mar 27 17:16:02 2009 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Fri, 27 Mar 2009 18:16:02 +0200 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: Message-ID: You are definitely "missing out something". For the use case you describe, there's already ulrjoin(). add_query_params() is for a different use case, i.e. it *complements* urljoin(). 2009/3/27 Venkatraman S > > On Fri, Mar 27, 2009 at 9:30 PM, Mart S?mermaa wrote: > >> Why not? >> >> 2009/3/27 Venkatraman S >> >>> >>> On Fri, Mar 27, 2009 at 8:56 PM, Mart S?mermaa wrote: >>> >>>> >>>> >>>> Usage: >>>> >>> add_query_params('http://foo.com', a='b') >>>> 'http://foo.com?a=b' >>>> >>> add_query_params('http://foo.com?a=b', b='c', d='q') >>>> 'http://foo.com?a=b&b=c&d=q' >>>> >>>> The real implementation should be more strict, e.g. raise on the >>>> following: >>>> >>> add_query_params('http://foo.com?a=b', a='b') >>>> 'http://foo.com?a=b&a=b' >>>> >>> >>> Well, this is not 'generic' - for eg. in Django sites the above would >>> not be applicable. >>> >> > > http://foo.com?a=b != http://foo.com/a/b > . 
> Semantically , both are same,but the framework rules are different. Not > sure how you would this - by telling urllib that it is a 'pretty' django > URL? (or am i missing out something?) > > -V- > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrts.pydev at gmail.com Fri Mar 27 17:17:52 2009 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Fri, 27 Mar 2009 18:17:52 +0200 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: <19919.1238170437@parc.com> References: <19919.1238170437@parc.com> Message-ID: On Fri, Mar 27, 2009 at 6:13 PM, Bill Janssen wrote: > Mart S?mermaa wrote: > > > Appending query parameters to a URL is a very common need. However, > there's > > nothing in urllib.parse (and older urlparse) that caters for that need. > > > > Therefore, I propose adding the following to 2.7 and 3.1 in the > respective > > libs: > > > >>> add_query_params('http://foo.com?a=b', b='c', d='q') > > To begin with, I wouldn't use keyword params. They're syntactically > more restrictive than the rules for application/x-www-form-urlencoded > allow, so you start by ruling out whole classes of URLs. > > Bill > Valid point, using an ordinary dict instead would resolve that (i.e. def add_query_params(url, param_dict)). -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rhamph at gmail.com Fri Mar 27 19:08:46 2009 From: rhamph at gmail.com (Adam Olsen) Date: Fri, 27 Mar 2009 12:08:46 -0600 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <87eiwjy7qc.fsf@xemacs.org> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <87eiwjy7qc.fsf@xemacs.org> Message-ID: On Thu, Mar 26, 2009 at 9:17 PM, Stephen J. Turnbull wrote: > Adam Olsen writes: > ?> "Not broken for small lists" implies it IS broken for large lists. > > You're being contentious. ?It logically implies no such thing, nor is > it idiomatically an implication among consenting adults. ?And in any > case, the phrasing I recommended is "guaranteed to have uniform > distribution of shuffles up to N". ?The implication of "no guarantee" > is "have a mechanic inspect it before you buy", not "this is a lemon". We'll have to agree to disagree there. The irony is that we only seed with 128 bits, so rather than 2**19937 combinations, there's just 2**128. That drops our "safe" list size down to 34. Weee! -- Adam Olsen, aka Rhamphoryncus From arnodel at googlemail.com Fri Mar 27 19:49:09 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Fri, 27 Mar 2009 18:49:09 +0000 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <19919.1238170437@parc.com> Message-ID: On 27 Mar 2009, at 16:17, Mart S?mermaa wrote: > On Fri, Mar 27, 2009 at 6:13 PM, Bill Janssen > wrote: > Mart S?mermaa wrote: > > > Appending query parameters to a URL is a very common need. > However, there's > > nothing in urllib.parse (and older urlparse) that caters for that > need. 
> > > > Therefore, I propose adding the following to 2.7 and 3.1 in the > respective > > libs: > > > >>> add_query_params('http://foo.com?a=b', b='c', d='q') > > To begin with, I wouldn't use keyword params. They're syntactically > more restrictive than the rules for application/x-www-form-urlencoded > allow, so you start by ruling out whole classes of URLs. > > Bill > > Valid point, using an ordinary dict instead would resolve that (i.e. > def add_query_params(url, param_dict)). Note that it's still not general enough as query fields can be repeated, e.g. http://foo.com/search/?q=spam&q=eggs -- Arnaud From eric at trueblade.com Fri Mar 27 19:54:56 2009 From: eric at trueblade.com (Eric Smith) Date: Fri, 27 Mar 2009 13:54:56 -0500 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <19919.1238170437@parc.com> Message-ID: <49CD2100.3070502@trueblade.com> Arnaud Delobelle wrote: > > On 27 Mar 2009, at 16:17, Mart S?mermaa wrote: > >> On Fri, Mar 27, 2009 at 6:13 PM, Bill Janssen wrote: >> Mart S?mermaa wrote: >> >> > Appending query parameters to a URL is a very common need. However, >> there's >> > nothing in urllib.parse (and older urlparse) that caters for that need. >> > >> > Therefore, I propose adding the following to 2.7 and 3.1 in the >> respective >> > libs: >> >> > >>> add_query_params('http://foo.com?a=b', b='c', d='q') >> >> To begin with, I wouldn't use keyword params. They're syntactically >> more restrictive than the rules for application/x-www-form-urlencoded >> allow, so you start by ruling out whole classes of URLs. >> >> Bill >> >> Valid point, using an ordinary dict instead would resolve that (i.e. >> def add_query_params(url, param_dict)). > > Note that it's still not general enough as query fields can be repeated, > e.g. > > http://foo.com/search/?q=spam&q=eggs > It's also possible that the order matters. 
I think an iterable of tuples (such as returned by dict.items(), but any iterable will do) would be an okay interface. From jjb5 at cornell.edu Fri Mar 27 20:29:52 2009 From: jjb5 at cornell.edu (Joel Bender) Date: Fri, 27 Mar 2009 15:29:52 -0400 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: <49CD2100.3070502@trueblade.com> References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> Message-ID: <49CD2930.4080307@cornell.edu> > It's also possible that the order matters. I think an iterable of tuples > (such as returned by dict.items(), but any iterable will do) would be an > okay interface. Ordered dict then :-) From jared.grubb at gmail.com Fri Mar 27 21:46:18 2009 From: jared.grubb at gmail.com (Jared Grubb) Date: Fri, 27 Mar 2009 13:46:18 -0700 Subject: [Python-ideas] [Python-Dev] Grammar for plus and minus unary ops In-Reply-To: References: Message-ID: <2D10F39F-D5E3-4C72-B5A5-89ED4A351610@gmail.com> (This is a reply to Joe's post on python-dev) That looks like a good solution. The downside I see with your rules is that combinations like "~+~-~ +~-" would still be valid, but if people want to write obfuscated code, there are always ways to do it. Forbidding the examples that you gave (and the ones I gave) is still a positive move, in my opinion. Jared On 27 Mar 2009, at 12:15, Joe Smith wrote: > Jared Grubb wrote: >> I'm not a EBNF expert, but it seems that we could modify the >> grammar to be more restrictive so the above code would not be >> silently valid. E.g., "++5" and "1+++5" and "1+-+5" are syntax >> errors, but still keep "1++5", "1+-5", "1-+5" as valid. (Although, >> '~' throws in a kink... should '~-5' be legal? Seems so...) 
> > So you want something like > u_expr ::= > power | "-" xyzzy_expr | "+" xyzzy_expr | "~" u_expr > xyzzy_expr ::= > power | "~" u_expr > > Such that: > 5 # valid u_expr > +5 # valid u_expr > -5 # valid u_expr > ~5 # valid u_expr > ~~5 # valid u_expr > ~+5 # valid u_expr > +~5 # valid u_expr > ~-5 # valid u_expr > -~5 # valid u_expr > +~-5# valid u_expr > > ++5 # not valid u_expr > +-5 # not valid u_expr > -+5 # not valid u_expr > --5 # not valid u_expr > > While, I'm not a python developer, (just a python user) that sounds > reasonable to me, as long as this does not silently change the > meaning of any expression, but only noisily breaks programs, and > that the broken constructs are not used frequently. > > Can anybody come up with any expressions that would silently change > in meaning if the above were applied? > > Obviously a sane name would need to be chosen to replace xyzzy_expr. > From mrts.pydev at gmail.com Fri Mar 27 22:00:47 2009 From: mrts.pydev at gmail.com (Mart Sõmermaa) Date: Fri, 27 Mar 2009 23:00:47 +0200 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: <49CD2930.4080307@cornell.edu> References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> Message-ID: As far as I can see, people tend to agree that this is useful. So, unless someone steps up to oppose this, I'll file a feature request to the Python bug tracker. Will propose an implementation based on ordered dict (that will be in 2.7/3.1 anyway). On Fri, Mar 27, 2009 at 9:29 PM, Joel Bender wrote: > It's also possible that the order matters. I think an iterable of tuples >> (such as returned by dict.items(), but any iterable will do) would be an >> okay interface.
>> > Ordered dict then :-) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jared.grubb at gmail.com Fri Mar 27 22:07:16 2009 From: jared.grubb at gmail.com (Jared Grubb) Date: Fri, 27 Mar 2009 14:07:16 -0700 Subject: [Python-ideas] Grammar for plus and minus unary ops Message-ID: (Originally posted to python-dev, discussion moved here per request by GvR; also fixed pseudo-code to not use a keyword as local var) Begin forwarded message: > > I was recently reviewing some Python code for a friend who is a C++ > programmer, and he had code something like this: > > def foo(): > attempt = 0 > while attempt < MAX_TRIES: > ret = bar() > if ret: break > ++attempt > > I was a bit surprised that this was syntactically valid, and because > the timeout condition only occurred in exceptional cases, the error > has not yet caused any problems. > > It appears that the grammar treats the above example as the unary + > op applied twice: > > u_expr ::= > power | "-" u_expr > | "+" u_expr | "~" u_expr > > Playing in the interpreter, expressions like "1+++++++++5" and "1+- > +-+-+-+-+-5" evaluate to 6. > > I'm not an EBNF expert, but it seems that we could modify the grammar > to be more restrictive so the above code would not be silently > valid. E.g., "++5" and "1+++5" and "1+-+5" are syntax errors, but > still keep "1++5", "1+-5", "1-+5" as valid. (Although, '~' throws in > a kink... should '~-5' be legal? Seems so...) > > Jared > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From guido at python.org Fri Mar 27 22:27:43 2009 From: guido at python.org (Guido van Rossum) Date: Fri, 27 Mar 2009 16:27:43 -0500 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> Message-ID: 2009/3/27 Mart S?mermaa : > As far as I can see, people tend to agree that this is useful. So, unless > someone steps up to oppose this, I'll file a feature request to the Python > bug tracker. > > Will propose an implementation based on ordered dict (that will be in > 2.7/3.1 anyway). I hope by this you mean you'll provide a patch! -- --Guido van Rossum (home page: http://www.python.org/~guido/) From tjreedy at udel.edu Sat Mar 28 00:36:20 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 27 Mar 2009 19:36:20 -0400 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: <49CD2930.4080307@cornell.edu> References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> Message-ID: Joel Bender wrote: >> It's also possible that the order matters. I think an iterable of >> tuples (such as returned by dict.items(), but any iterable will do) >> would be an okay interface. > > Ordered dict then :-) But that, unlike iterable of tuples, would exclude repeated fields, as in Arnaud's example >Note that it's still not general enough as query fields can be repeated, e.g. 
>http://foo.com/search/?q=spam&q=eggs tjr From greg.ewing at canterbury.ac.nz Sat Mar 28 00:46:57 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 28 Mar 2009 11:46:57 +1200 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <49CCBC71.7070905@improva.dk> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> <49CCAE02.8@improva.dk> <49CCB2DD.30706@canterbury.ac.nz> <49CCBC71.7070905@improva.dk> Message-ID: <49CD6571.1010408@canterbury.ac.nz> Jacob Holm wrote: > Each time I reseed from a truly random source, If you have a "truly random source" on hand, then you have an infinite amount of entropy available and there is no problem. Just feed your truly random numbers straight into the shuffling algorithm. We're talking about the case where you *don't* have truly random numbers, but only a PRNG with a limited amount of internal state. -- Greg From tjreedy at udel.edu Sat Mar 28 00:49:45 2009 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 27 Mar 2009 19:49:45 -0400 Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: References: Message-ID: >> I was recently reviewing some Python code for a friend who is a C++ >> programmer, and he had code something like this: >> >> def foo(): >> attempt = 0 >> while attempt < MAX_TRIES: >> ret = bar() >> if ret: break >> ++attempt >> >> I was a bit surprised that this was syntactically valid, and because >> the timeout condition only occurred in exceptional cases, the error >> has not yet caused any problems. A complete test suite would include such a case ;-).
>> It appears that the grammar treats the above example as the unary + op >> applied twice: >> >> u_expr ::= >> power | "-" u_expr >> | "+" u_expr | "~" u_expr >> >> Playing in the interpreter, expressions like "1+++++++++5" and >> "1+-+-+-+-+-+-5" evaluate to 6. >> >> I'm not an EBNF expert, but it seems that we could modify the grammar >> to be more restrictive so the above code would not be silently valid. >> E.g., "++5" and "1+++5" and "1+-+5" are syntax errors, but still keep >> "1++5", "1+-5", "1-+5" as valid. (Although, '~' throws in a kink... >> should '~-5' be legal? Seems so...) -1 1) This would be a petty, gratuitous restriction that would only complicate the language and make it harder to learn for no real gain. 2) It could break code. + ob maps to type(ob).__pos__(ob), which could do anything, and not necessarily just return ob as you are assuming. 3) Consider eval('+' + somecode). Suppose somecode happens to start with '+'. Currently the redundancy is harmless if not meaningful. In summary, I think the following applies here: "Special cases aren't special enough to break the rules."
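Terry's second point is easy to demonstrate: unary plus is a real operation that dispatches to `__pos__`, not a no-op. A small illustrative sketch (the `Count` class is mine, not from the thread):

```python
class Count:
    """Counts how many times unary + is applied to an instance."""
    def __init__(self):
        self.pos_calls = 0

    def __pos__(self):
        # each unary + dispatches here; it need not return self unchanged
        self.pos_calls += 1
        return self

c = Count()
++c                  # parsed as +(+c): __pos__ runs twice
print(c.pos_calls)   # 2
```

So a grammar change forbidding `++` would silently outlaw a construct that today has well-defined, overridable semantics.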
Terry Jan Reedy From greg.ewing at canterbury.ac.nz Sat Mar 28 00:58:05 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 28 Mar 2009 11:58:05 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CCC3C3.4050100@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> <49CB5B25.4070105@gmail.com> <49CC5D94.7000609@canterbury.ac.nz> <49CCAFB9.1090800@gmail.com> <49CCB3A9.40300@canterbury.ac.nz> <49CCC3C3.4050100@gmail.com> Message-ID: <49CD680D.2020502@canterbury.ac.nz> Nick Coghlan wrote: > The part that > isn't clicking for me is that I still don't understand *why* 'yield > from' should include implicit finalisation as part of its definition. > > It's the generalisation of that to all other iterators that happen to > offer a close() method that seems somewhat arbitrary. It's a matter of opinion. I would find it surprising if generators behaved differently from all other iterators in this respect. It would be un-ducktypish. I think we need a BDFL opinion to settle this one. 
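The asymmetry under debate, generators having a close() method that triggers finalisation while ordinary iterators typically have no close() at all, can be seen directly (a minimal sketch):

```python
def gen():
    try:
        yield 1
    finally:
        print("finalised")           # runs when the generator is closed

g = gen()
next(g)
g.close()                            # generators always provide close()

it = iter([1, 2, 3])
print(hasattr(it, "close"))          # False: list iterators have no close()
```

Making "yield from" call close() on any sub-iterator that happens to have one would extend the generator convention to arbitrary iterators by duck typing, which is exactly the point of disagreement.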
-- Greg From george.sakkis at gmail.com Sat Mar 28 01:28:59 2009 From: george.sakkis at gmail.com (George Sakkis) Date: Fri, 27 Mar 2009 20:28:59 -0400 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> Message-ID: <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> On Fri, Mar 27, 2009 at 7:36 PM, Terry Reedy wrote: > Joel Bender wrote: >>> >>> It's also possible that the order matters. I think an iterable of tuples >>> (such as returned by dict.items(), but any iterable will do) would be an >>> okay interface. >> >> Ordered dict then :-) >> But that, unlike iterable of tuples, would exclude repeated fields, as in >> Arnaud's example >> >>Note that it's still not general enough as query fields can be repeated, >> e.g. >>http://foo.com/search/?q=spam&q=eggs Repeated fields can be packed together in a tuple/list: add_query_params('http://foo.com', dict(q=('spam', 'eggs'))) To which one might reply that this would exclude non-consecutive repeated fields, e.g. '?q=spam&foo=bar&q=eggs'.
To which I would reply that for this 0.01% of cases that require this (a) do it by hand as now or (b) use the same signature as dict() (plus the host in the beginning): add_query_params(host, mapping_or_iterable=None, **params) George From jh at improva.dk Sat Mar 28 01:36:25 2009 From: jh at improva.dk (Jacob Holm) Date: Sat, 28 Mar 2009 01:36:25 +0100 Subject: [Python-ideas] About adding a new iteratormethodcalled "shuffled" In-Reply-To: <49CD6571.1010408@canterbury.ac.nz> References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <59a221a0903261319h478cf89al66793c760c97cf9@mail.gmail.com> <49CCAE02.8@improva.dk> <49CCB2DD.30706@canterbury.ac.nz> <49CCBC71.7070905@improva.dk> <49CD6571.1010408@canterbury.ac.nz> Message-ID: <49CD7109.9040704@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: >> Each time I reseed from a truly random source, > > If you have a "truly random source" on hand, then > you have an infinite amount of entropy available > and there is no problem. Just feed your truly > random numbers straight into the shuffling > algorithm. Of course. > > We're talking about the case where you *don't* > have truly random numbers, but only a PRNG with a > limited amount of internal state. > As it happens, you don't really need the random source. As long as the set of shuffles you can get after reseeding generates the full set of permutations, all you need is to reseed in a way that will eventually have used all long enough sequences of possible seed values. No this is not even remotely practical, and it has very little to do with randomness, but I think I said that right from the start. I was just reacting to the statement that you wouldn't be able to generate all permutations using shuffle+reseed. You almost certainly can, but it is a silly thing to do. 
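The arithmetic behind these period limits is easy to check. With B bits of state there are at most 2**B reachable shuffles, so every permutation of an n-element list can be reachable only if n! <= 2**B; a quick sketch (the helper name is mine):

```python
def max_fully_shuffleable(period_bits):
    """Largest n such that n! <= 2**period_bits, i.e. the longest
    sequence whose permutations the state space can even cover."""
    limit = 2 ** period_bits
    n, fact = 1, 1
    while fact * (n + 1) <= limit:
        n += 1
        fact *= n
    return n

print(max_fully_shuffleable(128))    # 34, Adam's "safe" size for a 128-bit seed
print(max_fully_shuffleable(19937))  # 2080 for the full Mersenne Twister period
```

This is only a counting bound, of course; it says nothing about the uniformity of the permutations that are reachable.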
If you want large random permutations, you need a PRNG with an *extremely* long period, and if you have that there is no need for repeated shuffles. - Jacob From guido at python.org Sat Mar 28 03:26:44 2009 From: guido at python.org (Guido van Rossum) Date: Fri, 27 Mar 2009 21:26:44 -0500 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> Message-ID: There's way too much bikeshedding in this thread (not picking on you specifically). I think the originally proposed API is fine, except it should *not* reject duplicates. To add duplicates you'd just call it multiple times, e.g. add_query_params(add_query_params(url, a='x'), a='y'). It's a pretty minor use case anyways. --Guido On Fri, Mar 27, 2009 at 7:28 PM, George Sakkis wrote: > On Fri, Mar 27, 2009 at 7:36 PM, Terry Reedy wrote: > >> Joel Bender wrote: >>>> >>>> It's also possible that the order matters. I think an iterable of tuples >>>> (such as returned by dict.items(), but any iterable will do) would be an >>>> okay interface. >>> >>> Ordered dict then :-) >> >> But that, unlike iterable of tuples, would exclude repeated fields, as in >> Arnaud's example >> >>>Note that it's still not general enough as query fields can be repeated, >>> e.g. >>>http://foo.com/search/?q=spam&q=eggs > > Repeated fields can be packed together in a tuple/list: > > add_query_params('http://foo.com', dict(q=('spam', 'eggs'))) > > To which one might reply that this would exclude non-consecutive > repeated fields,e g. '?q=spam&foo=bar&q=eggs. 
> > To which I would reply that for this 0.01% of cases that require this > (a) do it by hand as now or (b) use the same signature as dict() (plus > the host in the beginning): > > add_query_params(host, mapping_or_iterable=None, **params) > > George > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From dreamingforward at gmail.com Sat Mar 28 03:33:49 2009 From: dreamingforward at gmail.com (average) Date: Fri, 27 Mar 2009 19:33:49 -0700 Subject: [Python-ideas] Builtin test function Message-ID: <913f9f570903271933s44b0b646ge0f37e8ce97d028@mail.gmail.com> >>> There's been some discussion about automatic test discovery lately. >>> Here's a random (not in any way thought through) idea: add a builtin >>> function test() that runs tests associated with a given function, >>> class, module, or object. >> >> Improved testing is always welcome, but why a built-in? >> >> I know testing is important, but is it so common and important that we >> need it at our fingertips, so to speak, and can't even import a module >> first before running tests? What's the benefit to making it a built-in >> instead of part of a test module? > > The advantage would be a uniform and very simple interface for testing any > module, without having to know whether I should import doctest, > unittest or something else (and having to remember the commands > used by each framework). It would certainly not be a replacement for more > advanced test frameworks. By making it a builtin it's also pointing out to users that code-testing is an important part of the python culture (as well as good development practice). It may seem easy "just to do a module import and then run the imported test function", but such a construct says that testing is just an optional thing among many dozens of modules within python. 
As for a name, Guido's criticism aside, I do like it spelled test() with usage very much similar to the builtin help() function--both would be accessing the same docstrings but for two different purposes. I think it would add a lot of encouragement for the use of doctest (one of my favorites) as well as facilitate good test-driven development. And, regarding the name, if any function deserves the name test() it would be this builtin--all others would necessarily be secondary. But if there's rancor regarding the name, call it testdoc() or something. Personally, I'm +2 on the idea, but that may only be in cents.... marcos PS. Add test() to the GSoC suggestion of improving doctest with scope-aware doc-test variables (for easing setup code between module->class->method docs). From bruce at leapyear.org Sat Mar 28 04:16:05 2009 From: bruce at leapyear.org (Bruce Leban) Date: Fri, 27 Mar 2009 20:16:05 -0700 Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: References: Message-ID: If you want to make sure that you can't use ++ or --, then target that directly: add ++ and -- as tokens to the language and make them always illegal. While that might be a bit of a kludge, I think it's far better than adding a complicated rule that you can't put two + or two - in a row. And if you're worried about eval('+' + somecode), you've got three choices: (1) leave out the '+' because it has no effect; (2) write eval('+ ' + somecode) and (3) are you sure you really want to use eval? --- Bruce On Fri, Mar 27, 2009 at 4:49 PM, Terry Reedy wrote: > > I was recently reviewing some Python code for a friend who is a C++ >>> programmer, and he had code something like this: >>> >>> def foo(): >>> attempt = 0 >>> while attempt < MAX_TRIES: >>> ret = bar() >>> if ret: break >>> ++attempt >>> >>> I was a bit surprised that this was syntactically valid, and because the >>> timeout condition only occurred in exceptional cases, the error has not yet >>> caused any problems.
>>> > A complete test suite would include such a case ;-). > > It appears that the grammar treats the above example as the unary + op >>> applied twice: >>> >>> u_expr ::= >>> power | "-" u_expr >>> | "+" u_expr | "~" u_expr >>> >>> Playing in the interpreter, expressions like "1+++++++++5" and >>> "1+-+-+-+-+-+-5" evaluate to 6. >>> >>> I'm not an EBNF expert, but it seems that we could modify the grammar to >>> be more restrictive so the above code would not be silently valid. E.g., >>> "++5" and "1+++5" and "1+-+5" are syntax errors, but still keep "1++5", >>> "1+-5", "1-+5" as valid. (Although, '~' throws in a kink... should '~-5' be >>> legal? Seems so...) >>> >> > -1 > > 1) This would be a petty, gratuitous restriction that would only complicate > the language and make it harder to learn for no real gain. > > 2) It could break code. + ob maps to type(ob).__pos__(ob), which could do > anything, and not necessarily just return ob as you are assuming. > > 3) Consider eval('+' + somecode). Suppose somecode happens to start with > '+'. Currently the redundancy is harmless if not meaningful. > > In summary, I think the following applies here: > "Special cases aren't special enough to break the rules." > > Terry Jan Reedy > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sat Mar 28 05:13:20 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 28 Mar 2009 15:13:20 +1100 Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: References: Message-ID: <200903281513.21029.steve@pearwood.info> Moved from python-dev to python-ideas.
On Sat, 28 Mar 2009 04:19:46 am Jared Grubb wrote: > I was recently reviewing some Python code for a friend who is a C++ > programmer, and he had code something like this: > > def foo(): > try = 0 > while try < MAX_TRIES: > ret = bar() > if ret: break > ++try > > I was a bit surprised that this was syntactically valid, You shouldn't be. Unary operators are inspired by the equivalent mathematical unary operators. ... > It appears that the grammar treats the above example as the unary + > op applied twice: As it should. ... > I'm not an EBNF expert, but it seems that we could modify the grammar > to be more restrictive so the above code would not be silently valid. > E.g., "++5" and "1+++5" and "1+-+5" are syntax errors, but still keep > "1++5", "1+-5", "1-+5" as valid. (Although, '~' throws in a kink... > should '~-5' be legal? Seems so...) Why would we want to do this? I'm sure there are plenty of other syntax constructions in Python which just happen to look like something from other languages, but have a different meaning. Do we have to chase our tails removing every possible syntactically valid string in Python that has a different meaning in some other language? Or is C++ somehow special that we treat it differently from all the other languages? Not only is this a self-inflicted error (writing C++ code in a Python program is a PEBCAK error), but it's rare: it only affects a minority of C++ programmers, and they are only a minority of Python programmers. There's no need to complicate the grammar to prevent this sort of error. Keep it simple. ---1 on the proposal (*grin*).
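How the parser actually reads these strings can be inspected from Python itself:

```python
import ast

# 1+++5 parses as (1) + (+(+5)): one binary +, then two unary +
tree = ast.parse("1+++5", mode="eval").body
print(type(tree).__name__)        # BinOp
print(type(tree.right).__name__)  # UnaryOp (wrapping another UnaryOp)
print(eval("1+++5"))              # 6
```

So there is no ambiguity in the grammar today; the question is purely whether the surprising-but-well-defined reading should be forbidden.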
-- Steven D'Aprano From dickinsm at gmail.com Sat Mar 28 05:26:59 2009 From: dickinsm at gmail.com (Mark Dickinson) Date: Sat, 28 Mar 2009 04:26:59 +0000 Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: <200903281513.21029.steve@pearwood.info> References: <200903281513.21029.steve@pearwood.info> Message-ID: <5c6f2a5d0903272126n4eb6c47eqabff333991c4487b@mail.gmail.com> Does PyChecker check for uses of '--' and '++'? That would seem like the obvious place to have such a check. Mark From aahz at pythoncraft.com Sat Mar 28 05:59:44 2009 From: aahz at pythoncraft.com (Aahz) Date: Fri, 27 Mar 2009 21:59:44 -0700 Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: <200903281513.21029.steve@pearwood.info> References: <200903281513.21029.steve@pearwood.info> Message-ID: <20090328045944.GA14415@panix.com> On Sat, Mar 28, 2009, Steven D'Aprano wrote: > > Not only is this a self-inflicted error (writing C++ code in a Python > program is a PEBCAK error), but it's rare: it only affects a minority > of C++ programmers, and they are only a minority of Python programmers. > There's no need to complicate the grammar to prevent this sort of > error. Keep it simple. ---1 on the proposal (*grin*). In all fairness, "++" is valid in many C-derived languages, so it hits C and C++ programmers, plus Ruby and Perl programmers. I'm not in favor of this restriction, but I'm not opposed, either, and I think your thesis is invalid. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "At Resolver we've found it useful to short-circuit any doubt and just refer to comments in code as 'lies'. 
:-)" --Michael Foord paraphrases Christian Muirhead on python-dev, 2009-3-22 From leif.walsh at gmail.com Sat Mar 28 06:12:02 2009 From: leif.walsh at gmail.com (Leif Walsh) Date: Sat, 28 Mar 2009 01:12:02 -0400 (EDT) Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: <5c6f2a5d0903272126n4eb6c47eqabff333991c4487b@mail.gmail.com> Message-ID: 2009/3/28 Mark Dickinson : > Does PyChecker check for uses of '--' and '++'? That > would seem like the obvious place to have such a check. +--1 ;-) -- Cheers, Leif -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: OpenPGP digital signature URL: From arnodel at googlemail.com Sat Mar 28 08:54:03 2009 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Sat, 28 Mar 2009 07:54:03 +0000 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: <49CD2930.4080307@cornell.edu> References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> Message-ID: <3230F12B-2FC3-42C9-B8B5-70829BC9C5C5@googlemail.com> On 27 Mar 2009, at 19:29, Joel Bender wrote: >> It's also possible that the order matters. I think an iterable of >> tuples (such as returned by dict.items(), but any iterable will do) >> would be an okay interface. > > Ordered dict then :-) Why not use the same signature as dict.update()? update(...) D.update(E, **F) -> None.
Update D from E and F: for k in E: D[k] = E[k] (if E has keys else: for (k, v) in E: D[k] = v) then: for k in F: D[k] = F[k] -- Arnaud From ncoghlan at gmail.com Sat Mar 28 09:16:07 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Mar 2009 18:16:07 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CD680D.2020502@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CAAA0F.7090507@canterbury.ac.nz> <49CAE0DB.3090104@gmail.com> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> <49CB5B25.4070105@gmail.com> <49CC5D94.7000609@canterbury.ac.nz> <49CCAFB9.1090800@gmail.com> <49CCB3A9.40300@canterbury.ac.nz> <49CCC3C3.4050100@gmail.com> <49CD680D.2020502@canterbury.ac.nz> Message-ID: <49CDDCC7.6050105@gmail.com> Greg Ewing wrote: > Nick Coghlan wrote: >> The part that >> isn't clicking for me is that I still don't understand *why* 'yield >> from' should include implicit finalisation as part of its definition. >> >> It's the generalisation of that to all other iterators that happen to >> offer a close() method that seems somewhat arbitrary. > > It's a matter of opinion. I would find it surprising if > generators behaved differently from all other iterators > in this respect. It would be un-ducktypish. > > I think we need a BDFL opinion to settle this one. 
It's still your PEP, so unless Guido objects to your preference, I'll cope - I suspect either approach can be explained easily enough in the documentation. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Sat Mar 28 09:19:40 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Mar 2009 18:19:40 +1000 Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: References: Message-ID: <49CDDD9C.3050102@gmail.com> Bruce Leban wrote: > And if you're worried about eval('+' + somecode), you've got three > choices: (1) leave out the '+' because it has no effect That's not true for all data types - Decimal is the one that comes to mind as having a significant use for unary '+' (specifically, it is used to say "round to currently defined precision, but otherwise leave the value alone") Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Sat Mar 28 10:03:40 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 28 Mar 2009 19:03:40 +1000 Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: <5c6f2a5d0903272126n4eb6c47eqabff333991c4487b@mail.gmail.com> References: <200903281513.21029.steve@pearwood.info> <5c6f2a5d0903272126n4eb6c47eqabff333991c4487b@mail.gmail.com> Message-ID: <49CDE7EC.2090706@gmail.com> Mark Dickinson wrote: > Does PyChecker check for uses of '--' and '++'? That > would seem like the obvious place to have such a check. Yep, sounds like a pychecker/pylint kind of problem to me as well. Cheers, Nick. 
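Nick's Decimal example is concrete: unary plus is the documented way to apply the current context's rounding without otherwise changing the value:

```python
from decimal import Decimal, getcontext

getcontext().prec = 4            # 4 significant digits of context precision
x = Decimal("3.14159265")
print(+x)    # 3.142 -- unary + rounds to the context precision
print(x)     # 3.14159265 -- the operand itself is untouched
```

So stripping a leading '+' as "having no effect" would silently change the meaning of existing Decimal code.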
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From denis.spir at free.fr Sat Mar 28 10:18:26 2009 From: denis.spir at free.fr (spir) Date: Sat, 28 Mar 2009 10:18:26 +0100 Subject: [Python-ideas] Builtin test function In-Reply-To: <913f9f570903271933s44b0b646ge0f37e8ce97d028@mail.gmail.com> References: <913f9f570903271933s44b0b646ge0f37e8ce97d028@mail.gmail.com> Message-ID: <20090328101826.337907bc@o> Le Fri, 27 Mar 2009 19:33:49 -0700, average s'exprima ainsi: > >>> There's been some discussion about automatic test discovery lately. > >>> Here's a random (not in any way thought through) idea: add a builtin > >>> function test() that runs tests associated with a given function, > >>> class, module, or object. > >> > >> Improved testing is always welcome, but why a built-in? > >> > >> I know testing is important, but is it so common and important that we > >> need it at our fingertips, so to speak, and can't even import a module > >> first before running tests? What's the benefit to making it a built-in > >> instead of part of a test module? > > > > The advantage would be a uniform and very simple interface for testing any > > module, without having to know whether I should import doctest, > > unittest or something else (and having to remember the commands > > used by each framework). It would certainly not be a replacement for more > > advanced test frameworks. > > By making it a builtin it's also pointing out to users that > code-testing is an important part of the python culture (as well as > good development practice). It may seem easy "just to do a module > import and then run the imported test function", but such a construct > says that testing is just an optional thing among many dozens of > modules within python. Really true for me. Also, I think python needs a standard method for testing. As well as for doc-ing. 
[But I'm not sure that pseudo-strings are the best format to store test information (idem for doc). I would prefer specialized types -- maybe subtype of string.] I really support the idea because I feel personally concerned: would probably do a more systematic use of tests if there were a (well thought / straightforward / *clear*) builtin standard. > As for a name, Guido's criticism aside, I do like like it spelled > test() with usage very much similar to the builtin help() > function--both would be accessing the same docstrings but for two > different purposes. I think it would add a lot of encouragement for > the use of doctest (one of my favorites) as well as facilitate good > test-driven development. And, regarding the name, if any function > deserves the name test() it would be this builtin--all others would > necessarily be secondary. But if there's rancor regarding the name, > call it testdoc() or something. The analogy with help() sounds sensible. A builtin/standard testing func should definitely be called test(). *Other* test methods should use another name or be prefixed with a module name. Now, we must also cope with existing code. The name should not imply that it's a special method. Maybe runtest() or check()? > Personally, I'm +2 on the idea, but that may only be in cents.... > > marcos Denis ------ la vita e estrany From foobarmus at gmail.com Sat Mar 28 11:06:13 2009 From: foobarmus at gmail.com (Mark Donald) Date: Sat, 28 Mar 2009 18:06:13 +0800 Subject: [Python-ideas] suggestion for try/except program flow Message-ID: In this situation, which happens to me fairly frequently...
try: try: raise Cheese except Cheese, e: # handle cheese raise except: # handle all manner of stuff, including cheese ...it would be nice (& more readable) if one were able to recatch a named exception with the generic (catch-all) except clause of its own try, something like this: try: raise Cheese except Cheese, e: # handle cheese recatch except: # handle all manner of stuff, including cheese From guido at python.org Sat Mar 28 12:56:50 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 28 Mar 2009 06:56:50 -0500 Subject: [Python-ideas] suggestion for try/except program flow In-Reply-To: References: Message-ID: On Sat, Mar 28, 2009 at 5:06 AM, Mark Donald wrote: > In this situation, which happens to me fairly frequently... > > try: >   try: >     raise Cheese >   except Cheese, e: >     # handle cheese >     raise > except: >   # handle all manner of stuff, including cheese > > ...it would be nice (& more readable) if one were able to recatch a > named exception with the generic (catch-all) except clause of its own > try, something like this: > > try: >   raise Cheese > except Cheese, e: >   # handle cheese >   recatch > except: >   # handle all manner of stuff, including cheese I'm not sure recatch is all that more reasonable -- it's another fairly obscure control flow verb. I think the current situation isn't so bad.
Nick already pointed out an idiom for doing this without two try clauses: except BaseException as e: if isinstance(e, Cheese): # handle cheese # handle all manner of stuff, including cheese -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jh at improva.dk Sat Mar 28 14:44:31 2009 From: jh at improva.dk (Jacob Holm) Date: Sat, 28 Mar 2009 14:44:31 +0100 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CC5D85.30409@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C2E214.1040003@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> Message-ID: <49CE29BF.3040502@improva.dk> Hi Greg There seems to be another issue with GeneratorExit in the latest expansion (reproduced below). 
Based on the inlining/refactoring principle, I would expect the following code: def inner(): try: yield 1 yield 2 yield 3 except GeneratorExit: val = 'closed' else: val = 'exhausted' return val.upper() def outer(): val = yield from inner() print val To be equivalent to this: def outer(): try: yield 1 yield 2 yield 3 except GeneratorExit: val = 'closed' else: val = 'exhausted' val = val.upper() print val However, with the current expansion they are different. Only the version not using "yield from" will print "CLOSED" in this case: g = outer() g.next() # prints 1 g.close() # should print "CLOSED", but doesn't because the GeneratorExit is reraised by yield-from I currently don't think that a special case for GeneratorExit is needed. Can you give me an example showing that it is? - Jacob ------------------------------------------------------------------------ _i = iter(EXPR) try: _y = _i.next() except StopIteration, _e: _r = _e.value else: while 1: try: _s = yield _y except: _m = getattr(_i, 'throw', None) if _m is not None: _x = sys.exc_info() try: _y = _m(*_x) except StopIteration, _e: if _e is _x[1] or isinstance(_x[1], GeneratorExit): raise else: _r = _e.value break else: _m = getattr(_i, 'close', None) if _m is not None: _m() raise else: try: if _s is None: _y = _i.next() else: _y = _i.send(_s) except StopIteration, _e: _r = _e.value break RESULT = _r From ptspts at gmail.com Sat Mar 28 15:44:52 2009 From: ptspts at gmail.com (=?ISO-8859-1?Q?P=E9ter_Szab=F3?=) Date: Sat, 28 Mar 2009 15:44:52 +0100 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 Message-ID: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> Hi, If Python had method decorators @final (meaning: it is an error to override this method in any subclass) and @override (meaning: it is an error not having this method in a superclass), I would use them in my projects (some of them approaching 20 000 lines of Python code) and I'll feel more confident writing 
object-oriented Python code. Java already has similar decorators or specifiers. Do you think it is a good idea to have these in Python? I've created a proof-of-concept implementation, which uses metaclasses, and it works in Python 2.4 and Python 2.5. See http://www.math.bme.hu/~pts/pobjects.py and http://www.math.bme.hu/~pts/pobjects_example.py Best regards, Péter From foobarmus at gmail.com Sat Mar 28 15:54:33 2009 From: foobarmus at gmail.com (Mark Donald) Date: Sat, 28 Mar 2009 22:54:33 +0800 Subject: [Python-ideas] suggestion for try/except program flow In-Reply-To: References: Message-ID: Unless something has changed in Python 3+, I believe Nick's idiom requires the generic handler code to be copied into a second except clause to achieve identical behaviour, as follows... except BaseException as e: if isinstance(e, Cheese): # handle cheese # handle all manner of stuff, including cheese except: # handle all manner of OTHER stuff in the same way ...which makes the nested try block more semantic. This can be tested by putting > raise "this is deprecated..." into the try block, which generates an error that is NOT caught by "except BaseException" however IS caught by "except". I realise that "recatch" may seem like a trivial suggestion, but I believe it would be a slight improvement, which means it's not trivial. Try/except statements are supposedly preferential to "untried" if statements that handle isolated cases, but if statements are a lot easier to imagine, so people use them all the time - style be damned. Each slight improvement in try/except is liable to make it more attractive to coders, the end result being better code (for example, the implementation of PEP 341 increased my team's use of try/except significantly, causing a huge improvement to code readability).
Imagine if you could do this: runny = True runnier_than_you_like_it = False raise Cheese except Cheese, e # handle cheese if runny: recatch as Camembert except Camembert, e # handle camembert if runnier_than_you_like_it: recatch else: uncatch # ie, else clause will be effective... except: # not much of a cheese shop, really is it? else: # negotiate vending of cheesy comestibles finally: # sally forth And, I'm not trying to be belligerent here, but before somebody says... Cheese(Exception) Camembert(Cheese) ...just please have a little think about it. Mark 2009/3/28 Guido van Rossum : > On Sat, Mar 28, 2009 at 5:06 AM, Mark Donald wrote: >> In this situation, which happens to me fairly frequently... >> >> try: >>   try: >>     raise Cheese >>   except Cheese, e: >>     # handle cheese >>     raise >> except: >>   # handle all manner of stuff, including cheese >> >> ...it would be nice (& more readable) if one were able to recatch a >> named exception with the generic (catch-all) except clause of its own >> try, something like this: >> >> try: >>   raise Cheese >> except Cheese, e: >>   # handle cheese >>   recatch >> except: >>   # handle all manner of stuff, including cheese > > I'm not sure recatch is all that more reasonable -- it's another > fairly obscure control flow verb. I think the current situation isn't > so bad. Nick already pointed out an idiom for doing this without two > try clauses: > > except BaseException as e: >   if isinstance(e, Cheese): >
# handle cheese >   # handle all manner of stuff, including cheese > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > From grosser.meister.morti at gmx.net Sat Mar 28 16:21:51 2009 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Sat, 28 Mar 2009 16:21:51 +0100 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> Message-ID: <49CE408F.1030005@gmx.net> Péter Szabó wrote: > Hi, > > If Python had method decorators @final (meaning: it is an error to > override this method in any subclass) and @override (meaning: it is an > error not having this method in a superclass), I would use them in my > projects (some of them approaching 20 000 lines of Python code) and > I'll feel more confident writing object-oriented Python code. Java > already has similar decorators or specifiers. Do you think it is a > good idea to have these in Python? > > I've created a proof-of-concept implementation, which uses > metaclasses, and it works in Python 2.4 an Python 2.5. See > http://www.math.bme.hu/~pts/pobjects.py and > http://www.math.bme.hu/~pts/pobjects_example.py > > Best regards, > > Péter +1 on the idea. however, using a metaclass would be too limiting imho. can you implement it in a different way? a lot of things people use metaclasses for work perfectly fine without them (instead use a superclass that overrides __new__ or similar).
-apnzi From aahz at pythoncraft.com Sat Mar 28 16:40:09 2009 From: aahz at pythoncraft.com (Aahz) Date: Sat, 28 Mar 2009 08:40:09 -0700 Subject: [Python-ideas] About adding a new iterator method called "shuffled" In-Reply-To: References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <87eiwjy7qc.fsf@xemacs.org> Message-ID: <20090328154008.GB7421@panix.com> On Fri, Mar 27, 2009, Adam Olsen wrote: > > The irony is that we only seed with 128 bits, so rather than 2**19937 > combinations, there's just 2**128. That drops our "safe" list size > down to 34. Weee! That's probably worth a bug report or RFE if one doesn't already exist. -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "At Resolver we've found it useful to short-circuit any doubt and just refer to comments in code as 'lies'. :-)" --Michael Foord paraphrases Christian Muirhead on python-dev, 2009-3-22 From Scott.Daniels at Acm.Org Sat Mar 28 19:32:19 2009 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sat, 28 Mar 2009 11:32:19 -0700 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> Message-ID: Péter Szabó wrote: > Hi, > > If Python had method decorators @final (meaning: it is an error to > override this method in any subclass) and @override (meaning: it is an > error not having this method in a superclass), I would use them in my > projects (some of them approaching 20 000 lines of Python code) and > I'll feel more confident writing object-oriented Python code. Java > already has similar decorators or specifiers. Do you think it is a > good idea to have these in Python?
> > I've created a proof-of-concept implementation, which uses > metaclasses, and it works in Python 2.4 an Python 2.5. See > http://www.math.bme.hu/~pts/pobjects.py and > http://www.math.bme.hu/~pts/pobjects_example.py I have no idea why you want these, and severe trepidation about dealing with code that uses them "just to be safe." It smacks of the over-use I see of doubled underscores. For @override, just because you've built a base class for one kind of object does not mean I have not thought of an interesting way to use 40% of your code to accomplish my own end. Why make me cut and paste? You are not responsible for the correctness of my flea-brained idea whether I inherit from your class or not. For @final, "how dare you" for similar reasons. Java at least has an excuse (compilation can proceed differently). --Scott David Daniels Scott.Daniels at Acm.Org From jared.grubb at gmail.com Sat Mar 28 20:26:33 2009 From: jared.grubb at gmail.com (Jared Grubb) Date: Sat, 28 Mar 2009 12:26:33 -0700 Subject: [Python-ideas] Grammar for plus and minus unary ops In-Reply-To: <200903281513.21029.steve@pearwood.info> References: <200903281513.21029.steve@pearwood.info> Message-ID: On 27 Mar 2009, at 21:13, Steven D'Aprano wrote: >> >> I was a bit surprised that this was syntactically valid, > > You shouldn't be. Unary operators are inspired by the equivalent > mathematical unary operators. > ..... > Why would we want to do this? I'm sure there are plenty of other > syntax > constructions in Python which just happen to look like something from > other languages, but have a different meaning. Do we have to chase our > tails removing every possible syntactically valid string in Python > that > has a different meaning in some other language? Or is C++ somehow > special that we treat it differently from all the other languages? 
> > Not only is this a self-inflicted error (writing C++ code in a Python > program is a PEBCAK error), but it's rare: it only affects a minority > of C++ programmers, and they are only a minority of Python > programmers. > There's no need to complicate the grammar to prevent this sort of > error. Keep it simple. ---1 on the proposal (*grin*). It *was* a surprise. Of the languages I've used in my life (BASIC, C, C++, Java, Javascript, Perl, PHP, and Python), only two would treat prefix ++ as double unary plus (and I try to forget BASIC as best I can :) ). I remember when I first picked up Python, I wrote "i++" once (I think many beginning Python programmers do), and I was grateful that a syntax error popped up (rather than silently doing nothing) and I never did it again... So, now, a few years later I was reviewing code that had "++i" in it (from a new Python developer), and did a double-take on the code and had a moment of surprise that it had even run at all. As a devil's advocate: any code that requires double-unary plus is probably either abusing operator overloading or is abusing the eval keyword. It seems that adding a restriction to the grammar would probably be more helpful than harmful (the workaround for the alien case, if there is one, of needing double-unary plus would be to use parens: "+(+x)"). In any case, I understand that dynamic languages are going to allow for side effects to occur anywhere, so it's tough to remove it. I'm actually only +0 on it as it is... Just a "nice" feature I thought I'd throw out there....
:) Jared From guido at python.org Sat Mar 28 22:24:11 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 28 Mar 2009 16:24:11 -0500 Subject: [Python-ideas] suggestion for try/except program flow In-Reply-To: References: Message-ID: On Sat, Mar 28, 2009 at 9:54 AM, Mark Donald wrote: > Unless something has changed in Python 3+, I believe Nick's idiom > requires the generic handler code to be copied into a second except > clause to achieve identical behaviour, as follows... > > except BaseException as e: >   if isinstance(e, Cheese): >     # handle cheese >   # handle all manner of stuff, including cheese > except: >   # handle all manner of OTHER stuff in the same way In 3.0, the second except clause is unreachable because all exceptions inherit from BaseException. > ...which makes the nested try block more semantic. This can be tested > by putting > raise "this is deprecated..." into the try block, which > generates an error that is NOT caught by "except BaseException" > however IS caught by "except". > > I realise that "recatch" may seem like a trivial suggestion, but I > believe it would be a slight improvement, which means it's not > trivial. Try/except statements are supposedly preferential to > "untried" if statements that handle isolated cases, but if statements > are a lot easier to imagine, so people use them all them time - style > be damned. Each slight improvement in try/except is liable to make it > more attractive to coders, the end result being better code (for > example, the implementation of PEP 341 increased my team's use of > try/except significantly, causing a huge improvement to code > readability). > > Imagine if you could do this: > >   runny = True >   runnier_than_you_like_it = False >   raise Cheese > except Cheese, e >   # handle cheese >   if runny: recatch as Camembert > except Camembert, e >   # handle camembert >   if runnier_than_you_like_it: >     recatch >   else: >
uncatch # ie, else clause will be effective... > except: >   # not much of a cheese shop, really is it? > else: >   # negotiate vending of cheesy comestibles > finally: >   # sally forth > > And, I'm not trying to be belligerent here, but before somebody says... > > Cheese(Exception) > Camembert(Cheese) > > ...just please have a little think about it. > > Mark > > 2009/3/28 Guido van Rossum : >> On Sat, Mar 28, 2009 at 5:06 AM, Mark Donald wrote: >>> In this situation, which happens to me fairly frequently... >>> >>> try: >>>   try: >>>     raise Cheese >>>   except Cheese, e: >>>     # handle cheese >>>     raise >>> except: >>>   # handle all manner of stuff, including cheese >>> >>> ...it would be nice (& more readable) if one were able to recatch a >>> named exception with the generic (catch-all) except clause of its own >>> try, something like this: >>> >>> try: >>>   raise Cheese >>> except Cheese, e: >>>   # handle cheese >>>   recatch >>> except: >>>   # handle all manner of stuff, including cheese >> >> I'm not sure recatch is all that more reasonable -- it's another >> fairly obscure control flow verb. I think the current situation isn't >> so bad. Nick already pointed out an idiom for doing this without two >> try clauses: >> >> except BaseException as e: >>   if isinstance(e, Cheese): >>     # handle cheese >>   # handle all manner of stuff, including cheese >> >> -- >> --Guido van Rossum (home page: http://www.python.org/~guido/) >> > -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sat Mar 28 23:03:50 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 28 Mar 2009 17:03:50 -0500 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <49CE408F.1030005@gmx.net> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <49CE408F.1030005@gmx.net> Message-ID: On Sat, Mar 28, 2009 at 10:21 AM, Mathias Panzenböck wrote: > Péter Szabó
wrote: >> If Python had method decorators @final (meaning: it is an error to >> override this method in any subclass) and @override (meaning: it is an >> error not having this method in a superclass), I would use them in my >> projects (some of them approaching 20 000 lines of Python code) and >> I'll feel more confident writing object-oriented Python code. Java >> already has similar decorators or specifiers. Do you think it is a >> good idea to have these in Python? >> >> I've created a proof-of-concept implementation, which uses >> metaclasses, and it works in Python 2.4 an Python 2.5. See >> http://www.math.bme.hu/~pts/pobjects.py and >> http://www.math.bme.hu/~pts/pobjects_example.py >> >> Best regards, >> >> Péter > > +1 on the idea. > however, using a metaclass would be to limiting imho. can you implement it > in a different way? a lot of things people use metaclasses for work > perfectly fine without them (instead use a superclass that overrides __new__ > or similar). While it could be done by overriding __new__ in a superclass I'm not sure how that would make it easier to use, and it would make it harder to implement efficiently: this is a check that you would like to happen once at class definition time rather than on each instance creation. Of course you could do some caching to do it at the first instantiation only, but that still sounds clumsy; the metaclass is the obvious place to put this, and gives better error messages (at import instead of first use). But I don't think this idea is ripe for making it into a set of builtins yet, at least, I would prefer if someone coded this up as a 3rd party package and got feedback from a community of early adopters first. Or maybe one of the existing frameworks would be interested in adding this? While this may not be everyone's cup of tea (e.g. Scott David Daniels' reply), some frameworks cater to users who do like to be told when they're making this kind of errors.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From castironpi-ng at comcast.net Sat Mar 28 23:04:28 2009 From: castironpi-ng at comcast.net (castironpi-ng at comcast.net) Date: Sat, 28 Mar 2009 22:04:28 +0000 (UTC) Subject: [Python-ideas] python-like garbage collector & workaround In-Reply-To: <1398015503.825821238277798169.JavaMail.root@sz0050a.emeryville.ca.mail.comcast.net> Message-ID: <1652938876.826141238277868188.JavaMail.root@sz0050a.emeryville.ca.mail.comcast.net> I am writing a garbage collector that is similar to Python's. I want to know what you think, what problems I may encounter, and what kind of value I'm looking at. For my review of literature, I have read excerpts from, and stepped through, Python's GC. I'm picturing it as a specialized breadth-first search. I am concerned by the inability to call user-defined finalization methods. I'm considering a workaround that performs GC in two steps. First, it requests the objects to drop their references that participate in the cycle. Then, it enqueues the decref'ed object for an unnested destruction. Here is a proof-of-concept implementation. http://groups.google.com/group/comp.lang.python/browse_thread/thread/d3bb410cc6dcae54/f4b282e545335c30 From tleeuwenburg at gmail.com Sat Mar 28 23:29:30 2009 From: tleeuwenburg at gmail.com (Tennessee Leeuwenburg) Date: Sun, 29 Mar 2009 09:29:30 +1100 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> Message-ID: <43c8685c0903281529qe2c98abn1f22aaa0d92ae086@mail.gmail.com> Just thinking... this sounds rather like trying to bolt interfaces into Python. In the 'consenting adults' view, shouldn't you be able to override a method that you inherit if you would like to? 
I can well imagine some well-meaning library author protecting some method with @final, then me spending hours cursing under my breath because I am unable to tweak the functionality in some new direction. If I understand what you are suggesting correctly, then I'm -1 on the idea. I would suggest that a good docstring could do the job just as well -- "Don't override this method in subclasses!". Do you have any use cases to highlight the problem you are trying to fix with this suggestion? Cheers, -T 2009/3/29 Péter Szabó > Hi, > > If Python had method decorators @final (meaning: it is an error to > override this method in any subclass) and @override (meaning: it is an > error not having this method in a superclass), I would use them in my > projects (some of them approaching 20 000 lines of Python code) and > I'll feel more confident writing object-oriented Python code. Java > already has similar decorators or specifiers. Do you think it is a > good idea to have these in Python? > > I've created a proof-of-concept implementation, which uses > metaclasses, and it works in Python 2.4 an Python 2.5. See > http://www.math.bme.hu/~pts/pobjects.py and > http://www.math.bme.hu/~pts/pobjects_example.py > > Best regards, > > Péter > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- -------------------------------------------------- Tennessee Leeuwenburg http://myownhat.blogspot.com/ "Don't believe everything you think" -------------- next part -------------- An HTML attachment was scrubbed...
URL: From greg.ewing at canterbury.ac.nz Sat Mar 28 23:32:27 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Mar 2009 10:32:27 +1200 Subject: [Python-ideas] python-like garbage collector & workaround In-Reply-To: <1652938876.826141238277868188.JavaMail.root@sz0050a.emeryville.ca.mail.comcast.net> References: <1652938876.826141238277868188.JavaMail.root@sz0050a.emeryville.ca.mail.comcast.net> Message-ID: <49CEA57B.5070004@canterbury.ac.nz> castironpi-ng at comcast.net wrote: > I'm considering a workaround that performs GC in two steps. First, it > requests the objects to drop their references that participate in the > cycle. Then, it enqueues the decref'ed object for an unnested > destruction. I don't see how that solves anything. The problem is that the destructors might depend on other objects in the cycle that have already been deallocated. Deferring the calling of the destructors doesn't help with that. The only thing that will help is decoupling the destructor from the object being destroyed. You can do that now by storing a weak reference to the object with the destructor as a callback. But the destructor needs to be designed so that it can work without holding any reference to the object being destroyed, since it will no longer exist by the time the destructor is called. -- Greg From ben+python at benfinney.id.au Sat Mar 28 23:41:44 2009 From: ben+python at benfinney.id.au (Ben Finney) Date: Sun, 29 Mar 2009 09:41:44 +1100 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> Message-ID: <874oxds21j.fsf@benfinney.id.au> Péter Szabó writes: > If Python had method decorators @final (meaning: it is an error to > override this method in any subclass) What use case is there for this?
It would have to be quite strong to override the Python philosophy that "we're all consenting adults here", and that the programmer of the subclass is the one who knows best whether a method needs overriding. > and @override (meaning: it is an error not having this method in a > superclass) I'm not sure I understand this one, but if I'm right this is supported now with: class FooABC(object): def frobnicate(self): raise NotImplementedError("Must be implemented in derived class") Or perhaps: class FooABC(object): def __init__(self): if self.frobnicate is NotImplemented: raise ValueError("Must override 'frobnicate' in derived class") frobnicate = NotImplemented But, again, what is the use case? Is it strong enough to take away the ability of the derived class's implementor (who is, remember, a consenting adult) to take what they want from a class and leave the rest? -- \ "We can't depend for the long run on distinguishing one | `\ bitstream from another in order to figure out which rules | _o__) apply."
--Eben Moglen, _Anarchism Triumphant_, 1999 | Ben Finney From greg.ewing at canterbury.ac.nz Sun Mar 29 00:55:15 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Mar 2009 11:55:15 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CE29BF.3040502@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> Message-ID: <49CEB8E3.20407@canterbury.ac.nz> Jacob Holm wrote: > I currently don't think that a special case for GeneratorExit is > needed. Can you give me an example showing that it is? Someone said something that made me think it was needed, but I think you're right, it shouldn't be there.
-- Greg From ncoghlan at gmail.com Sun Mar 29 00:55:10 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Mar 2009 09:55:10 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CE29BF.3040502@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C2EBAA.9020106@improva.dk> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> Message-ID: <49CEB8DE.8060200@gmail.com> > However, with the current expansion they are different. Only the > version not using "yield from" will print "CLOSED" in this case: > > g = outer() > g.next() # prints 1 > g.close() # should print "CLOSED", but doesn't because the > GeneratorExit is reraised by yield-from > > > I currently don't think that a special case for GeneratorExit is > needed. Can you give me an example showing that it is? Take your example, replace the "print val" with a "yield val" and you get a broken generator that will yield again when close() is called. 
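A minimal sketch of that breakage, independent of the yield-from machinery (plain generators are already guarded against it: yielding while a GeneratorExit is pending turns close() into a RuntimeError):

```python
def broken():
    try:
        yield 1
    except GeneratorExit:
        # Yielding here, instead of returning or reraising,
        # is exactly the "broken generator" case.
        yield 'still here'

g = broken()
next(g)  # -> 1
try:
    g.close()
except RuntimeError as e:
    print('close() failed:', e)
```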
Generators that catch and do anything with GeneratorExit other than turn it into StopIteration are almost always going to be broken - the new expression needs to avoid making it easy to do that accidentally. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From g.brandl at gmx.net Sun Mar 29 01:03:15 2009 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 28 Mar 2009 19:03:15 -0500 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <874oxds21j.fsf@benfinney.id.au> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <874oxds21j.fsf@benfinney.id.au> Message-ID: Ben Finney schrieb: > P?ter Szab? writes: > >> If Python had method decorators @final (meaning: it is an error to >> override this method in any subclass) > > What use case is there for this? It would have to be quite strong to > override the Python philosophy that "we're all consenting adults > here", and that the programmer of the subclass is the one who knows > best whether a method needs overriding. I agree. This goes in the same direction as suggesting private attributes. Georg -- Thus spake the Lord: Thou shalt indent with four spaces. No more, no less. Four shall be the number of spaces thou shalt indent, and the number of thy indenting shall be four. Eight shalt thou not indent, nor either indent thou two, excepting that thou then proceed to four. Tabs are right out.
From greg.ewing at canterbury.ac.nz Sun Mar 29 01:12:05 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Mar 2009 12:12:05 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CEB8DE.8060200@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C316E9.1090103@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> Message-ID: <49CEBCD5.7020107@canterbury.ac.nz> Nick Coghlan wrote: > Generators that catch and do anything with GeneratorExit other than turn > it into StopIteration are almost always going to be broken - the new > expression needs to avoid making it easy to do that accidentally. However, as this example shows, the suggested solution of reraising GeneratorExit is not viable because it violates the inlining principle. The basic problem is that there's no way of telling the difference between a StopIteration that means "it's okay, I've finalized myself" and "I really mean to return normally here". 
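The ambiguity Greg describes is easy to demonstrate with throw() on plain generators (a Python 3 sketch; the names are illustrative): both generators below surface an identical StopIteration to their caller.

```python
def self_finalizing():
    try:
        yield 1
    except GeneratorExit:
        return        # "it's okay, I've finalized myself"

def finished_normally():
    yield 1
    return            # "I really mean to return normally here"

g = self_finalizing()
next(g)
try:
    g.throw(GeneratorExit)
except StopIteration:
    # Indistinguishable from the StopIteration raised when
    # finished_normally() simply runs off the end.
    print("subiterator looks as if it returned normally")
```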
-- Greg From ncoghlan at gmail.com Sun Mar 29 01:24:01 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Mar 2009 10:24:01 +1000 Subject: [Python-ideas] suggestion for try/except program flow In-Reply-To: References: Message-ID: <49CEBFA1.2050109@gmail.com> Mark Donald wrote: > Unless something has changed in Python 3+, I believe Nick's idiom > requires the generic handler code to be copied into a second except > clause to achieve identical behaviour, as follows... Guido already said this, but yes, something did change in 3.0: unlike the 2.x series, the raise statement in 3.x only accepts instances of BaseException, so having both an "except BaseException:" clause and a bare "except:" clause becomes redundant. Running 2.x code with the -3 flag to enable Py3k deprecation warnings actually points this out whenever a non-instance of BaseException is raised. 'Normal' exceptions are encouraged to inherit from Exception, with only 'terminal' exceptions (currently only SystemExit, GeneratorExit, KeyboardInterrupt) outside that hierarchy. I agree that in 2.x, this means that if you want to handle non-Exception exceptions along with well-behaved exceptions, you need to use sys.exc_info() to adapt my previous example:

except:
    _et, e, _tb = sys.exc_info()
    if isinstance(e, Cheese):
        # handle cheese
    # handle all manner of stuff, including cheese

Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Sun Mar 29 01:40:38 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Mar 2009 10:40:38 +1000 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <874oxds21j.fsf@benfinney.id.au> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <874oxds21j.fsf@benfinney.id.au> Message-ID: <49CEC386.4050204@gmail.com> Ben Finney wrote: > P?ter Szab?
writes: > >> If Python had method decorators @final (meaning: it is an error to >> override this method in any subclass) > > What use case is there for this? It would have to be quite strong to > override the Python philosophy that "we're all consenting adults > here", and that the programmer of the subclass is the one who knows > best whether a method needs overriding. Agreed - the base class author has no right to tell subclass authors that they *can't* do something. They can give hints that something shouldn't be messed with by using a leading underscore and leaving it undocumented (or advising against overriding it in the documentation). That said, if a @suggest_final decorator was paired with an @override_final decorator, I could actually see the point: one thing that can happen with undocumented private methods and attributes in a large class hierarchy is a subclass *accidentally* overriding them, which can then lead to bugs which are tricky to track down (avoiding such conflicts is actually one of the legitimate use cases for name mangling). A suggest_final/override_final decorator pair would flag accidental naming conflicts in complicated hierarchies at class definition time, while still granting the subclass author the ability to replace the nominally 'final' methods if they found it necessary. >> and @override (meaning: it is an error not having this method in a >> superclass) > > I'm not sure I understand this one, but if I'm right this is supported > now with: >
> class FooABC(object):
>     def frobnicate(self):
>         raise NotImplementedError("Must be implemented in derived class")

Even better:

>>> import abc
>>> class FooABC():  # use metaclass keyword arg in Py3k
...     __metaclass__ = abc.ABCMeta
...     @abc.abstractmethod
...     def must_override(self):
...         raise NotImplementedError
...
>>> x = FooABC()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class FooABC with abstract methods must_override
>>> class Fail(FooABC): pass
...
>>> x = Fail()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class Fail with abstract methods must_override
>>> class Succeed(FooABC):
...     def must_override(self):
...         print "Overridden!"
...
>>> x = Succeed()
>>> x.must_override()
Overridden!

Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From greg.ewing at canterbury.ac.nz Sun Mar 29 01:47:01 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 29 Mar 2009 12:47:01 +1200 Subject: [Python-ideas] Yield-From: GeneratorReturn exception In-Reply-To: <49CEBCD5.7020107@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> Message-ID: <49CEC505.4060906@canterbury.ac.nz> While attempting to update the PEP to incorporate a GeneratorReturn exception, I've
thought of a potential difficulty in making the exception type depend on whether the return statement had a value. Currently the StopIteration exception is created after the return statement has unwound the stack frame, by which time we've lost track of whether it had an expression. -- Greg From ptspts at gmail.com Sun Mar 29 01:48:00 2009 From: ptspts at gmail.com (=?ISO-8859-1?Q?P=E9ter_Szab=F3?=) Date: Sun, 29 Mar 2009 01:48:00 +0100 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <874oxds21j.fsf@benfinney.id.au> Message-ID: <4fa38b910903281748u6efd745ai22c55fc97187e034@mail.gmail.com> Hi, Thanks for pointing out that a @final decorator in the superclass can be an obstacle for code reuse if we assume that the author of the subclass has no (or has only limited) control over the code of the superclass. I'll come up with features with which the subclass can bypass some or all decorators imposed in the superclass. (One easy way to do this right now is saying ``__metaclass__ = type'' in the subclass.) I agree that the programmer of the subclass is the one who knows best whether a method needs overriding. We have to give the programmer the power to enforce his will. But I think the errors coming from the decorators are very useful for notifying the programmer of the subclass that he is trying to do something unexpected -- then he should make his decision to reconsider or enforce (e.g. @override_final as suggested by Nick Coghlan). By raising an error we inform the programmer that there is a decision he has to make. I think using a metaclass for implementing the checks based on the decorator is more appropriate than just overriding __new__ -- because we want to hook class creation (for which a metaclass is the adequate choice), not instance creation (for which overriding __new__ is the right choice).
I definitely don't want any check at instance creation time, not even once. If I managed to create the class, it should be possible to create instances from it without decorator checks. By the way, as far as I can imagine, using __new__ instead of the metaclass wouldn't make the implementations I can come up with simpler or shorter. A nice English docstring saying ``please don't override this method'' wouldn't make me happy. In my use case a few programmers including me are co-developing a fairly complex system in Python. There are tons of classes, tons of methods, each of them with docstrings. When I add some methods, I sometimes assume @final or @override, and I'm sure the system would break or function incorrectly if somebody added a subclass or changed a superclass ignoring my assumptions. Let's suppose this happens, but we don't notice it early enough; it becomes obvious only days or weeks later that the system cannot work this way, and the original reason of the problem was that somebody ignored a @final or @override assumption, because he didn't pay close attention to the thousands of docstrings. So we waste hours or days fixing the system. How can we avoid this problem in the future?

Option A. Rely on writing and reading docstrings, everybody always correctly.
Option B. Get an exception if a @final or @override assumption is violated.

Option B is acceptable for me, Option A is not, because with option A there is no guarantee that the overlooking won't happen again. With Option B the programmer gets notified early, and he can reconsider his code or refactor my code early, much faster than fixing it weeks later.
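A check of this kind can be hooked into class creation with a metaclass, as described above. The following is a minimal Python 3 sketch, not an implementation from this thread; `final` and `EnforceFinal` are names invented for the example:

```python
def final(method):
    # Mark the method; the metaclass below looks for this attribute.
    method.__final__ = True
    return method

class EnforceFinal(type):
    def __new__(mcls, name, bases, ns):
        # Walk every ancestor of every base, looking for marked methods
        # that the new class namespace tries to redefine.
        for base in bases:
            for klass in base.__mro__:
                for attr, value in vars(klass).items():
                    if getattr(value, "__final__", False) and attr in ns:
                        raise TypeError(
                            "%s overrides final method %s.%s"
                            % (name, klass.__name__, attr))
        return super().__new__(mcls, name, bases, ns)

class Base(metaclass=EnforceFinal):
    @final
    def frobnicate(self):
        return 42

try:
    class Careless(Base):
        def frobnicate(self):          # flagged at class definition time
            return 0
except TypeError as e:
    print(e)                           # Careless overrides final method Base.frobnicate
```

The check runs exactly once, when the subclass is defined, so instance creation stays untouched; an @override_final escape hatch could simply clear the marker before the class body is executed.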
Best regards, Péter From ncoghlan at gmail.com Sun Mar 29 01:56:44 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Mar 2009 10:56:44 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CEBCD5.7020107@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> Message-ID: <49CEC74C.4030508@gmail.com> Greg Ewing wrote: > Nick Coghlan wrote: > >> Generators that catch and do anything with GeneratorExit other than turn >> it into StopIteration are almost always going to be broken - the new >> expression needs to avoid making it easy to do that accidentally. > > However, as this example shows, the suggested solution > of reraising GeneratorExit is not viable because it > violates the inlining principle. > > The basic problem is that there's no way of telling the > difference between a StopIteration that means "it's okay, > I've finalized myself" and "I really mean to return > normally here".
Well, there is a way to tell the difference - if we just threw GeneratorExit in, then it finalised itself, otherwise it is finishing normally. The only question is what to do in the outer scope in the first case.

1. Accept the StopIteration as a normal termination of the subiterator and continue execution of the delegating generator instead of finalising it. This is very bad as it will lead to any generator that yields again after a yield from expression almost certainly being broken [1].
2. Reraise the original GeneratorExit.
3. Reraise the subiterator's StopIteration exception.
4. Return immediately from the delegating generator.

I actually quite like option 4, as I believe it best reflects what the subiterator has done by trapping GeneratorExit and turning it into "normal" termination of the subiterator, without creating a situation where generators that use yield from are likely to accidentally ignore GeneratorExit. Cheers, Nick. [1] By "broken" in this context, I mean "close() will raise RuntimeError", as would occur if Jacob's example used "yield val" instead of "print val", or as occurs in the following normal generator:

>>> def gen():
...     try:
...         yield
...     except GeneratorExit:
...         pass
...     yield
...
>>> g = gen()
>>> g.next()
>>> g.close()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: generator ignored GeneratorExit

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Sun Mar 29 03:00:24 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Mar 2009 11:00:24 +1000 Subject: [Python-ideas] Yield-From: GeneratorReturn exception In-Reply-To: <49CEC505.4060906@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CEC505.4060906@canterbury.ac.nz> Message-ID: <49CEC828.9060608@gmail.com> Greg Ewing wrote: > While attempting to update the PEP to incorporate a > GeneratorReturn exception, I've thought of a potential > difficulty in making the exception type depend on > whether the return statement had a value. > > Currently the StopIteration exception is created after > the return statement has unwound the stack frame, by > which time we've lost track of whether it had an > expression.
Does it become easier if "return None" raises StopIteration instead of raising GeneratorReturn(None)? I think I'd prefer that to having to perform major surgery on the eval loop to make it do something else... (Guido may have other ideas, obviously). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From ncoghlan at gmail.com Sun Mar 29 03:09:44 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Mar 2009 11:09:44 +1000 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <4fa38b910903281748u6efd745ai22c55fc97187e034@mail.gmail.com> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <874oxds21j.fsf@benfinney.id.au> <4fa38b910903281748u6efd745ai22c55fc97187e034@mail.gmail.com> Message-ID: <49CECA58.9030204@gmail.com> Péter Szabó wrote: > (One easy way > to do this right now is saying ``__metaclass__ = type'' in the > subclass.) Actually, that doesn't work as you might think...

>>> class TryNotToBeAnABC(FooABC):
...     __metaclass__ = type
...
>>> type(TryNotToBeAnABC)
<class 'abc.ABCMeta'>

The value assigned to '__metaclass__' (or the metaclass keyword argument in Py3k) is only one candidate metaclass that the metaclass determination algorithm considers - the metaclasses of all base classes are also candidates, and the algorithm picks the one which is a subclass of all of the candidate classes. If none of the candidates meets that criterion, then it complains loudly:

>>> class OtherMeta(type): pass
...
>>> class TryNotToBeAnABC(FooABC):
...     __metaclass__ = OtherMeta
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Error when calling the metaclass bases
    metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

And remember, as far as @overrides goes, I believe @abc.abstractmethod already does what you want - it's only the @suggest_final/@override_final part of the idea that doesn't exist. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From guido at python.org Sun Mar 29 04:50:16 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 28 Mar 2009 21:50:16 -0500 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CD680D.2020502@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49CB1A19.2000305@canterbury.ac.nz> <49CB217C.6000603@gmail.com> <49CB3EAA.7090909@canterbury.ac.nz> <49CB5B25.4070105@gmail.com> <49CC5D94.7000609@canterbury.ac.nz> <49CCAFB9.1090800@gmail.com> <49CCB3A9.40300@canterbury.ac.nz> <49CCC3C3.4050100@gmail.com> <49CD680D.2020502@canterbury.ac.nz> Message-ID: On Fri, Mar 27, 2009 at 6:58 PM, Greg Ewing wrote: > Nick Coghlan wrote: >> >> The part that >> isn't clicking for me is that I still don't understand *why* 'yield >> from' should include implicit finalisation as part of its definition. > >> >> It's the generalisation of that to all other iterators that happen to >> offer a close() method that seems somewhat arbitrary. > > It's a matter of opinion. I would find it surprising if > generators behaved differently from all other iterators > in this respect. It would be un-ducktypish. > > I think we need a BDFL opinion to settle this one. To be honest, I don't follow this in detail yet, but I believe I don't really care that much either way, and I'd like to recommend that you do whatever makes the specification (and hence hopefully the implementation) have the least special cases.
There are several Python Zen rules about this. :-) -- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sun Mar 29 04:52:44 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 28 Mar 2009 21:52:44 -0500 Subject: [Python-ideas] Yield-From: GeneratorReturn exception In-Reply-To: <49CEC828.9060608@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CEC505.4060906@canterbury.ac.nz> <49CEC828.9060608@gmail.com> Message-ID: On Sat, Mar 28, 2009 at 8:00 PM, Nick Coghlan wrote: > Greg Ewing wrote: >> While attempting to update the PEP to incorporate a >> GeneratorReturn exception, I've thought of a potential >> difficulty in making the exception type depend on >> whether the return statement had a value. >> >> Currently the StopIteration exception is created after >> the return statement has unwound the stack frame, by >> which time we've lost track of whether it had an >> expression. > > Does it become easier if "return None" raises StopIteration instead of > raising GeneratorReturn(None)? > > I think I'd prefer that to having to perform major surgery on the eval > loop to make it do something else... (Guido may have other ideas, > obviously). I think my first response on this (yesterday?) already mentioned that I didn't mind so much whether "return None" was treated more like "return" or more like "return <value>". So please do whatever can be implemented easily.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From guido at python.org Sun Mar 29 04:55:23 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 28 Mar 2009 21:55:23 -0500 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <49CEC386.4050204@gmail.com> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <874oxds21j.fsf@benfinney.id.au> <49CEC386.4050204@gmail.com> Message-ID: On Sat, Mar 28, 2009 at 7:40 PM, Nick Coghlan wrote: > Ben Finney wrote: >> Péter Szabó writes: >> >>> If Python had method decorators @final (meaning: it is an error to >>> override this method in any subclass) >> >> What use case is there for this? It would have to be quite strong to >> override the Python philosophy that "we're all consenting adults >> here", and that the programmer of the subclass is the one who knows >> best whether a method needs overriding. > > Agreed - the base class author has no right to tell subclass authors > that they *can't* do something. I'm sorry, but this is going too far. There are plenty of situations where, indeed, this ought to be only a hint, but I think it goes too far to say that a base class can never have the last word about something. Please note that I already suggested this be put in a 3rd party package -- I'm not about to make these builtins.
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From ncoghlan at gmail.com Sun Mar 29 05:14:42 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Mar 2009 13:14:42 +1000 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <874oxds21j.fsf@benfinney.id.au> <49CEC386.4050204@gmail.com> Message-ID: <49CEE7A2.8090604@gmail.com> Guido van Rossum wrote: > On Sat, Mar 28, 2009 at 7:40 PM, Nick Coghlan wrote: >> Agreed - the base class author has no right to tell subclass authors >> that they *can't* do something. > > I'm sorry, but this is going too far. There are plenty of situations > where, indeed, this ought to be only a hint, but I think it goes too > far to say that a base class can never have the last word about > something. Sorry, what I wrote was broader in scope than what I actually meant. I only intended to refer to otherwise arbitrary non-functional constraints like marking elements of the base as "private" or "final" without giving a subclass author a way to override them (after all, even name mangling can be reversed with sufficient motivation). A base class obviously needs to impose some real constraints on subclasses in practice, or it isn't going to be very useful (if nothing else, it needs to set down the details of the shared API). Cheers, Nick.
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From guido at python.org Sun Mar 29 05:18:38 2009 From: guido at python.org (Guido van Rossum) Date: Sat, 28 Mar 2009 22:18:38 -0500 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <49CEE7A2.8090604@gmail.com> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <874oxds21j.fsf@benfinney.id.au> <49CEC386.4050204@gmail.com> <49CEE7A2.8090604@gmail.com> Message-ID: On Sat, Mar 28, 2009 at 10:14 PM, Nick Coghlan wrote: > Guido van Rossum wrote: >> On Sat, Mar 28, 2009 at 7:40 PM, Nick Coghlan wrote: >>> Agreed - the base class author has no right to tell subclass authors >>> that they *can't* do something. >> >> I'm sorry, but this is going too far. There are plenty of situations >> where, indeed, this ought to be only a hint, but I think it goes too >> far to say that a base class can never have the last word about >> something. > > Sorry, what I wrote was broader in scope than what I actually meant. I > only intended to refer to otherwise arbitrary non-functional constraints > like marking elements of the base as "private" or "final" without giving > a subclass author a way to override them (after all, even name mangling > can be reversed with sufficient motivation). To paraphrase a cliché: "Having 'private' (or 'final') in a language doesn't cause unusable software. People using 'private' (or 'final') indiscriminately cause unusable software." :-) > A base class obviously needs to impose some real constraints on > subclasses in practice, or it isn't going to be very useful (if > nothing else, it needs to set down the details of the shared API).
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From foobarmus at gmail.com Sun Mar 29 07:06:29 2009 From: foobarmus at gmail.com (Mark Donald) Date: Sun, 29 Mar 2009 13:06:29 +0800 Subject: [Python-ideas] suggestion for try/except program flow In-Reply-To: <49CEBFA1.2050109@gmail.com> References: <49CEBFA1.2050109@gmail.com> Message-ID: > Guido already said this, but yes, something did change in 3.0: unlike > the 2.x series, the raise statement in 3.x only accepts instances of > BaseException, so having both an "except BaseException:" clause and a > bare "except:" clause becomes redundant. Ah, apologies... I need to update myself. 'uncatch' to subsequently execute the else clause is still going to be impossible, but I don't have a real-world need for that as yet. Cheers From ncoghlan at gmail.com Sun Mar 29 08:33:31 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 29 Mar 2009 16:33:31 +1000 Subject: [Python-ideas] suggestion for try/except program flow In-Reply-To: References: <49CEBFA1.2050109@gmail.com> Message-ID: <49CF163B.90409@gmail.com> Mark Donald wrote: >> Guido already said this, but yes, something did change in 3.0: unlike >> the 2.x series, the raise statement in 3.x only accepts instances of >> BaseException, so having both an "except BaseException:" clause and a >> bare "except:" clause becomes redundant. > > Ah, apologies... I need to update myself. > > 'uncatch' to subsequently execute the else clause is still going to be > impossible, but I don't have a real-world need for that as yet. If you really find yourself doing this kind of exception interrogation a lot, you may find it easier to do it in the __exit__ method of a context manager. Those are *always* invoked regardless of how the with statement ends and you can then do whatever flow control you like based on the type of the first argument (and whether or not it is None). Cheers, Nick. 
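Nick's __exit__ pattern can be sketched like this (Python 3; `Cushion` and the use of ZeroDivisionError are invented for the example, standing in for whatever exception family is being interrogated):

```python
class Cushion:
    """Context manager whose __exit__ inspects the exception, if any."""
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            return False                  # block finished normally
        if issubclass(exc_type, ZeroDivisionError):
            print("handled:", exc)
            return True                   # suppress; resume after the block
        return False                      # anything else propagates

with Cushion():
    1 / 0                                 # intercepted in __exit__
print("execution continues here")
```

Because __exit__ is always called, the flow-control decision is made in exactly one place, whether the block raised, returned, or fell off the end.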
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From Scott.Daniels at Acm.Org Sun Mar 29 09:16:54 2009 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sun, 29 Mar 2009 00:16:54 -0700 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> Message-ID: Scott David Daniels wrote: > Péter Szabó wrote: >> If Python had method decorators @final (meaning: it is an error to >> override this method in any subclass) and @override (meaning: it is an >> error not having this method in a superclass), I would use them in my >> projects (some of them approaching 20 000 lines of Python code) and >> I'll feel more confident writing object-oriented Python code.... > > I have no idea why you want these, and severe trepidation about > dealing with code that uses them "just to be safe." It smacks of > the over-use I see of doubled underscores. For @override, just > because you've built a base class for one kind of object does not > mean I have not thought of an interesting way to use 40% of your > code to accomplish my own end. Why make me cut and paste? You > are not responsible for the correctness of my flea-brained idea > whether I inherit from your class or not. For @final, "how dare > you" for similar reasons. Java at least has an excuse (compilation > can proceed differently). I was asked off-group to give an example where use of @override prevents reusing some code. First, the above is an overstatement of my case, probably an attempt to "bully" you off that position. For that bullying, I apologize. Second, what follows below is one example of what @overrides prevents me from doing.
Say you've built a class named "MostlyAbstract" with comparisons:

class MostlyAbstract(object):
    @override
    def __hash__(self):
        pass
    @override
    def __lt__(self, other):
        pass
    @override
    def __eq__(self, other):
        pass
    def __le__(self, other):
        return self.__lt__(other) or self.__eq__(other)
    def __gt__(self, other):
        return other.__lt__(self)
    def __ge__(self, other):
        return self.__gt__(other) or self.__eq__(other)

and I decide the comparison should work a bit differently:

class MostAbstract(MostlyAbstract):
    def __gt__(self, other):
        return not self.__le__(other)

This choice of mine won't work, even when I'm trying to just do a slight change to your abstraction. Similarly, if I want to monkey-patch in a debugging print or two, I cannot do it without having to create a bunch of vacuous implementations. Also, a @final will prevent me from sneaking in an extra print when I'm bug-chasing. That being said, a mechanism like the following could be used as a facility to implement your two desires, by providing a nice simple place called as each class definition is completed:

class Type(type):
    '''A MetaClass to call __initclass__ for freshly defined classes.'''
    def __new__(class_, name, supers, methods):
        if '__initclass__' in methods and not isinstance(
                methods['__initclass__'], classmethod):
            method = methods['__initclass__']
            methods['__initclass__'] = classmethod(method)
        return type.__new__(class_, name, supers, methods)
    def __init__(self, name, supers, methods):
        type.__init__(self, name, supers, methods)
        if hasattr(self, '__initclass__'):
            self.__initclass__()

In 2.5, for example, you'd use it like:

class Foo(SomeParent):
    __metaclass__ = Type
    def __initclass__(self):
        ...

--Scott David Daniels Scott.Daniels at Acm.Org From jh at improva.dk Sun Mar 29 14:33:51 2009 From: jh at improva.dk (Jacob Holm) Date: Sun, 29 Mar 2009 14:33:51 +0200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CEBCD5.7020107@canterbury.ac.nz> References:
<49AB1F90.7070201@canterbury.ac.nz> <49C36306.4040002@improva.dk> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> Message-ID: <49CF6AAF.70109@improva.dk> Greg Ewing wrote: > Nick Coghlan wrote: > >> Generators that catch and do anything with GeneratorExit other than turn >> it into StopIteration are almost always going to be broken - the new >> expression needs to avoid making it easy to do that accidentally. > > However, as this example shows, the suggested solution > of reraising GeneratorExit is not viable because it > violates the inlining principle. > > The basic problem is that there's no way of telling the > difference between a StopIteration that means "it's okay, > I've finalized myself" and "I really mean to return > normally here". > Would it be possible to attach the current exception (if any) to the StopIteration/GeneratorReturn raised by a return statement in a finally clause? (Using the __traceback__ and __cause__ attributes from PEP-3134) Then the PEP expansion could check for and reraise the attached exception. Now that I think about it, this is almost required by the inlining/refactoring principle. 
Consider this example:

def inner():
    try:
        yield 1
        yield 2
        yield 3
    finally:
        return 'VALUE'

def outer():
    val = yield from inner()
    print val

Which I think should be equivalent to:

def outer():
    try:
        yield 1
        yield 2
        yield 3
    finally:
        val = 'VALUE'
    print val

The problem is that any exception thrown into inner is converted to a GeneratorReturn, which is then swallowed by the yield-from instead of being reraised.

- Jacob

From castironpi-ng at comcast.net Sun Mar 29 15:42:22 2009
From: castironpi-ng at comcast.net (castironpi-ng at comcast.net)
Date: Sun, 29 Mar 2009 13:42:22 +0000 (UTC)
Subject: [Python-ideas] python-like garbage collector & workaround
In-Reply-To: <49CEA57B.5070004@canterbury.ac.nz>
Message-ID: <318523839.906611238334142946.JavaMail.root@sz0050a.emeryville.ca.mail.comcast.net>

----- Original Message -----
From: "Greg Ewing" < greg.ewing at canterbury.ac.nz >
To: castironpi-ng at comcast.net
Cc: Python-ideas at python.org
Sent: Saturday, March 28, 2009 5:32:27 PM GMT -06:00 US/Canada Central
Subject: Re: [Python-ideas] python-like garbage collector & workaround

castironpi-ng at comcast.net wrote:
> I'm considering a workaround that performs GC in two steps. First, it
> requests the objects to drop their references that participate in the
> cycle. Then, it enqueues the decref'ed object for an unnested
> destruction.

I don't see how that solves anything. The problem is that the destructors might depend on other objects in the cycle that have already been deallocated. Deferring the calling of the destructors doesn't help with that.

The only thing that will help is decoupling the destructor from the object being destroyed. You can do that now by storing a weak reference to the object with the destructor as a callback. But the destructor needs to be designed so that it can work without holding any reference to the object being destroyed, since it will no longer exist by the time the destructor is called.
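For example, a minimal sketch of this pattern — the Node class, the log list and the callback here are invented for illustration, not anything from the collector itself — looks like:

```python
import gc
import weakref

class Node(object):
    """Two Nodes referencing each other form a reference cycle."""
    def __init__(self, name):
        self.name = name
        self.partner = None

log = []

def make_cycle():
    a, b = Node("A"), Node("B")
    a.partner, b.partner = b, a
    # The callback closes over only the data it needs (the 'log' list
    # and a string), never over 'a' itself, so it can run safely after
    # 'a' has already been deallocated.
    return weakref.ref(a, lambda wr: log.append("A finalized"))

ref = make_cycle()
gc.collect()            # the cycle is garbage; collecting it fires the callback
assert ref() is None
assert log == ["A finalized"]
```

Note the weakref object itself lives outside the cycle, which is what lets the collector invoke its callback when the cyclic garbage is reclaimed.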
-- Greg

====================================

Nice response time.

> Deferring the calling of the destructors doesn't help with that.

I beg to differ. There is a complex example in the test code at the address. Here is a simple one. 'A' has a reference to 'B' and 'B' has a reference to 'A'. They both need to call each other's methods during their respective finalizations.

1. Ref counts: A-1, B-1
2. Request A to drop ref. to B.
3. Ref counts: A-1, B-0.
4. Finalize & deallocate B.
5. ... B drops ref. to A
6. Ref counts: A-0
7. Finalize & deallocate A.

'A' performs its final call to 'B' in step 2, still having a reference to it. It empties the attribute of its own that refers to B. 'B's reference count goes to 0.

'B' performs its final call to 'A' in step 5, still having a reference to it. 'A' still has control of its fields, and can make remaining subordinate calls if necessary. 'B' releases its reference to 'A', and 'A's reference count goes to zero. 'B' is deallocated.

'A' performs its finalization, and should check its field to see if it still has the reference to B. If it did, it would perform the call in step 2. In this case, it doesn't, and it can keep a record of the fact that it already made that final call. 'A's finalizer exits without any calls to 'B', because the field that held its reference to 'B' is clear. 'A' is deallocated.

> But the destructor needs to be designed so
> that it can work without holding any reference to the
> object being destroyed

I want to give 'A' control of that. To accomplish this, I bring to A's attention the fact that it has left reachability, /and/ is in a cycle with B. It can perform its normal finalization at this time and maintain its consistency of state. I believe it solves the problem of failing to call the destructor at all, but I may have just shirked it. Will it work?

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jh at improva.dk Sun Mar 29 17:47:23 2009
From: jh at improva.dk (Jacob Holm)
Date: Sun, 29 Mar 2009 17:47:23 +0200
Subject: [Python-ideas] Yield-From: Finalization guarantees
In-Reply-To: <49CF6AAF.70109@improva.dk>
References: <49AB1F90.7070201@canterbury.ac.nz> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk>
Message-ID: <49CF980B.6030400@improva.dk>

Jacob Holm wrote:
>
> Would it be possible to attach the current exception (if any) to the
> StopIteration/GeneratorReturn raised by a return statement in a
> finally clause? (Using the __traceback__ and __cause__ attributes from
> PEP-3134) Then the PEP expansion could check for and reraise the
> attached exception.

Based on that idea, here is the 3.0-based expansion I propose:

_i = iter(EXPR)
try:
    _t = None
    _y = next(_i)
    while 1:
        try:
            _s = yield _y
        except BaseException as _e:
            _t = _e
            _m = getattr(_i, 'throw', None)
            if _m is None:
                raise
            _y = _m(_t)
        else:
            _t = None
            if _s is None:
                _y = next(_i)
            else:
                _y = _i.send(_s)
except StopIteration as _e:
    if _e is _t:
        # If _e is the exception that we have just thrown to the
        # subiterator, reraise it.
        if _m is None:
            # If there was no "throw" method, explicitly close the
            # iterator before reraising.
            _m = getattr(_i, 'close', None)
            if _m is not None:
                _m()
        raise
    if _e.__cause__ is not None:
        # If the return was from inside a finally clause with an active
        # exception, reraise that exception.
        raise _e.__cause__
    # Normal return
    RESULT = _e.value

I have moved the code around a bit to use fewer try blocks while preserving semantics, then removed the check for GeneratorExit and added a different check for __cause__. Even if the __cause__ idea is shot down, I think I prefer the way this expansion reads. It makes it easier to see at a glance what is part of the loop and what is part of the cleanup.

What do you think?

- Jacob

From rhamph at gmail.com Sun Mar 29 20:04:53 2009
From: rhamph at gmail.com (Adam Olsen)
Date: Sun, 29 Mar 2009 12:04:53 -0600
Subject: [Python-ideas] About adding a new iterator method called "shuffled"
In-Reply-To: <20090328154008.GB7421@panix.com>
References: <6a5569ec0903240848o7c642404q3311811567d4f4d0@mail.gmail.com> <200903252328.49177.steve@pearwood.info> <3f4107910903251441i2707a29dga584296ddd31e5e1@mail.gmail.com> <200903261058.59164.steve@pearwood.info> <87iqlwy3rf.fsf@xemacs.org> <87eiwjy7qc.fsf@xemacs.org> <20090328154008.GB7421@panix.com>
Message-ID:

On Sat, Mar 28, 2009 at 9:40 AM, Aahz wrote:
> On Fri, Mar 27, 2009, Adam Olsen wrote:
>>
>> The irony is that we only seed with 128 bits, so rather than 2**19937
>> combinations, there's just 2**128. That drops our "safe" list size
>> down to 34. Weee!
>
> That's probably worth a bug report or RFE if one doesn't already exist.

It seems sufficient to me. We don't want to needlessly drain the system's entropy pool.

How about a counter proposal? We add an orange or red box in the random docs that explains a few things together:

* What a cryptographically secure RNG is, that ours isn't one, and that ours is unacceptable any time money or security is involved.
* Specifically, 624 "iterates" allows you to predict the full state, and thus all future (and past?) output * The limitations of our default seed, and how it isn't a practical problem, overshadowed by the above two things * The limitations on shuffling a large list, how equidistance means it's not a practical problem, and is overshadowed by all of the above Some of that already exists, but is inline. IMO, security issues deserve a few flashing lights. The context of other problems also gives the proper light to shuffling's limitations. -- Adam Olsen, aka Rhamphoryncus From ptspts at gmail.com Sun Mar 29 20:37:38 2009 From: ptspts at gmail.com (=?ISO-8859-1?Q?P=E9ter_Szab=F3?=) Date: Sun, 29 Mar 2009 20:37:38 +0200 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> Message-ID: <4fa38b910903291137p656569eel532e6ec930bf5c9d@mail.gmail.com> > class MostlyAbstract(object): > @override > def __hash__(self, other): > pass > @override > def __lt__(self, other): > pass > @override > def __eq__(self, other): > pass > def __le__(self, other): > return self.__lt__(other) or self.__eq__(other) > def __gt__(self, other): > return other.__lt__(self) > def __ge__(self, other): > return self.__gt__(other) or self.__eq__(other) > > and I decide then comparison should works a bit differently: > > class MostAbstract(MostlyAbstract): > def __gt__(self, other): > return not self.__le__(self) > > > This choice of mine won't work, even when I'm trying to just do a > slight change to your abstraction. I think we have a different understanding what @override means. I define @override like this: ``class B(A): @override def F(self): pass'' is OK only if A.F is defined, i.e. there is a method F to override. What I understand about your mails is that your definition is: if there is @override on A.F, then any subclass of A must override A.F. 
Do I get the situation of the different understanding right? If so, do you find anything in my definition which prevents code reuse? (I don't.)

> def __init__(self, name, supers, methods):
>     type.__init__(self, name, supers, methods)
>     if hasattr(self, '__initclass__'):
>         self.__initclass__()

Thanks for the idea, this sounds generic enough for various uses, and it gives power to the author of the subclass. I'll see if my decorators can be implemented using __initclass__.

From Scott.Daniels at Acm.Org Sun Mar 29 21:26:16 2009
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Sun, 29 Mar 2009 12:26:16 -0700
Subject: [Python-ideas] method decorators @final and @override in Python 2.4
In-Reply-To: <4fa38b910903291137p656569eel532e6ec930bf5c9d@mail.gmail.com>
References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <4fa38b910903291137p656569eel532e6ec930bf5c9d@mail.gmail.com>
Message-ID:

Péter Szabó wrote:
...
> I think we have a different understanding what @override means. I
> define @override like this: ``class B(A): @override def F(self):
> pass'' is OK only if A.F is defined, i.e. there is a method F to
> override. What I understand about your mails is that your definition
> is: if there is @override on A.F, then any subclass of A must override
> A.F. Do I get the situation of the different understanding right? If
> so, do you find anything in my definition which prevents code reuse?
> (I don't.)

Nor do I. I completely misunderstood what you meant by override, and I agree that what you are specifying there _is_ a help to those writing code (I'd document it as a way of marking an intentional override).

As to @final, I'd prefer a warning to an error when I override a final method. Overriding is a rich way of debugging, and if the point is to catch coding "misteaks", ignoring warnings is easier than changing package code when debugging.
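To make that concrete, here is a rough sketch of both checks — every name in it is invented for illustration (a real version would presumably hook into a metaclass like the Type class posted earlier), and @final is deliberately a warning rather than an error:

```python
import warnings

def override(method):
    """Mark a method as intentionally overriding a base-class method."""
    method._is_override = True
    return method

def final(method):
    """Mark a method that subclasses should not replace."""
    method._is_final = True
    return method

def check_class(cls):
    """Verify @override/@final markers; usable as a class decorator."""
    bases = cls.__mro__[1:]
    for name, attr in vars(cls).items():
        if getattr(attr, '_is_override', False):
            # Some base class must already define the name.
            if not any(name in vars(base) for base in bases):
                raise TypeError('%s.%s overrides nothing'
                                % (cls.__name__, name))
        for base in bases:
            if getattr(vars(base).get(name), '_is_final', False):
                # A warning, not an error, so debugging overrides stay possible.
                warnings.warn('%s.%s overrides a @final method'
                              % (cls.__name__, name))
    return cls

class A(object):
    @final
    def f(self):
        return 1
    def g(self):
        return 2

@check_class
class B(A):               # fine: A.g exists
    @override
    def g(self):
        return 3

try:
    @check_class
    class C(A):           # error: no base class defines h
        @override
        def h(self):
            return 4
except TypeError as exc:
    caught = exc

with warnings.catch_warnings(record=True) as caught_warnings:
    warnings.simplefilter('always')
    @check_class
    class D(A):           # warns: A.f is marked @final
        def f(self):
            return 5
```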
--Scott David Daniels
Scott.Daniels at Acm.Org

From janssen at parc.com Sun Mar 29 22:45:57 2009
From: janssen at parc.com (Bill Janssen)
Date: Sun, 29 Mar 2009 13:45:57 PDT
Subject: [Python-ideas] method decorators @final and @override in Python 2.4
In-Reply-To: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com>
References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com>
Message-ID: <47913.1238359557@parc.com>

Péter Szabó wrote:
> If Python had method decorators @final (meaning: it is an error to
> override this method in any subclass) and @override (meaning: it is an
> error not having this method in a superclass), I would use them in my
> projects (some of them approaching 20 000 lines of Python code) and
> I'll feel more confident writing object-oriented Python code. Java
> already has similar decorators or specifiers. Do you think it is a
> good idea to have these in Python?

No on @final (I've had more trouble with ill-considered Java "final" classes than I can believe), but @override sounds interesting. I can see the point of that. Should do the check at compile time, right?

Bill

From steve at pearwood.info Sun Mar 29 23:30:43 2009
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 30 Mar 2009 08:30:43 +1100
Subject: [Python-ideas] method decorators @final and @override in Python 2.4
In-Reply-To:
References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <4fa38b910903291137p656569eel532e6ec930bf5c9d@mail.gmail.com>
Message-ID: <200903300830.44820.steve@pearwood.info>

On Mon, 30 Mar 2009 06:26:16 am Scott David Daniels wrote:
> Péter Szabó wrote:
> ...
>
> > I think we have a different understanding what @override means. I
> > define @override like this: ``class B(A): @override def F(self):
> > pass'' is OK only if A.F is defined, i.e. there is a method F to
> > override. What I understand about your mails is that your
> > definition is: if there is @override on A.F, then any subclass of A
> > must override A.F.
Do I get the situation of the different > > understanding right? If so, do you find anything in my definition > > which prevents code reuse? (I don't.) > > Nor do I. I completely misunderstood what you meant by override, and > I agree that what you are specifying there _is_ a help to those > writing code (I'd document it as a way of marking an intentional > override). Perhaps I just haven't worked on enough 20,000 line projects, but I don't get the point of @override. It doesn't prevent somebody from writing (deliberately or accidentally) B.F in the absence of A.F, since the coder can simply leave off the @override. If @override is just a way of catching spelling mistakes, perhaps it would be better in pylint or pychecker. What have I missed? -- Steven D'Aprano From ncoghlan at gmail.com Sun Mar 29 23:46:15 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 30 Mar 2009 07:46:15 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CF6AAF.70109@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> Message-ID: 
<49CFEC27.1050005@gmail.com>

Jacob Holm wrote:
> The problem is that any exception thrown into inner is converted to a
> GeneratorReturn, which is then swallowed by the yield-from instead of
> being reraised.

That actually only happens if inner *catches and suppresses* the thrown in exception. Otherwise throw() will reraise the original exception automatically:

>>> def gen():
...     try:
...         yield
...     except:
...         print "Suppressed"
...
>>> g = gen()
>>> g.next()
>>> g.throw(AssertionError)
Suppressed
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

>>> def gen():
...     try:
...         yield
...     finally:
...         print "Not suppressed"
...
>>> g = gen()
>>> g.next()
>>> g.throw(AssertionError)
Not suppressed
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in gen
AssertionError

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia ---------------------------------------------------------------

From Scott.Daniels at Acm.Org Mon Mar 30 00:03:03 2009
From: Scott.Daniels at Acm.Org (Scott David Daniels)
Date: Sun, 29 Mar 2009 15:03:03 -0700
Subject: [Python-ideas] method decorators @final and @override in Python 2.4
In-Reply-To: <200903300830.44820.steve@pearwood.info>
References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <4fa38b910903291137p656569eel532e6ec930bf5c9d@mail.gmail.com> <200903300830.44820.steve@pearwood.info>
Message-ID:

Steven D'Aprano wrote:
> ... Perhaps I just haven't worked on enough 20,000 line projects, but I
> don't get the point of @override. It doesn't prevent somebody from
> writing (deliberately or accidentally) B.F in the absence of A.F, since
> the coder can simply leave off the @override.

? B.F vs. A.F? Could you expand this a trifle?

> If @override is just a way of catching spelling mistakes, perhaps it
> would be better in pylint or pychecker. What have I missed?
If, for example, you have a huge testing framework, and some developers are given the task of developing elements from the framework by (say) overriding the test_sources and test_outcome methods, they can be handed an example module with @override demonstrating where to make the changes.

class TestMondoDrive(DriveTestBase):

    @override
    def test_sources(self):
        return os.listdir('/standard/mondo/tests')

    @override
    def test_outcome(self, testname, outcome):
        if outcome != 'success':
            self.failures('At %s %s failed: %s' % (
                time.strftime('%Y.%m.%d %H:%M:%S'),
                testname, outcome))
        else:
            assert False, "I've no idea how to deal with success"

The resulting tests will be a bit easier to read, because you can easily distinguish between support methods and framework methods. Further, the entire warp drive test is not started if we stupidly spell the second "test_result" (as it was on the Enterprise tests).

--Scott David Daniels
Scott.Daniels at Acm.Org

From jh at improva.dk Mon Mar 30 00:21:19 2009
From: jh at improva.dk (Jacob Holm)
Date: Mon, 30 Mar 2009 00:21:19 +0200
Subject: [Python-ideas] Yield-From: Finalization guarantees
In-Reply-To: <49CFEC27.1050005@gmail.com>
References: <49AB1F90.7070201@canterbury.ac.nz> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk>
<49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49CFEC27.1050005@gmail.com> Message-ID: <49CFF45F.8090304@improva.dk> Nick Coghlan wrote: > Jacob Holm wrote: > >> The problem is that any exception thrown into inner is converted to a >> GeneratorReturn, which is then swallowed by the yield-from instead of >> being reraised. >> > > That actually only happens if inner *catches and suppresses* the thrown > in exception. Having a return in the finally clause like in my example is sufficient to suppress the exception. > Otherwise throw() will reraise the original exception > automatically: > I am not sure what your point is. Yes, this is a corner case. I am trying to make sure we have the corner cases working as well. In the example I gave I think it was pretty clear what should happen according to the inlining principle. The suppression of the initial exception is an accidental side effect of the refactoring. It looks to me like using the __cause__ attribute on the GeneratorReturn will allow us to reraise the exception. This seems like exactly the kind of thing that the __cause__ and __context__ attributes from PEP 3134 was designed for. - Jacob From steve at pearwood.info Mon Mar 30 00:29:28 2009 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 30 Mar 2009 09:29:28 +1100 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <200903300830.44820.steve@pearwood.info> Message-ID: <200903300929.28589.steve@pearwood.info> On Mon, 30 Mar 2009 09:03:03 am Scott David Daniels wrote: > Steven D'Aprano wrote: > > ... Perhaps I just haven't worked on enough 20,000 line projects, > > but I don't get the point of @override. 
It doesn't prevent somebody
> > from writing (deliberately or accidentally) B.F in the absence of
> > A.F, since the coder can simply leave off the @override.
>
> ? B.F vs. A.F? Could you expand this a trifle?

Classes B and A, method F. In case it is still unclear, I'm referencing Péter Szabó's post, which we both quoted:

> > I think we have a different understanding what @override means. I
> > define @override like this: ``class B(A): @override def F(self):
> > pass'' is OK only if A.F is defined, i.e. there is a method F to
> > override.

The intention is that this will fail:

class A:
    pass

class B(A):
    @override
    def F(self):
        pass

but this will be okay:

class A:
    def F(self):
        pass

class B(A):
    @override
    def F(self):
        pass

But if I leave out the @override then I can define B.F regardless of whether or not A.F exists, so it doesn't prevent the creation of B.F.

> > If @override is just a way of catching spelling mistakes, perhaps
> > it would be better in pylint or pychecker. What have I missed?
>
> If, for example, you have a huge testing framework, and some
> developers are given the task of developing elements from the
> framework by (say) overriding the test_sources and test_outcome
> methods, They can be handed an example module with @override
> demonstrating where to make the changes.

"# OVERRIDE" or "# TODO" will do that just as well.

> class TestMondoDrive(DriveTestBase):
>
>     @override
>     def test_sources(self):
>         return os.listdir('/standard/mondo/tests')
>
>     @override
>     def test_outcome(self, testname, outcome):
>         if outcome != 'success':
>             self.failures('At %s %s failed: %s' % (
>                 time.strftime('%Y.%m.%d %H:%M:%S'),
>                 testname, outcome))
>         else:
>             assert False, "I've no idea how to deal with success"
>
> The resulting tests will be a bit easier to read, because you can
> easily distinguish between support methods and framework methods.

Maybe so, but comments and/or naming conventions do that too.
> Further, the entire warp drive test is not started if we stupidly > spell the second "test_result" (as it was on the Enterprise tests). That's a reasonable benefit, but it still sounds to me like something that should go in pylint. I don't really have any objection to this, and Guido has already said it should go into a third party module first. Thank you for explaining the use-case. -- Steven D'Aprano From mrs at mythic-beasts.com Sun Mar 29 23:57:03 2009 From: mrs at mythic-beasts.com (Mark Seaborn) Date: Sun, 29 Mar 2009 22:57:03 +0100 (BST) Subject: [Python-ideas] CapPython's use of unbound methods In-Reply-To: References: <20090319.231249.343185657.mrs@localhost.localdomain> Message-ID: <20090329.225703.432823651.mrs@localhost.localdomain> Guido van Rossum wrote: > On Thu, Mar 19, 2009 at 4:12 PM, Mark Seaborn wrote: > > Guido van Rossum wrote: > >> > Guido said, "I don't understand where the function object f gets its > >> > magic powers". > >> > > >> > The answer is that function definitions directly inside class > >> > statements are treated specially by the verifier. > >> > >> Hm, this sounds like a major change in language semantics, and if I > >> were Sun I'd sue you for using the name "Python" in your product. :-) > > > > Damn, the makers of Typed Lambda Calculus had better watch out for > > legal action from the makers of Lambda Calculus(tm) too... :-) ?Is it > > really a major change in semantics if it's just a subset? ;-) > > Well yes. The empty subset is also a subset. :-) As a side note, it is interesting to compare CapPython to ECMAScript 3.1's strict mode, which, as I understand it, changes the semantics of ECMAScript's attribute access such that doing X.A when X does not have an attribute A raises an exception rather than returning undefined. Since existing Javascript implementations lack this feature, Cajita (a fail-stop subset of Javascript, part of the Caja project) has to go to some lengths to emulate it. 
This seems to be the main reason that Cajita rewrites Javascript code, to add attribute existence checks. Fortunately CapPython does not have to make this kind of semantic change. Interestingly, in Javascript is is easier to add this kind of change on a per-module basis than in Python, because dynamic attribute access in Javascript is done via a builtin syntax (x[a]) rather than via a function (getattr in Python). However, CPython's restricted execution mode (which Tav is proposing to resurrect) does change the semantics of attribute access. It's not yet clear to me how this works, and how it applies to the getattr function. I suspect it involves looking up the stack. > More seriously, IIUC you are disallowing all use of attribute names > starting with underscores, which not only invalidates most Python > code in practical use (though you might not care about that) but > also disallows the use of many features that are considered part of > the language, such as access to __dict__ and many other > introspective attributes. This is true. I'm not claiming that a lot of Python code will pass the verifier. It might not accept all idiomatic code; I'm just claiming that code using encapsulated objects under CapPython can still be idiomatic. We could probably allow reading self.__dict__ safely in CapPython. The term "introspection" covers a lot of language features. Some are OK in an object-capability language and some are not. For example, some might consider dir() to be an introspective feature, and this function is fine if suitably wrapped. x.__class__.__name__ is a common idiom. Although we can't allow x.__class__ on its own, we could provide a get_class_name function and rewrite "x.__class__.__name__" to "get_class_name(x)". "type(x) is C" is another common idiom. Again, CapPython doesn't provide type() but it can provide a type_is() function: def type_is(x, t): return type(x) is t The "locals" builtin is not something CapPython can allow in general. 
Any function that can look up the stack in this way is potentially dangerous. But it might be OK to allow "locals()", i.e. the case where "locals" is called as a function and not used as a first class value. I would prefer not to have to do that though.

> > To some extent the verifier's check of only accessing private
> > attributes through self is just checking a coding style that I already
> > follow when writing Python code (except sometimes for writing test
> > cases).
>
> You might wish this to be true, but for most Python programmers, it
> isn't. Introspection is a commonly-used part of the language (probably
> more so than in Java). So is the use of attribute names starting with
> a single underscore outside the class tree, e.g. by "friend"
> functions.

The friend function pattern is an example of something that CapPython could support, with some extra notation in order to make it explicit. It is a case of what is known as rights amplification in capability systems.

Here's an example of how I envisage it would work in CapPython:

class C(object):
    def _get_foo(self):
        return self._foo
_get_foo = C._get_foo

Although C._get_foo would normally be rejected, the verifier would allow reading C._get_foo immediately after the class definition as a special case. The resulting _get_foo function would only be able to operate on instances of C (assuming the presence of unbound methods in the language).

> > Of course some of the verifier's checks, such as only allowing
> > attribute assignments through self, are a lot more draconian than
> > coding style checks.
>
> That also sounds like a rather serious hindrance to writing Python as
> most people think of it.

Attribute assignment is something that we could handle by rewriting. For example, x.y = z could be rewritten to x.set_attribute("y", z) and x's class definition would have to declare that attribute y is assignable. The problem with attribute assignment in Python as it stands is that it is opt-out.
Attributes can be made read-only (by using "property" or defining __setattr__), but this is not the default.

> > Whether these function definitions are accepted by the verifier
> > depends on their context.
>
> But this isn't.
>
> Are you saying that the verifier accepts the use of self._foo in a
> method?

Yes.

> That would make the scenario of potentially passing a class
> defined by Alice into Bob's code much harder to verify -- now suddenly
> Alice has to know about a lot of things before she can be sure that
> she doesn't leave open a backdoor for Bob.

In most cases Alice would not want Bob to extend classes that she has defined, so she would not give Bob access to the unwrapped class objects. She would just give Bob the constructor. If Alice wants to be sure that she does that, she can add a decorator to all her class definitions:

def constructor_only(klass):
    def wrapper(*args, **kwargs):
        return klass(*args, **kwargs)
    return wrapper

@constructor_only
class C(object):
    ...

(However, this assumes that class decorators are available, and CapPython does not support Python 2.6 yet.)

> > The default environment doesn't provide the real getattr() function.
> > It provides a wrapped version that rejects private attribute names.
>
> Do you have a web page describing the precise list of limitations you
> apply in your "subset" of Python?

I started some wiki pages to explain the verifier rules and which builtins are allowed, blocked or wrapped:
http://plash.beasts.org/wiki/CapPython/VerifierRules
http://plash.beasts.org/wiki/CapPython/Builtins
I hope that will make things clearer.

> Does it support import of some form?

Yes, it supports import:
http://lackingrhoticity.blogspot.com/2008/09/dealing-with-modules-and-builtins-in.html
The safeeval module allows callers to provide their own __import__ function when evalling code.
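A toy version of that wrapped getattr — just a sketch of the idea, with an invented example class, not CapPython's actual code — might look like:

```python
_SENTINEL = object()

def safe_getattr(obj, name, default=_SENTINEL):
    # Reject private and special names, so sandboxed code cannot reach
    # internals such as __class__, __dict__ or _secret attributes.
    if not isinstance(name, str) or name.startswith('_'):
        raise AttributeError('attribute name %r is rejected' % (name,))
    if default is _SENTINEL:
        return getattr(obj, name)
    return getattr(obj, name, default)

class Account(object):          # invented example class
    def __init__(self, owner, secret):
        self.owner = owner
        self._secret = secret

acct = Account('alice', 'hunter2')
assert safe_getattr(acct, 'owner') == 'alice'
blocked = False
try:
    safe_getattr(acct, '_secret')
except AttributeError:
    blocked = True
assert blocked
```

The sandboxed environment would expose safe_getattr under the name getattr, so dynamic attribute access stays available without opening up the underscore namespace.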
Mark From ncoghlan at gmail.com Mon Mar 30 00:43:12 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 30 Mar 2009 08:43:12 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CFF45F.8090304@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49CFEC27.1050005@gmail.com> <49CFF45F.8090304@improva.dk> Message-ID: <49CFF980.90606@gmail.com> Jacob Holm wrote: > Nick Coghlan wrote: >> Jacob Holm wrote: >> >>> The problem is that any exception thrown into inner is converted to a >>> GeneratorReturn, which is then swallowed by the yield-from instead of >>> being reraised. >>> >> >> That actually only happens if inner *catches and suppresses* the thrown >> in exception. > Having a return in the finally clause like in my example is sufficient > to suppress the exception. Ah, I did miss that - I think it just means the code has been refactored incorrectly though. >> Otherwise throw() will reraise the original exception >> automatically: >> > I am not sure what your point is. Yes, this is a corner case. I am > trying to make sure we have the corner cases working as well. 
I think the refactoring is buggy, because it has changed the code from
leaving exceptions alone to suppressing them. Consider what it would
mean to do the same refactoring with normal functions:

def inner():
    try:
        perform_operation()
    finally:
        return 'VALUE'

def outer():
    val = inner()
    print val

That code does NOT do the same thing as:

def outer():
    try:
        perform_operation()
    finally:
        val = 'VALUE'
    print val

A better refactoring would keep the return outside the finally clause
in the inner generator:

Either:

def inner():
    try:
        yield 1
        yield 2
        yield 3
    finally:
        val = 'VALUE'
    return val

Or else:

def inner():
    try:
        yield 1
        yield 2
        yield 3
    finally:
        pass
    return 'VALUE'

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------

From jh at improva.dk Mon Mar 30 00:59:48 2009 From: jh at improva.dk (Jacob Holm) Date: Mon, 30 Mar 2009 00:59:48 +0200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CFF980.90606@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49CFEC27.1050005@gmail.com> <49CFF45F.8090304@improva.dk> <49CFF980.90606@gmail.com>
Message-ID: <49CFFD64.8080902@improva.dk> Nick Coghlan wrote: > Jacob Holm wrote: > >> Having a return in the finally clause like in my example is sufficient >> to suppress the exception. >> > > Ah, I did miss that - I think it just means the code has been refactored > incorrectly though. > > Ok > I think the refactoring is buggy, because it has changed the code from > leaving exceptions alone to suppressing them. Consider what it would > mean to do the same refactoring with normal functions: > > def inner(): > try: > perform_operation() > finally: > return 'VALUE' > > def outer(): > val = inner() > print val > > That code does NOT do the same thing as: > > def outer(): > try: > perform_operation() > finally: > val = 'VALUE' > print val > Good point. Based on this observation, I withdraw the proposal about storing the active exception on the GeneratorReturn and reraising it in yield-from. I still think we should get rid of the check for GeneratorExit, because of the other example I gave. - Jacob From Scott.Daniels at Acm.Org Mon Mar 30 01:34:53 2009 From: Scott.Daniels at Acm.Org (Scott David Daniels) Date: Sun, 29 Mar 2009 16:34:53 -0700 Subject: [Python-ideas] method decorators @final and @override in Python 2.4 In-Reply-To: <200903300929.28589.steve@pearwood.info> References: <4fa38b910903280744s5343524fyc33c421cfd068a77@mail.gmail.com> <200903300830.44820.steve@pearwood.info> <200903300929.28589.steve@pearwood.info> Message-ID: Steven D'Aprano wrote: > On Mon, 30 Mar 2009 09:03:03 am Scott David Daniels wrote: >> Steven D'Aprano wrote: >>> ... Perhaps I just haven't worked on enough 20,000 line projects, >>> but I don't get the point of @override. It doesn't prevent somebody >>> from writing (deliberately or accidentally) B.F in the absence of >>> A.F, since the coder can simply leave off the @override. >> ? B.F vs. A.F? Could you expand this a trifle? > > Classes B and A, method F. ... > The intention is that this will fail: ... > but this will be okay: ... 
> But if I leave out the @override then I can define B.F regardless of
> whether or not A.F exists, so it doesn't prevent the creation of B.F.

Thanks for the expansion. The check is not so much to prevent creating
B.F, as it is asserting we are plugging into a framework here.

>> Further, the entire warp drive test is not started if we stupidly
>> spell the second "test_result" (as it was on the Enterprise tests).
>
> That's a reasonable benefit, but it still sounds to me like something
> that should go in pylint.

Yes, after all we did lose all of sector 4.66.73 on that unfortunate
accident :-). I agree that it does feel a bit pylint-ish, but I have
worked on large unwieldy frameworks where large machines get powered on
by the framework as part of running a test, and it is nice to see the
whole test not even start in such circumstances.

This is why I wrote that (easily ponied in) possible addition to type
named "__initclass__"; it seemed a more useful technique that could be
used by the OP to implement his desires, while providing a simple place
to put class initialization code that allows people to get a bit
fancier with their classes without having to do the metaclass dance
themselves. I'll try putting up an ActiveState recipe for this in the
coming week.
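For the curious, the class-creation-time check being discussed can be sketched with an ordinary metaclass (Python 3 syntax; the names below are invented for illustration and this is not Scott's actual recipe):

```python
def override(method):
    # Mark the method; the metaclass inspects the mark at class creation.
    method._is_override = True
    return method

class CheckOverrides(type):
    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        for attr, value in namespace.items():
            if getattr(value, "_is_override", False):
                # Fail the whole class definition if no base defines attr.
                if not any(hasattr(base, attr) for base in bases):
                    raise TypeError("%s.%s is marked @override but no "
                                    "base class defines it" % (name, attr))

class WarpDrive(metaclass=CheckOverrides):
    def test_result(self):
        return "ok"

class Enterprise(WarpDrive):
    @override
    def test_result(self):  # fine: WarpDrive defines test_result
        return "engaged"
```

A misspelling such as `tets_result` then raises TypeError at class-definition time, before any framework machinery is powered on.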
--Scott David Daniels Scott.Daniels at Acm.Org From greg.ewing at canterbury.ac.nz Mon Mar 30 06:37:45 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 30 Mar 2009 16:37:45 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CFF980.90606@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49CFEC27.1050005@gmail.com> <49CFF45F.8090304@improva.dk> <49CFF980.90606@gmail.com> Message-ID: <49D04C99.7000407@canterbury.ac.nz> Nick Coghlan wrote: > I think the refactoring is buggy, because it has changed the code from > leaving exceptions alone to suppressing them. That looks like the right assessment to me. 
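For the record, the trap is easy to demonstrate with plain functions: a return inside a finally clause silently replaces whatever exception is in flight.

```python
def swallowing():
    try:
        raise ValueError("boom")
    finally:
        return "VALUE"   # the return wins; the ValueError is dropped

def propagating():
    try:
        raise ValueError("boom")
    finally:
        val = "VALUE"    # no return inside finally: the exception escapes
    return val

assert swallowing() == "VALUE"  # no exception reaches the caller
```

Calling propagating(), by contrast, raises the ValueError as expected.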
-- Greg From greg.ewing at canterbury.ac.nz Mon Mar 30 07:45:51 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 30 Mar 2009 17:45:51 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49CF6AAF.70109@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C3EF5E.1050807@improva.dk> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> Message-ID: <49D05C8F.3040800@canterbury.ac.nz> The problem of how to handle GeneratorExit doesn't seem to have any entirely satisfactory solution. On the one hand, the inlining principle requires that we never re-raise it if the subgenerator turns it into a StopIteration (or GeneratorReturn). On the other hand, not re-raising it means that a broken generator can easily result from innocuously combining two things that are individually legitimate. I think we just have to accept this, and state that refactoring only preserves semantics as long as the code block being factored out does not catch GeneratorExit without re-raising it. Then we're free to always re-raise GeneratorExit and prevent broken generators from occurring. 
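For reference, ordinary generators already enforce part of this at close() time: converting GeneratorExit into a plain return is accepted, while answering it with another yield raises RuntimeError. A minimal sketch of that existing behaviour:

```python
def polite():
    try:
        yield 1
    except GeneratorExit:
        pass            # falls off the end; close() succeeds quietly

def stubborn():
    try:
        yield 1
    except GeneratorExit:
        yield 2         # keeps yielding; close() raises RuntimeError

g = polite()
next(g)
g.close()               # no complaint

g = stubborn()
next(g)
try:
    g.close()
    closed_quietly = True
except RuntimeError:    # "generator ignored GeneratorExit"
    closed_quietly = False
```

The open question in this thread is only what the yield-from expansion should do when a *subiterator* takes the polite path.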
I'm inclined to think this situation is a symptom that the idea of being able to catch GeneratorExit at all is flawed. If generator finalization were implemented by means of a forced return, or something equally uncatchable, instead of an exception, we wouldn't have so much of a problem. Earlier I said that I thought GeneratorExit was best regarded as an implementation detail of generators. I'd like to strengthen that statement and say that it should be considered a detail of the *present* implementation of generators, subject to change in future or alternate Pythons. Related to that, I'm starting to come back to my original instinct that GeneratorExit should not be thrown into the subiterator at all. Rather, it should be taken as an indication that the delegating generator is being finalized, and the subiterator's close() method called if it has one. Then there's never any question about whether to re-raise it -- we should always do so. -- Greg From mrts.pydev at gmail.com Mon Mar 30 12:04:26 2009 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Mon, 30 Mar 2009 13:04:26 +0300 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> Message-ID: On Sat, Mar 28, 2009 at 5:26 AM, Guido van Rossum wrote: > There's way too much bikeshedding in this thread (not picking on you > specifically). I think the originally proposed API is fine, except it > should *not* reject duplicates. To add duplicates you'd just call it > multiple times, e.g. add_query_params(add_query_params(url, a='x'), > a='y'). It's a pretty minor use case anyways. So be it. I'll open a ticket and provide a patch, tests and documentation. 
For people concerned about ordering -- you can always use an odict for passing the kwargs: add_query_params('http://foo.com', **odict('a' = 1, 'b' = 2)) For people concerned about syntactically more restrictive rules than application/x-www-form-urlencoded allows -- pass in the kwargs via ordinary dict: add_query_params('http://foo.com', **{'|"-/': 1, '???': 2}) # note that py2k allows UTF-8 in argument names anyway The latter is bad practice anyway. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mrts.pydev at gmail.com Mon Mar 30 12:06:17 2009 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Mon, 30 Mar 2009 13:06:17 +0300 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> Message-ID: On Mon, Mar 30, 2009 at 1:04 PM, Mart S?mermaa wrote: > > add_query_params('http://foo.com', **{'|"-/': 1, '???': 2}) # note that > py2k allows UTF-8 in argument names anyway > > s/py2k/py3k/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From eric at trueblade.com Mon Mar 30 12:28:31 2009 From: eric at trueblade.com (Eric Smith) Date: Mon, 30 Mar 2009 05:28:31 -0500 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> Message-ID: <49D09ECF.5090407@trueblade.com>

> For people concerned about ordering -- you can always use an odict for
> passing the kwargs:
>
> add_query_params('http://foo.com ', **odict('a' = 1,
> 'b' = 2))

Not that I want to continue the discussion about this particular issue,
but I'd like to correct this statement, since it is wrong (beyond the
syntax of creating the odict being incorrect).

"**" converts the parameters to an ordinary dict. The caller does not
receive the same object you call the function with. So any ordering of
the values in the odict will be lost.

$ ./python.exe
Python 2.7a0 (trunk:70598, Mar 25 2009, 17:30:54)
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from collections import OrderedDict as odict
>>> def foo(**kwargs):
...     print type(kwargs)
...     for k, v in kwargs.iteritems():
...         print k, v
...
>>> o1=odict(); o1['a']=1; o1['b']=2
>>> o1
OrderedDict([('a', 1), ('b', 2)])
>>> o2=odict(); o2['b']=2; o2['a']=1
>>> o2
OrderedDict([('b', 2), ('a', 1)])
>>> foo(**o1)
<type 'dict'>
a 1
b 2
>>> foo(**o2)
<type 'dict'>
a 1
b 2
>>>

Further, when an odict is created and arguments are supplied, the
ordering is also lost:

>>> odict(a=1, b=2)
OrderedDict([('a', 1), ('b', 2)])
>>> odict(b=2, a=1)
OrderedDict([('a', 1), ('b', 2)])
>>>

3.1 works the same way (once you change the print statement and use
.items instead of .iteritems: I need to run 2to3 on my example!).

I just want to make sure everyone realized the limitations. odict won't
solve problems like this.
I think these are both "gotchas" waiting to happen. Eric. From ncoghlan at gmail.com Mon Mar 30 12:47:00 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 30 Mar 2009 20:47:00 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D05C8F.3040800@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C43636.9080402@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> Message-ID: <49D0A324.1030701@gmail.com> Greg Ewing wrote: > I'm inclined to think this situation is a symptom that > the idea of being able to catch GeneratorExit at all > is flawed. If generator finalization were implemented > by means of a forced return, or something equally > uncatchable, instead of an exception, we wouldn't have > so much of a problem. Well, in theory people are meant to be writing "except Exception:" rather than using a bare except or catching BaseException - that's a big part of the reason SystemExit, KeyboardInterrupt and GeneratorExit *aren't* Exception subclasses. 
> Related to that, I'm starting to come back to my
> original instinct that GeneratorExit should not be
> thrown into the subiterator at all. Rather, it should
> be taken as an indication that the delegating generator
> is being finalized, and the subiterator's close()
> method called if it has one. Then there's never any
> question about whether to re-raise it -- we should
> always do so.

I think that's a simpler finalisation rule to remember, so I'd be fine
with that approach. I don't think we're going to be able to completely
eliminate the tricky subtleties from this expression, but we can at
least try to keep them as simple as possible.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------

From mrts.pydev at gmail.com Mon Mar 30 12:55:28 2009 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Mon, 30 Mar 2009 13:55:28 +0300 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: <49D09ECF.5090407@trueblade.com> References: <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> <49D09ECF.5090407@trueblade.com> Message-ID:

On Mon, Mar 30, 2009 at 1:28 PM, Eric Smith wrote:
> "**" converts the parameters to an ordinary dict. The caller does not
> receive the same object you call the function with. So any ordering of the
> values in the odict will be lost.

Right you are, sorry for the mental blunder. So what if the signature
is as follows to support passing query parameters via an ordered dict:

add_query_params(url, params_dict=None, **kwargs)

with the following behaviour:

>>> pd = odict()
>>> pd['a'] = 1
>>> pd['b'] = 2
>>> add_query_params('http://foo.com/?a=0', pd, a=3)
'http://foo.com/?a=0&a=1&b=2&a=3'

-------------- next part -------------- An HTML attachment was scrubbed...
URL: From ncoghlan at gmail.com Mon Mar 30 13:28:21 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 30 Mar 2009 21:28:21 +1000 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> <49D09ECF.5090407@trueblade.com> Message-ID: <49D0ACD5.5090209@gmail.com> Mart S?mermaa wrote: > Right you are, sorry for the mental blunder. So what if the signature is > as follows to support passing query parameters via an ordered dict: > > add_query_params(url, params_dict=None, **kwargs) > > with the following behaviour: > >>>> pd = odict() >>>> pd['a'] = 1 >>>> pd['b'] = 2 >>>> add_query_params('http://foo.com/?a=0', pd, a=3) > 'http://foo.com/?a=0&a=1&b=2&a=3 ' When setting up a dict.update style interface like that, it is often better to use *args for the two positional arguments - it avoids accidental name conflicts between the positional arguments and arbitrary keyword arguments. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From mrts.pydev at gmail.com Mon Mar 30 14:22:33 2009 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Mon, 30 Mar 2009 15:22:33 +0300 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: <49D0ACD5.5090209@gmail.com> References: <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> <49D09ECF.5090407@trueblade.com> <49D0ACD5.5090209@gmail.com> Message-ID: On Mon, Mar 30, 2009 at 2:28 PM, Nick Coghlan wrote: > Mart S?mermaa wrote: > > Right you are, sorry for the mental blunder. 
So what if the signature is > > as follows to support passing query parameters via an ordered dict: > > > > add_query_params(url, params_dict=None, **kwargs) > > > > with the following behaviour: > > > >>>> pd = odict() > >>>> pd['a'] = 1 > >>>> pd['b'] = 2 > >>>> add_query_params('http://foo.com/?a=0', pd, a=3) > > 'http://foo.com/?a=0&a=1&b=2&a=3 ' > > When setting up a dict.update style interface like that, it is often > better to use *args for the two positional arguments - it avoids > accidental name conflicts between the positional arguments and arbitrary > keyword arguments. Thanks, another good point. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Mar 30 17:29:08 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 30 Mar 2009 10:29:08 -0500 Subject: [Python-ideas] Proposed addtion to urllib.parse in 3.1 (and urlparse in 2.7) In-Reply-To: References: <19919.1238170437@parc.com> <49CD2100.3070502@trueblade.com> <49CD2930.4080307@cornell.edu> <91ad5bf80903271728ka18360cpd514aa5dd93cd74a@mail.gmail.com> Message-ID: On Mon, Mar 30, 2009 at 5:04 AM, Mart S?mermaa wrote: > On Sat, Mar 28, 2009 at 5:26 AM, Guido van Rossum wrote: >> >> There's way too much bikeshedding in this thread (not picking on you >> specifically). I think the originally proposed API is fine, except it >> should *not* reject duplicates. To add duplicates you'd just call it >> multiple times, e.g. add_query_params(add_query_params(url, a='x'), >> a='y'). It's a pretty minor use case anyways. > > So be it. I'll open a ticket and provide a patch, tests and documentation. > > For people concerned about ordering -- you can always use an odict for > passing the kwargs: > > add_query_params('http://foo.com', **odict('a' = 1, 'b' = 2)) Alas, that doesn't work -- f(**X) copies X into a real dict. But web apps that care about the order are crazy IMO. 
> For people concerned about syntactically more restrictive rules than
> application/x-www-form-urlencoded allows -- pass in the kwargs via ordinary
> dict:
>
> add_query_params('http://foo.com', **{'|"-/': 1, '???': 2}) # note that py2k
> allows UTF-8 in argument names anyway
>
> The latter is bad practice anyway.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From guido at python.org Mon Mar 30 23:19:51 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 30 Mar 2009 16:19:51 -0500 Subject: [Python-ideas] CapPython's use of unbound methods In-Reply-To: <20090329.225703.432823651.mrs@localhost.localdomain> References: <20090319.231249.343185657.mrs@localhost.localdomain> <20090329.225703.432823651.mrs@localhost.localdomain> Message-ID:

On Sun, Mar 29, 2009 at 4:57 PM, Mark Seaborn wrote:
> As a side note, it is interesting to compare CapPython to ECMAScript
> 3.1's strict mode, which, as I understand it, changes the semantics of
> ECMAScript's attribute access such that doing X.A when X does not have
> an attribute A raises an exception rather than returning undefined.
>
> Since existing Javascript implementations lack this feature, Cajita (a
> fail-stop subset of Javascript, part of the Caja project) has to go to
> some lengths to emulate it. This seems to be the main reason that
> Cajita rewrites Javascript code, to add attribute existence checks.
>
> Fortunately CapPython does not have to make this kind of semantic
> change.

Well of course it makes a much more severe semantic change by declaring
illegal all use of attribute names starting with underscore.

> Interestingly, in Javascript it is easier to add this kind of change
> on a per-module basis than in Python, because dynamic attribute access
> in Javascript is done via a builtin syntax (x[a]) rather than via a
> function (getattr in Python).

I guess if you wanted to override getattr on a per-module basis you
could give each module a separate __builtins__.
> However, CPython's restricted execution mode (which Tav is proposing
> to resurrect) does change the semantics of attribute access.

It does not change the general semantics of attribute access -- it only
takes away a small set of *specific* attributes (e.g. __code__ and
func_code) from a small set of *specific* object types (e.g. function
objects). This is because every object has the ability to override
getting attributes (via __getattribute__ in Python, or tp_getattro in
C).

> It's not
> yet clear to me how this works, and how it applies to the getattr
> function. I suspect it involves looking up the stack.

No, it does not look at the stack. It looks at the globals, which
contain a special magic entry __builtins__ (with an 's') which is the
dict where built-in functions are looked up. When this dict is the same
object as the *default* built-in dict (which is __builtin__.__dict__
where __builtin__ -- without 's' -- is the module defining the built-in
functions), it gives you supervisor privileges; if it is any other
object, it disallows access to those specific attributes I referred to
above.

I really recommend that you study the CPython implementation. Without
understanding it you don't stand a chance of creating a secure subset.

The getattr() function and the x.y notation both invoke the same
implementation (PyObject_GetAttr()). This in turn defers to the
tp_getattro slot of the object x. And if the object is implemented in
Python, this in turn defers to the object's __getattribute__ method.
Then object.__getattribute__ defines the default lookup code, which
searches into the object's __dict__ if there is one, then in the
class's __dict__ and walking the MRO, and finally (just before raising
AttributeError) calls the __getattr__ hook if it exists (don't confuse
the latter with __getattribute__).
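That lookup order is easy to observe from pure Python (a toy example, nothing CapPython-specific):

```python
class Base:
    shared = "found on the class, via the MRO"

class Thing(Base):
    def __getattr__(self, name):
        # Reached only after the instance __dict__ and the MRO both fail.
        return "fallback for %r" % name

t = Thing()
t.own = "found in the instance __dict__"

assert t.own == "found in the instance __dict__"
assert t.shared == "found on the class, via the MRO"
assert t.missing == "fallback for 'missing'"
assert getattr(t, "missing") == t.missing  # same code path as t.missing
```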
> Guido van Rossum wrote:
>> More seriously, IIUC you are disallowing all use of attribute names
>> starting with underscores, which not only invalidates most Python
>> code in practical use (though you might not care about that) but
>> also disallows the use of many features that are considered part of
>> the language, such as access to __dict__ and many other
>> introspective attributes.
>
> This is true. I'm not claiming that a lot of Python code will pass
> the verifier. It might not accept all idiomatic code; I'm just
> claiming that code using encapsulated objects under CapPython can
> still be idiomatic.

For some definition of idiomatic. There are a lot of well-known Python
idioms involving attribute names starting with underscore. (I hate to
question your Python proficiency, but I do have to wonder -- how much
Python have you written in your life? Where did you learn Python?)

> We could probably allow reading self.__dict__ safely in CapPython.

Though that's not enough -- peeking in other.__dict__ is also somewhat
common.

> The term "introspection" covers a lot of language features. Some are
> OK in an object-capability language and some are not.

Agreed. And many introspection features aren't that important or
commonly used. But some others are, and this includes using __dict__
and __class__.

> For example, some might consider dir() to be an introspective feature,

It is.

> and this function is fine if suitably wrapped.

You'd have to look at the C implementation to see what it might do
though.

> x.__class__.__name__ is a common idiom. Although we can't allow
> x.__class__ on its own, we could provide a get_class_name function and
> rewrite "x.__class__.__name__" to "get_class_name(x)".
>
> "type(x) is C" is another common idiom.

Though in most cases isinstance(x, C) is preferred.

> Again, CapPython doesn't
> provide type() but it can provide a type_is() function:
> def type_is(x, t):
>     return type(x) is t

And slowly we slide down the path of writing less and less idiomatic
Python...

> The "locals" builtin is not something CapPython can allow in general.
> Any function that can look up the stack in this way is potentially
> dangerous. But it might be OK to allow "locals()", i.e. the case
> where "locals" is called as a function and not used as a first class
> value. I would prefer not to have to do that though.

Using locals() isn't that idiomatic anyway, so this is probably fine.
It's mostly used by beginners who are still exploring the extreme end
of the language's dynamism. :-)

>> > To some extent the verifier's check of only accessing private
>> > attributes through self is just checking a coding style that I already
>> > follow when writing Python code (except sometimes for writing test
>> > cases).
>>
>> You might wish this to be true, but for most Python programmers, it
>> isn't. Introspection is a commonly-used part of the language (probably
>> more so than in Java). So is the use of attribute names starting with
>> a single underscore outside the class tree, e.g. by "friend"
>> functions.
>
> The friend function pattern is an example of something that CapPython
> could support, with some extra notation in order to make it explicit.
> It is a case of what is known as rights amplification in capability
> systems.
>
> Here's an example of how I envisage it would work in CapPython:
>
> class C(object):
>     def _get_foo(self):
>         return self._foo
> _get_foo = C._get_foo
>
> Although C._get_foo would normally be rejected, the verifier would
> allow reading C._get_foo immediately after the class definition as a
> special case. The resulting _get_foo function would only be able to
> operate on instances of C (assuming the presence of unbound methods in
> the language).

I'm not sure how useful this is -- friends aren't necessarily in the
same module as the class, otherwise they might as well be declared as
static methods.
>> > Of course some of the verifier's checks, such as only allowing
>> > attribute assignments through self, are a lot more draconian than
>> > coding style checks.
>>
>> That also sounds like a rather serious hindrance to writing Python as
>> most people think of it.
>
> Attribute assignment is something that we could handle by rewriting.
> For example,
>
>     x.y = z
>
> could be rewritten to
>
>     x.set_attribute("y", z)

Why not x.set_y(z) ?

> x's class definition would have to declare that attribute y is
> assignable. The problem with attribute assignment in Python as it
> stands is that it is opt-out. Attributes can be made read-only (by
> using "property" or defining __setattr__), but this is not the
> default.

This will encourage people to write "Java in Python" which is an
unfortunately common anti-pattern.

>> > Whether these function definitions are accepted by the verifier
>> > depends on their context.
>>
>> But this isn't.
>>
>> Are you saying that the verifier accepts the use of self._foo in a
>> method?
>
> Yes.
>
>> That would make the scenario of potentially passing a class
>> defined by Alice into Bob's code much harder to verify -- now suddenly
>> Alice has to know about a lot of things before she can be sure that
>> she doesn't leave open a backdoor for Bob.
>
> In most cases Alice would not want Bob to extend classes that she has
> defined, so she would not give Bob access to the unwrapped class
> objects. She would just give Bob the constructor.

Or perhaps, better, a factory function, right?

> If Alice wants to
> be sure that she does that, she can add a decorator to all her class
> definitions:
>
> def constructor_only(klass):
>     def wrapper(*args, **kwargs):
>         return klass(*args, **kwargs)
>     return wrapper
>
> @constructor_only
> class C(object):
>     ...

Clever.
It does mean that even the class body of C cannot refer to C-the-class,
which prevents certain idioms (mostly involving updating class
variables -- perhaps not all that common).

> (However, this assumes that class decorators are available, and
> CapPython does not support Python 2.6 yet.)

Well you can always do this manually:

class C(object):
    ...
C = constructor_only(C)

>> > The default environment doesn't provide the real getattr() function.
>> > It provides a wrapped version that rejects private attribute names.
>>
>> Do you have a web page describing the precise list of limitations you
>> apply in your "subset" of Python?
>
> I started some wiki pages to explain the verifier rules and which
> builtins are allowed, blocked or wrapped:
> http://plash.beasts.org/wiki/CapPython/VerifierRules
> http://plash.beasts.org/wiki/CapPython/Builtins
> I hope that will make things clearer.

Ok, I'll try to remember to look there before responding next time.

>> Does it support import of some form?
>
> Yes, it supports import:
> http://lackingrhoticity.blogspot.com/2008/09/dealing-with-modules-and-builtins-in.html
>
> The safeeval module allows callers to provide their own __import__
> function when evalling code.

Ok. Have you done a security contest like Tav did yet? Implementing
import correctly *and* safely is fiendishly difficult.
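Guido's point about the decorated class is easy to check in plain Python: once the decorator has run, the name no longer refers to a class at all. A small sketch using a made-up Point class:

```python
def constructor_only(klass):
    # Hand out the ability to construct instances,
    # but not the class object itself.
    def wrapper(*args, **kwargs):
        return klass(*args, **kwargs)
    return wrapper

@constructor_only
class Point(object):
    def __init__(self, x):
        self.x = x

p = Point(3)  # construction still works through the wrapper
assert p.x == 3
assert not isinstance(Point, type)  # 'Point' is now a plain function, so
                                    # subclassing and class-attribute access
                                    # through this name are gone
```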
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From greg.ewing at canterbury.ac.nz Tue Mar 31 00:12:01 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 31 Mar 2009 10:12:01 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D0A324.1030701@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C43D05.3010903@improva.dk> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> Message-ID: <49D143B1.9040009@canterbury.ac.nz> Nick Coghlan wrote: > Well, in theory people are meant to be writing "except Exception:" > rather than using a bare except or catching BaseException - that's a big > part of the reason SystemExit, KeyboardInterrupt and GeneratorExit > *aren't* Exception subclasses. Yes, it probably isn't something people will do very often. But as long as GeneratorExit is documented as an official part of the language, we need to explain how we're dealing with it. BTW, how official *is* it meant to be? There seems to be very little said about it in either the Language or Library Reference. 
The Library Ref says it's the "exception raised when a generator's close() method is called". The Language Ref says that the close() method "allows finally clauses to run", but doesn't say how that is accomplished. And I can't find throw() mentioned anywhere! -- Greg From guido at python.org Tue Mar 31 00:22:07 2009 From: guido at python.org (Guido van Rossum) Date: Mon, 30 Mar 2009 17:22:07 -0500 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D143B1.9040009@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> Message-ID: On Mon, Mar 30, 2009 at 5:12 PM, Greg Ewing wrote: > Nick Coghlan wrote: > >> Well, in theory people are meant to be writing "except Exception:" >> rather than using a bare except or catching BaseException - that's a big >> part of the reason SystemExit, KeyboardInterrupt and GeneratorExit >> *aren't* Exception subclasses. > > Yes, it probably isn't something people will do very > often. But as long as GeneratorExit is documented as > an official part of the language, we need to explain > how we're dealing with it. > > BTW, how official *is* it meant to be? There seems to > be very little said about it in either the Language or > Library Reference. That's one of our many doc bugs. (Maybe someone at the PyCon sprints can fix these?) PEP 342 defines GeneratorExit, inheriting from Exception. However a later change to the code base made it inherit from BaseException. > The Library Ref says it's the "exception raised when a > generator's close() method is called". The Language Ref > says that the close() method "allows finally clauses to > run", but doesn't say how that is accomplished. 
> > And I can't find throw() mentioned anywhere! Also defined in PEP 342. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From jh at improva.dk Tue Mar 31 01:22:22 2009 From: jh at improva.dk (Jacob Holm) Date: Tue, 31 Mar 2009 01:22:22 +0200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D143B1.9040009@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C44674.5030107@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> Message-ID: <49D1542E.7070503@improva.dk> Greg Ewing wrote: > Nick Coghlan wrote: > >> Well, in theory people are meant to be writing "except Exception:" >> rather than using a bare except or catching BaseException - that's a big >> part of the reason SystemExit, KeyboardInterrupt and GeneratorExit >> *aren't* Exception subclasses. > > Yes, it probably isn't something people will do very > often. But as long as GeneratorExit is documented as > an official part of the language, we need to explain > how we're dealing with it. 
As my last (flawed) example shows, it is easy to accidentally convert the GeneratorExit (along with any other uncaught exception) to a StopIteration if you are using a finally clause. You don't need to explicitly catch anything. Code that does this should be considered broken. Not so much because it is swallowing GeneratorExit, but because it swallows *any* exception. I don't think we should add special cases to the yield-from semantics to cater for broken code.

I even think it might have been a mistake in PEP 342 to let close swallow StopIteration. It might have been better if a throw to an already-closed generator just raised the thrown exception, and close only swallowed GeneratorExit. That way, you would quickly discover that the generator was swallowing exceptions because a call to close would cause a StopIteration. With that definition, we would consider any generator that did not (under normal conditions) raise GeneratorExit when thrown a GeneratorExit to be broken. Had that been the definition, I think we would long ago have agreed to let yield-from treat GeneratorExit like any other exception.

Unfortunately that is not how things work, and I am afraid that changing it would "break" too much code. I put "break" in quotes, because I think most such code is already broken in the sense that it can swallow exceptions that it shouldn't, such as KeyboardInterrupt and SystemExit.

Even without changing throw and close, I still think we should forward GeneratorExit like any other exception, and not do anything special to reraise it or call close on the subiterator. To me that sounds like the cleaner solution, and it is what the inlining principle suggests. It is unfortunate that you have to be a bit more careful about not swallowing GeneratorExit, but I think that care is needed anyway to avoid swallowing other exceptions as well.

>
> BTW, how official *is* it meant to be? There seems to
> be very little said about it in either the Language or
> Library Reference.
> > The Library Ref says it's the "exception raised when a > generator's close() method is called". The Language Ref > says that the close() method "allows finally clauses to > run", but doesn't say how that is accomplished. > > And I can't find throw() mentioned anywhere! > All the generator methods are described here: http://docs.python.org/reference/expressions.html#yield-expressions - Jacob From greg.ewing at canterbury.ac.nz Tue Mar 31 02:23:48 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 31 Mar 2009 12:23:48 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D1542E.7070503@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C4D67B.4010109@improva.dk> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> <49D1542E.7070503@improva.dk> Message-ID: <49D16294.9030205@canterbury.ac.nz> Jacob Holm wrote: > Even without changing throw and close, I still think we should forward > GeneratorExit like any other exception, and not do anything special to > reraise it or call close on the subiterator. 
But that allows you to inadvertently create a broken generator by calling another generator that, according to the rules you've just acknowledged we can't change, is behaving correctly. Asking users not to call such generators would require them to have knowledge about the implementation of every generator they call, which I don't think is acceptable. -- Greg From jan.kanis at phil.uu.nl Tue Mar 31 02:36:06 2009 From: jan.kanis at phil.uu.nl (Jan Kanis) Date: Tue, 31 Mar 2009 02:36:06 +0200 Subject: [Python-ideas] python-like garbage collector & workaround In-Reply-To: <318523839.906611238334142946.JavaMail.root@sz0050a.emeryville.ca.mail.comcast.net> References: <49CEA57B.5070004@canterbury.ac.nz> <318523839.906611238334142946.JavaMail.root@sz0050a.emeryville.ca.mail.comcast.net> Message-ID: <59a221a0903301736y40a348b5ib18b938643fad98b@mail.gmail.com> 2009/3/29 : > ----- Original Message ----- > From: "Greg Ewing" > To: castironpi-ng at comcast.net > Cc: Python-ideas at python.org > Sent: Saturday, March 28, 2009 5:32:27 PM GMT -06:00 US/Canada Central > Subject: Re: [Python-ideas] python-like garbage collector & workaround > > castironpi-ng at comcast.net wrote: > > ?> I'm considering a workaround that performs GC in two steps. ?First, it >> requests the objects to drop their references that participate in the >> cycle. ?Then, it enqueues the decref'ed object for an unnested >> destruction. > Castironpi, I don't think your solution solves the problem. In a single stage finalization design, it is allways possible to call the destructors of the objects in the cycle in random order. The problem is that now when A gets finalized, it cannot use its reference to B anymore because B may have already been finalized, and thus we cannot assume B can still be used for anything usefull. The problem, of course, is that one of A or B may still need the other during its finalization. 
In your solution, the real question is what the state of an object is supposed to be when it is in between the two stages of finalization. Is it still supposed to be a fully functional object, that handles all operations just as if it were still fully alive? In that case the object can only drop the references that it doesn't actually need to perform any of its operations (not just finalization). But if we assume that an object has all its references for a reason, there is nothing it can drop. (Except if it uses a reference for caching or similar things. But I think that is only a minority of all use cases.)

If you propose an object counts as 'finalized' (or at least, no longer fully functional) when it is in between stages of finalization, we have the same problem as in the single stage random order finalization: other objects that refer to it can no longer use it for anything useful.

The only option that is left is to have the object be in some in-between state. But that really complicates Python's object model, because every object now has two visible states: alive and about-to-die. So every object that wants to support this form of finalization has to specify what kind of operations are still available in its about-to-die state, and all destructors of all objects need to restrict themselves to only these kinds of operations.

And then, of course, there is still the question of what to do if there are still cycles left after the first stage.

If you still think your proposal is useful, you'll probably need to explain why these problems don't matter enough or whether there are important use cases that it solves.
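Jan's A/B example can be made concrete. A small sketch (note the behaviour shown is today's CPython: at the time of this thread such cycles were not collected at all and ended up in gc.garbage, whereas since PEP 442 / Python 3.4 they are collected, with the destructors run in an arbitrary order):

```python
import gc

log = []

class Node:
    # Two Nodes pointing at each other form the A<->B cycle discussed
    # above: neither __del__ can safely assume its peer is still usable.
    def __init__(self, name):
        self.name = name
        self.peer = None

    def __del__(self):
        # self.peer may already have been finalized when we get here;
        # the collector picks the order, not us.
        log.append(self.name)

a, b = Node("a"), Node("b")
a.peer, b.peer = b, a
del a, b       # only the cycle's internal references remain
gc.collect()   # collect the cycle; both destructors run, order unspecified

assert sorted(log) == ["a", "b"]
```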
From jh at improva.dk Tue Mar 31 03:30:47 2009 From: jh at improva.dk (Jacob Holm) Date: Tue, 31 Mar 2009 03:30:47 +0200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D16294.9030205@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C55F9A.6070305@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> <49D1542E.7070503@improva.dk> <49D16294.9030205@canterbury.ac.nz> Message-ID: <49D17247.20705@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: > >> Even without changing throw and close, I still think we should >> forward GeneratorExit like any other exception, and not do anything >> special to reraise it or call close on the subiterator. > > But that allows you to inadvertently create a broken > generator by calling another generator that, according to > the rules you've just acknowledged we can't change, is > behaving correctly. According to the rules for generator finalization it might behave correctly. However, in most cases this will be code that is breaking the rule about not catching KeyboardInterrupt and SystemExit. This is broken code IMNSHO, and I don't think we should complicate the yield-from expression to cater for it. 
Yes there might be existing code that is not broken even by that standard and that still converts GeneratorExit to StopIteration. I don't think that is common enough that we have to care. If you use such a generator in a yield-from expression, you will get a RuntimeError('generator ignored GeneratorExit') on close, telling you that something is wrong. > > Asking users not to call such generators would require > them to have knowledge about the implementation of every > generator they call, which I don't think is acceptable. > I think that getting a RuntimeError on close is sufficient indication that such a generator should not be used in yield-from. That said, I don't really care much either way. Both versions are acceptable to me, and it is your PEP. - Jacob From greg.ewing at canterbury.ac.nz Tue Mar 31 05:25:37 2009 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 31 Mar 2009 15:25:37 +1200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D17247.20705@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C57698.7030808@gmail.com> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> <49D1542E.7070503@improva.dk> <49D16294.9030205@canterbury.ac.nz> <49D17247.20705@improva.dk> 
Message-ID: <49D18D31.9000008@canterbury.ac.nz> Jacob Holm wrote: > in most cases this will be code that is breaking the rule about > not catching KeyboardInterrupt and SystemExit. Not necessarily, it could be doing except GeneratorExit: return > If you use such a generator in a yield-from > expression, you will get a RuntimeError('generator ignored > GeneratorExit') on close, telling you that something is wrong. But it won't be at all clear *what* is wrong or what to do about it. The caller is making a perfectly ordinary yield-from call, and he's calling what looks to all the world like a perfectly well-behaved iterator. Where's the mistake? Remember that the generator being called may have been written by someone else. The caller may not know anything about its internals or be in a position to fix them if he did. > I think that getting a RuntimeError on close is sufficient indication > that such a generator should not be used in yield-from. But it's a perfectly valid generator by current standards. I don't want to declare some existing class of generators as being second-class citizens with respect to yield-from, especially based on some internal implementation detail unknowable to its caller. 
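Both kinds of generator under discussion here can be written today under PEP 342 semantics. A minimal sketch contrasting them: converting GeneratorExit to StopIteration (e.g. via a return that exits a finally clause) is accepted by close(), while ignoring GeneratorExit outright is not:

```python
def converts():
    # A return that exits the finally clause replaces the in-flight
    # GeneratorExit with a normal StopIteration.
    try:
        yield 1
    finally:
        return

def ignores():
    try:
        yield 1
    except GeneratorExit:
        # Yielding again instead of stopping is what close() rejects.
        yield 2

g = converts()
next(g)
g.close()          # succeeds silently: valid by current standards

g = ignores()
next(g)
try:
    g.close()
    err = None
except RuntimeError as e:
    err = e        # "generator ignored GeneratorExit"
```

It is the first, silently-succeeding kind that makes the eventual RuntimeError hard to trace back to its source: on its own, close() reports nothing wrong with it.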
-- Greg From jh at improva.dk Tue Mar 31 11:44:06 2009 From: jh at improva.dk (Jacob Holm) Date: Tue, 31 Mar 2009 11:44:06 +0200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D18D31.9000008@canterbury.ac.nz> References: <49AB1F90.7070201@canterbury.ac.nz> <49C5F3AE.4060402@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> <49D1542E.7070503@improva.dk> <49D16294.9030205@canterbury.ac.nz> <49D17247.20705@improva.dk> <49D18D31.9000008@canterbury.ac.nz> Message-ID: <49D1E5E6.5000007@improva.dk> Greg Ewing wrote: > Jacob Holm wrote: >> in most cases this will be code that is breaking the rule about > > not catching KeyboardInterrupt and SystemExit. > > Not necessarily, it could be doing > > except GeneratorExit: > return I said *most* cases, not all. I don't have any proof of this, just a gut feeling that the majority of generators that convert GeneratorExit to StopIteration do so because they are using a return in a finally clause. > >> If you use such a generator in a yield-from expression, you will get >> a RuntimeError('generator ignored GeneratorExit') on close, telling >> you that something is wrong. > > But it won't be at all clear *what* is wrong or what to > do about it. 
The caller is making a perfectly ordinary > yield-from call, and he's calling what looks to all the > world like a perfectly well-behaved iterator. Where's > the mistake? If this was documented in the PEP, I would say the mistake was in using such a generator in yield-from that wasn't the final yield. Note that it is perfectly ok to use such a generator in a yield-from as long as no outer generator yields afterwards. > > Remember that the generator being called may have been > written by someone else. The caller may not know anything > about its internals or be in a position to fix them if > he did. Right, that makes it harder to fix the source of the problem. > > > I think that getting a RuntimeError on close is sufficient indication > > that such a generator should not be used in yield-from. > > But it's a perfectly valid generator by current standards. > I don't want to declare some existing class of generators > as being second-class citizens with respect to yield-from, > especially based on some internal implementation detail > unknowable to its caller. > I get that. As I see it we have the following options, listed in my order of preference: 1. Don't throw GeneratorExit to the subiterator but raise it in the outer generator, and don't explicitly call close. This is the only version where sharing a subgenerator does not require special care. It has the problem that it behaves differently in refcounting and non-refcounting implementations due to the implicit close that would happen after the yield-from in refcounting implementations. It also breaks the inlining principle in the case of throw(GeneratorExit). 2. Do throw GeneratorExit and don't try to reraise it. This is the version that most closely follows the inlining principle. It has the problem that generators that convert GeneratorExit to StopIteration can only be used in a yield-from if none of the outer generators do a yield afterwards. 
Breaking this rule gives a RuntimeError('generator ignored GeneratorExit') on close. 3. Do throw GeneratorExit to the subiterator, and explicitly reraise it if it was converted to a StopIteration. It has the problem that it breaks the inlining principle for generators that convert GeneratorExit to StopIteration. 4. Don't throw GeneratorExit to the subiterator, instead explicitly call close before raising it in the outer generator. This is the behavior that #1 would have for non-shared generators in a refcounting implementation. Same problem as #3 and hides the GeneratorExit from non-generators. My guess is that your preference is more like 4, 3, 2, 1. #3 is closest to what is in the current PEP, and is probably what it meant to say. (The PEP checks if the thrown exception was GeneratorExit, then does a bare raise instead of raising the thrown exception). - Jacob From ncoghlan at gmail.com Tue Mar 31 14:08:30 2009 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 31 Mar 2009 22:08:30 +1000 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D1E5E6.5000007@improva.dk> References: <49AB1F90.7070201@canterbury.ac.nz> <49C60430.7030108@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> <49D1542E.7070503@improva.dk> <49D16294.9030205@canterbury.ac.nz> 
<49D17247.20705@improva.dk> <49D18D31.9000008@canterbury.ac.nz> <49D1E5E6.5000007@improva.dk> Message-ID: <49D207BE.8090909@gmail.com> Jacob Holm wrote: > My guess is that your preference is more like 4, 3, 2, 1. #3 is closest > to what is in the current PEP, and is probably what it meant to say. > (The PEP checks if the thrown exception was GeneratorExit, then does a > bare raise instead of raising the thrown exception). 4, 3, 2, 1 is the position I've come around to. Since using send(), throw() and close() on a shared subiterator doesn't make any sense, and the whole advantage of the new expression over a for loop is to make it easy to delegate send() throw() and close() correctly, I now believe that shared subiterators are best handled by actually *iterating* over them in a for loop rather than by delegating to them with "yield from". So the fact that a definition of yield from that provides prompt finalisation guarantees isn't friendly to using it with shared subiterators is actually now a *bonus* in my book - it should hopefully serve as a hint to developers that they're misusing the tool. By adopting position 4, I believe the guarantees for the exception handling in the new expression become as simple as possible: - if the subiterator does not provide a throw() method, or the exception thrown in is GeneratorExit, then the subiterator's close() method (if any) is called and the thrown in exception raised in the current frame - otherwise, the exception (including traceback) is passed down to the subiterator's throw() method With these semantics, subiterators will be finalised promptly when the outermost generator is finalised without any special effort on the developer's part and it won't be trivially easy to accidentally suppress GeneratorExit. To my mind, the practical benefits of such an approach are enough to justify the deviation from the general 'inline behaviour' guideline. Cheers, Nick. 
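The behaviour Nick argues for here is what `yield from` eventually shipped with in Python 3.3: closing the outermost generator promptly finalises the whole chain, innermost first. A small check (the event list is just for illustration):

```python
def inner(events):
    try:
        while True:
            yield
    finally:
        events.append("inner closed")

def outer(events):
    try:
        yield from inner(events)
    finally:
        events.append("outer closed")

events = []
g = outer(events)
next(g)    # suspend inside the subgenerator
g.close()  # GeneratorExit finalises inner first, then outer
assert events == ["inner closed", "outer closed"]
```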
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia --------------------------------------------------------------- From dangyogi at gmail.com Tue Mar 31 16:09:50 2009 From: dangyogi at gmail.com (Bruce Frederiksen) Date: Tue, 31 Mar 2009 10:09:50 -0400 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D207BE.8090909@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> <49D1542E.7070503@improva.dk> <49D16294.9030205@canterbury.ac.nz> <49D17247.20705@improva.dk> <49D18D31.9000008@canterbury.ac.nz> <49D1E5E6.5000007@improva.dk> <49D207BE.8090909@gmail.com> Message-ID: <49D2242E.9040302@gmail.com> Nick Coghlan wrote: > 4, 3, 2, 1 is the position I've come around to. [...] What he said. I think that 4 also has the advantage of raising RuntimeError in the inner generator's close method (using the definition of close provided in PEP 342) when the inner generator doesn't obey the rules for GeneratorExit laid out in PEP 342. Throwing GeneratorExit to the inner generator causes the outer generator's close to report the RuntimeError, which pins the blame on the wrong generator (in the stack traceback, which won't even show the inner generator). 
-bruce frederiksen From jh at improva.dk Tue Mar 31 19:41:16 2009 From: jh at improva.dk (Jacob Holm) Date: Tue, 31 Mar 2009 19:41:16 +0200 Subject: [Python-ideas] Yield-From: Finalization guarantees In-Reply-To: <49D207BE.8090909@gmail.com> References: <49AB1F90.7070201@canterbury.ac.nz> <49C6F695.1050100@gmail.com> <49C789F3.30301@improva.dk> <49C7F0C3.10904@gmail.com> <49C81621.9040600@canterbury.ac.nz> <49C81A45.1070803@canterbury.ac.nz> <49C94CF6.5070301@gmail.com> <49C9D162.5040907@canterbury.ac.nz> <49CA20F2.7040207@gmail.com> <49CA4029.6050703@improva.dk> <49CABFC6.1080207@canterbury.ac.nz> <49CAC0FE.5010305@improva.dk> <49CACB39.3020708@canterbury.ac.nz> <49CAD15D.2090008@improva.dk> <49CB155E.4040504@canterbury.ac.nz> <49CB8E4A.3050108@improva.dk> <49CC5D85.30409@canterbury.ac.nz> <49CE29BF.3040502@improva.dk> <49CEB8DE.8060200@gmail.com> <49CEBCD5.7020107@canterbury.ac.nz> <49CF6AAF.70109@improva.dk> <49D05C8F.3040800@canterbury.ac.nz> <49D0A324.1030701@gmail.com> <49D143B1.9040009@canterbury.ac.nz> <49D1542E.7070503@improva.dk> <49D16294.9030205@canterbury.ac.nz> <49D17247.20705@improva.dk> <49D18D31.9000008@canterbury.ac.nz> <49D1E5E6.5000007@improva.dk> <49D207BE.8090909@gmail.com> Message-ID: <49D255BC.6080503@improva.dk> Nick Coghlan wrote: > 4, 3, 2, 1 is the position I've come around to. > > [...snip...] > > By adopting position 4, I believe the guarantees for the exception > handling in the new expression become as simple as possible: > - if the subiterator does not provide a throw() method, or the > exception thrown in is GeneratorExit, then the subiterator's close() > method (if any) is called and the thrown in exception raised in the > current frame > - otherwise, the exception (including traceback) is passed down to the > subiterator's throw() method > Below I have attached a heavily annotated version of the expansion that I expect for #4. 
This version fixes an issue I had forgotten to mention, where the subiterator is not closed due to an AttributeError caused by a missing send method.

> With these semantics, subiterators will be finalised promptly when the
> outermost generator is finalised without any special effort on the
> developer's part and it won't be trivially easy to accidentally suppress
> GeneratorExit.

The way I see it, it will actually be hard to do even on purpose, unless you are willing to take a significant performance hit by using a non-generator wrapper for every generator.

> To my mind, the practical benefits of such an approach are enough to
> justify the deviation from the general 'inline behaviour' guideline.

I disagree, but it seems like I am the only one here that does. It will eliminate a potential pitfall, but will also remove some behavior that could have been useful, such as the ability to suppress the GeneratorExit if you know what you are doing.

- Jacob

------------------------------------------------------------------------

_i = iter(EXPR)  # Raises TypeError if not an iterable.
try:
    _x = None  # No current exception.
    _y = _i.__next__()  # Guaranteed to be there by iter().
    while 1:
        try:
            _s = yield _y
        except BaseException as _e:
            # An exception was thrown in, either by a call to throw()
            # on the generator or implicitly by a call to close().
            _x = _e  # Save the thrown-in exception as current.
            if isinstance(_x, GeneratorExit):
                _m = None  # Don't forward GeneratorExit.
            else:
                # Forward any other exception if there is a throw() method.
                _m = getattr(_i, 'throw', None)
            if _m is None:
                # Not forwarding. Exit loop and go to finally clause
                # (possibly via "except StopIteration"), which will
                # close _i before reraising _x.
                raise
            _y = _m(_x)
        else:
            if _s is None:
                # Either a send(None) or a __next__(), forward as __next__().
                _x = None  # No current exception.
                _y = _i.__next__()  # Guaranteed to be there by iter().
            else:
                # A send(non-None). We need to handle the case where the
                # subiterator has no send() method.
                try:
                    _m = _i.send
                except AttributeError as _e:
                    # No send method. Ensure that the subiterator is
                    # closed, then reraise the AttributeError.
                    _x = _e   # Save the AttributeError as the current exception.
                    _m = None  # Clear _m so we know _x has not been forwarded.
                    # Exit loop and go to finally clause, which will
                    # close _i before reraising _x.
                    raise
                else:
                    _x = None  # No current exception.
                    _y = _m(_s)
except StopIteration as _e:
    if _e is _x:
        # If _e was just thrown in, reraise it. If the exception has been
        # forwarded to the subiterator, the subiterator is assumed closed.
        # In that case _m will be non-None, so the subiterator will not be
        # closed again by the finally clause. Conversely, if the exception
        # was not forwarded, _m will be None and the finally clause takes
        # care of closing it before reraising the exception.
        raise
    # Normal return. If we get here, the StopIteration was raised by a
    # __next__(), send() or throw() on the subiterator, which will
    # therefore already be closed. In this case either _x is None or _m
    # is not None, so the subiterator will not be closed again by the
    # finally clause.
    RESULT = _e.value
finally:
    if _x is not None and _m is None:
        # An exception is active and was not raised by the subiterator.
        # Explicitly call close before the exception is automatically
        # reraised by the finally clause. If close raises an exception,
        # that will take over.
        _m = getattr(_i, 'close', None)
        if _m is not None:
            _m()
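The missing-send() corner case that Jacob's expansion explicitly guards against can be observed with the `yield from` that eventually shipped in Python 3.3 (which took a different route than this expansion): forwarding a non-None send() to a plain iterator raises AttributeError, which unwinds (and thereby finalises) the delegating generator. A quick sketch:

```python
def delegator(it, events):
    try:
        yield from it
    finally:
        events.append("delegator finalized")

events = []
g = delegator(iter([1, 2, 3]), events)
next(g)                # start delegating; a list iterator has no send() method
try:
    g.send("hello")    # forwarding a non-None value requires _i.send
except AttributeError:
    events.append("AttributeError raised")

assert events == ["delegator finalized", "AttributeError raised"]
```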