From g.nius.ck at gmail.com  Mon Aug  1 02:36:05 2011
From: g.nius.ck at gmail.com (Christopher King)
Date: Sun, 31 Jul 2011 20:36:05 -0400
Subject: [Python-ideas] Re module repeat
Message-ID:

Dear Idealists,
    I notice that when you use the re module's grouping, it only tells
you what it matched last:

Dumb Real Python Code:

>>> import re
>>> match = re.search('^(?P<letter>[a-z])*$', 'abcz')
>>> match.groupdict()
{'letter': '0'}

What happened to all the other matches? Now here is a cool idea.

Cool Improved Python Code:

>>> import re
>>> match = re.search('^(?P<letter>[a-z])*$', 'abcz')
>>> match.groupdict()
{'number': '0'}
{'letter.0': 'a', 'letter.1': 'b', 'letter.2': 'c', 'letter.3': 'z'}

Now, we see all that it matched. Now the problem with this and all ideas
is reverse compatibility. So an addition is needed too.

>>> import re
>>> match = re.search('^(?PP<letter>[a-z])* and also (?PP=letter.0)(?PP=letter.-1)$', 'abcz and also az')
>>> match.groupdict()
{'letter.0': 'a', 'letter.1': 'b', 'letter.2': 'c', 'letter.3': 'z'}

Notice how I added an extra P. I also made it so that matching it in the
text is also more adaptable. Please consider this idea.

Sincerely,
    Me

From jeanpierreda at gmail.com  Mon Aug  1 02:41:45 2011
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Sun, 31 Jul 2011 20:41:45 -0400
Subject: [Python-ideas] Re module repeat
In-Reply-To:
References:
Message-ID:

Could you elaborate on the change? I don't understand your
modification. The regex is a different one than the original, as well.

I do agree that remembering all the groups would be nice, at least if
it could be done reasonably.

Devin

On Sun, Jul 31, 2011 at 8:36 PM, Christopher King wrote:
> Dear Idealists,
>     I notice that when you use the re module's grouping, it only tells
> you what it matched last:
> Dumb Real Python Code:
>>>> import re
>>>> match = re.search('^(?P<letter>[a-z])*$', 'abcz')
>>>> match.groupdict()
> {'letter': '0'}
> What happened to all the other matches? Now here is a cool idea.
> Cool Improved Python Code:
>>>> import re
>>>> match = re.search('^(?P<letter>[a-z])*$', 'abcz')
>>>> match.groupdict()
> {'number': '0'}
> {'letter.0': 'a', 'letter.1': 'b', 'letter.2': 'c', 'letter.3': 'z'}
>
> Now, we see all that it matched. Now the problem with this and all ideas
> is reverse compatibility. So an addition is needed too.
>>>> import re
>>>> match = re.search('^(?PP<letter>[a-z])* and also
>>>> (?PP=letter.0)(?PP=letter.-1)$', 'abcz and also az')
>>>> match.groupdict()
> {'letter.0': 'a', 'letter.1': 'b', 'letter.2': 'c', 'letter.3': 'z'}
> Notice how I added an extra P. I also made it so that matching it in the
> text is also more adaptable. Please consider this idea.
> Sincerely,
>     Me

From g.nius.ck at gmail.com  Mon Aug  1 02:56:44 2011
From: g.nius.ck at gmail.com (Christopher King)
Date: Sun, 31 Jul 2011 20:56:44 -0400
Subject: [Python-ideas] Re module repeat
In-Reply-To:
References:
Message-ID:

On Sun, Jul 31, 2011 at 8:41 PM, Devin Jeanpierre wrote:

> Could you elaborate on the change? I don't understand your
> modification. The regex is a different one than the original, as well.

What do you mean by elaborate on the change? You mean explain? I guess I
could do it in more detail. What would happen is if you do something like
    match = re.search('^(?PP<tag>[a-z])*$', 'abc')

Then match.groupdict() would return

    {'tag.0': 'a', 'tag.1': 'b', 'tag.2': 'c', 'tag.-1': 'c', 'tag.-2': 'b', 'tag.-3': 'a'}

Notice the PP. This means that it will save all the times it matches. It
does this by adding a decimal after the tag to show the index. It also
supports negative indexing in case you want the last time it matched. All
these can be used with the old (?P=tag.-2) syntax as well. Also, are
there any forbidden characters in a tag? That would be good to add so it
won't mess with current tags.

From tlesher at gmail.com  Mon Aug  1 03:42:51 2011
From: tlesher at gmail.com (Tim Lesher)
Date: Sun, 31 Jul 2011 21:42:51 -0400
Subject: [Python-ideas] Re module repeat
In-Reply-To:
References:
Message-ID:

On Jul 31, 2011 8:57 PM, "Christopher King" wrote:
> What would happen is if you do something like
>
> match = re.search('^(?PP<tag>[a-z])*$', 'abc')
> Then match.groupdict() would return
> {'tag.0': 'a', 'tag.1': 'b', 'tag.2': 'c', 'tag.-1': 'c', 'tag.-2': 'b', 'tag.-3': 'a'}
> notice the PP. This means that it will save all the times it matches.

If you want to return something that supports negative indexing, why not
return a list instead of an ad-hoc string representation?

From ncoghlan at gmail.com  Mon Aug  1 05:57:05 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 1 Aug 2011 13:57:05 +1000
Subject: [Python-ideas] Re module repeat
In-Reply-To:
References:
Message-ID:

On Mon, Aug 1, 2011 at 10:56 AM, Christopher King wrote:
>
> On Sun, Jul 31, 2011 at 8:41 PM, Devin Jeanpierre wrote:
>>
>> Could you elaborate on the change? I don't understand your
>> modification. The regex is a different one than the original, as well.
>
> What do you mean by elaborate on the change? You mean explain? I guess
> I could do it in more detail.

By elaborate on the change, I expect Devin meant a more accurate
description of the problem you're trying to solve without the confusing
and irrelevant noise about named groups. Specifically:

>>> match=re.search('^([a-z])*$', 'abcz')
>>> match.groups()
('z',)

You're asking for '*' and '+' to change the group numbers based on the
number of matches that actually occur. This is untenable, which should
become clear as soon as another group is placed after the looping
constructs:

>>> match=re.search('^([a-y])*(.*)$', 'abcz')
>>> match.groups()
('c', 'z')

Group names/numbers are assigned when the regex is compiled. They cannot
be affected by runtime information based on the string being processed.

The way to handle this (while still using the re module to do the
parsing) is multi-level parsing:

>>> match=re.search('^([a-z]*)$', 'abcz')
>>> relevant = match.group(0)
>>> pattern = re.compile('([a-z])')
>>> for match in pattern.finditer(relevant):
...     print(match.groups())
...
('a',)
('b',)
('c',)
('z',)

There's no reason to try to embed the functionality of finditer() into
the regex itself (and it's utterly impractical to do so anyway).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
From g.brandl at gmx.net  Mon Aug  1 07:12:38 2011
From: g.brandl at gmx.net (Georg Brandl)
Date: Mon, 01 Aug 2011 07:12:38 +0200
Subject: [Python-ideas] Re module repeat
In-Reply-To:
References:
Message-ID:

On 01.08.2011 02:36, Christopher King wrote:
> Dear Idealists,
>     I notice that when you use the re module's grouping, it only tells
> you what it matched last:
>
> Dumb Real Python Code:
>>>> import re
>>>> match = re.search('^(?P<letter>[a-z])*$', 'abcz')
>>>> match.groupdict()
> {'letter': '0'}
> What happened to all the other matches? Now here is a cool idea.
>
> Cool Improved Python Code:
>>>> import re
>>>> match = re.search('^(?P<letter>[a-z])*$', 'abcz')
>>>> match.groupdict()
> {'number': '0'}
> {'letter.0': 'a', 'letter.1': 'b', 'letter.2': 'c', 'letter.3': 'z'}

The "regex" module by Matthew Barnett already supports this:

https://code.google.com/p/mrab-regex-hg/

Georg

From greg.ewing at canterbury.ac.nz  Mon Aug  1 08:19:20 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 01 Aug 2011 18:19:20 +1200
Subject: [Python-ideas] Re module repeat
In-Reply-To:
References:
Message-ID: <4E364568.5030800@canterbury.ac.nz>

Tim Lesher wrote:
> On Jul 31, 2011 8:57 PM, "Christopher King" wrote:
>>
>> {'tag.0': 'a', 'tag.1': 'b', 'tag.2': 'c', 'tag.-1': 'c', 'tag.-2': 'b', 'tag.-3': 'a'}
>
> why not return a list instead of an ad-hoc string representation?

That's my thought, too. The proposed scheme looks very unpythonic.

--
Greg

From ncoghlan at gmail.com  Mon Aug  1 11:13:53 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 1 Aug 2011 19:13:53 +1000
Subject: [Python-ideas] Re module repeat
In-Reply-To:
References:
Message-ID:

On Mon, Aug 1, 2011 at 3:12 PM, Georg Brandl wrote:
> The "regex" module by Matthew Barnett already supports this:
>
> https://code.google.com/p/mrab-regex-hg/

The PyPI page is more helpful, since it has the docs:
http://pypi.python.org/pypi/regex (the relevant section is the
captures() API under "Repeated captures")

So clearly it sets up the additional storage under the hood when the
pattern is compiled.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
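For reference, the repeated-captures behaviour being discussed looks
roughly like this with the third-party regex module (a sketch based on
its documented captures() API; regex is not part of the stdlib, and this
session is illustrative rather than quoted from the thread):

    >>> import regex
    >>> m = regex.search(r'^(?P<letter>[a-z])*$', 'abcz')
    >>> m.group('letter')     # last repetition only, as with re
    'z'
    >>> m.captures('letter')  # every repetition is retained
    ['a', 'b', 'c', 'z']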
From ghostwriter402 at gmail.com  Mon Aug  1 18:05:53 2011
From: ghostwriter402 at gmail.com (Spectral One)
Date: Mon, 01 Aug 2011 11:05:53 -0500
Subject: [Python-ideas] Python-ideas Digest, Vol 56, Issue 63
In-Reply-To:
References:
Message-ID: <4E36CEE1.80401@gmail.com>

The only purposes of an enum that pop to my mind are:

- specifying allowed values, the better to limit input to valid choices;
- assigning mnemonic indices, the better to remember how to call
  something.

These can be independent of each other. When the actual allowed values
are based off of operations using the mnemonics (e.g. ORing the values,
which are all powers of two), the actual values are not limited to the
set, but rather to a linear combination of the set. That's significantly
different from a specific set of options (in which the meaning is based
off of the selection, and not the result; e.g. cardinal directions:
{Up, Down, Left, Right}).

In the case where the mnemonic is a single option from a set, it would
be good to be able to pull the index value back out from the value
passed. One could approximate this behavior by providing a constant set
inside the appropriate namespace, which contains tuples of the values
and their keys inside a dictionary (or whatever is more efficient; I'm
still new at Python, so I'm not sure how, if it's even possible, to
extract an index string from a dictionary). The value actually passed is
whatever the object stored in that tuple would be, which could be an
honest integer, if that mattered. Come to think of it, pulling out the
set of values that defined the entry as, for example, a list or
iterable, would be really useful sometimes, as well.

At any rate, what occurs to me is that enums are trying to fill a couple
of roles at once, and that seems to be what makes their 'proper'
behavior hard to define. A 2-way index pair structure handles most
everything I see enums providing when used as unordered values. That way
the recipient code can easily analyze the value entered, and UIs can
easily provide the mnemonic (whether authoring or debugging), which
assists a user of a library and that library's author in coordinating.

As to the composite case, well, the most common I've seen is a composite
of powers of two. That sort of thing could assign values to bits, and
may include multiple mnemonics for specific patterns. Again, not
terribly complex to think about, though someone would need to decide
whether composite or specific bit patterns dominate when they overlap,
if the structure is to spit the values back out. Patterns impossible to
generate should be rejected as such, since that's rather the point of
such a structure. The composite value should probably be spit out as an
integer, though whether a fixed size or Python's flexible-length
integer, signed or unsigned, and so forth are not options I've pondered.

As noted, the scope of these objects should work like pretty much any
array type, with no new scoping rules required.
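The "2-way index pair structure" described above could be prototyped
with a pair of plain dicts; this is only an illustrative sketch (the
class and member names are invented, not a proposal from the thread):

    class TwoWayEnum:
        """Map mnemonic names to values and values back to names."""
        def __init__(self, **members):
            self.value_of = dict(members)                      # name -> value
            self.name_of = {v: k for k, v in members.items()}  # value -> name

    Direction = TwoWayEnum(UP=0, DOWN=1, LEFT=2, RIGHT=3)
    print(Direction.value_of['LEFT'])  # 2
    print(Direction.name_of[2])        # 'LEFT'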
From ghostwriter402 at gmail.com  Mon Aug  1 20:58:13 2011
From: ghostwriter402 at gmail.com (Spectral One)
Date: Mon, 01 Aug 2011 13:58:13 -0500
Subject: [Python-ideas] Incomplete message
In-Reply-To:
References:
Message-ID: <4E36F745.3060302@gmail.com>

Crud.
Sorry about that last post. It went out while I was still editing it.

My apologies.

-Nate

From pyideas at rebertia.com  Mon Aug  1 22:59:52 2011
From: pyideas at rebertia.com (Chris Rebert)
Date: Mon, 1 Aug 2011 13:59:52 -0700
Subject: [Python-ideas] Incomplete message
In-Reply-To: <4E36F745.3060302@gmail.com>
References: <4E36F745.3060302@gmail.com>
Message-ID:

On Mon, Aug 1, 2011 at 11:58 AM, Spectral One wrote:
> Crud.
> Sorry about that last post. It went out while I was still editing it.
>
> My apologies.

Also, please avoid replying to the digest. It breaks threading. Reply to
the individual message instead. At least you didn't quote the entire
digest though; that's better than most digest replies.

Cheers,
Chris

From mmcduff at gmail.com  Wed Aug  3 03:11:28 2011
From: mmcduff at gmail.com (Mark McDuff)
Date: Tue, 2 Aug 2011 18:11:28 -0700
Subject: [Python-ideas] combine for/with statement
Message-ID:

I find that I am often writing code in the following pattern:

    foo = MyContextManager(*args)
    for bar in my_iter:
        with foo:
            # do stuff

I think it would be much cleaner to be able to write:

    for bar in my_iter with MyContextManager(*args):
        # do stuff

-Mark

From cmjohnson.mailinglist at gmail.com  Wed Aug  3 03:41:11 2011
From: cmjohnson.mailinglist at gmail.com (Carl Matthew Johnson)
Date: Tue, 2 Aug 2011 15:41:11 -1000
Subject: [Python-ideas] combine for/with statement
In-Reply-To:
References:
Message-ID:

On Aug 2, 2011, at 3:11 PM, Mark McDuff wrote:

> I find that I am often writing code in the following pattern:
>
>     foo = MyContextManager(*args)
>     for bar in my_iter:
>         with foo:
>             # do stuff
>
> I think it would be much cleaner to be able to write:
>
>     for bar in my_iter with MyContextManager(*args):
>         # do stuff

It's not clear to me whether that means

>     for bar in my_iter:
>         with MyContextManager(*args):
>             # do stuff

or

>     with MyContextManager(*args):
>         for bar in my_iter:
>             # do stuff

-0.

From ncoghlan at gmail.com  Wed Aug  3 03:56:52 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 3 Aug 2011 11:56:52 +1000
Subject: [Python-ideas] combine for/with statement
In-Reply-To:
References:
Message-ID:

On Wed, Aug 3, 2011 at 11:11 AM, Mark McDuff wrote:
> I find that I am often writing code in the following pattern:
>
> foo = MyContextManager(*args)
> for bar in my_iter:
>     with foo:
>         # do stuff
>
> I think it would be much cleaner to be able to write:
>
> for bar in my_iter with MyContextManager(*args):
>     # do stuff

I'm not sure why you think putting the context manager way over on the
right hand side is an improvement, but no, merging arbitrary statements
is never going to happen (even the comprehension-inspired merging of for
and if statements has been explicitly rejected many many times).

As Carl notes, the ambiguity of the proposed syntax is also not good -
it is unclear whether the context manager is recreated on each pass
around the loop, reused on each pass, or applied once to cover the
entire loop operation.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
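To spell out that ambiguity, the proposed one-liner could plausibly mean
any of the following in current Python (a sketch; cm_factory is just a
placeholder name for MyContextManager):

    # One manager applied once, covering the entire loop:
    with cm_factory():
        for bar in my_iter:
            pass  # do stuff

    # A fresh manager created on each pass around the loop:
    for bar in my_iter:
        with cm_factory():
            pass  # do stuff

    # One manager object, re-entered on each pass:
    cm = cm_factory()
    for bar in my_iter:
        with cm:
            pass  # do stuff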
From anacrolix at gmail.com  Wed Aug  3 05:53:11 2011
From: anacrolix at gmail.com (Matt Joiner)
Date: Wed, 3 Aug 2011 13:53:11 +1000
Subject: [Python-ideas] combine for/with statement
In-Reply-To:
References:
Message-ID:

-1

On Wed, Aug 3, 2011 at 11:56 AM, Nick Coghlan wrote:
> On Wed, Aug 3, 2011 at 11:11 AM, Mark McDuff wrote:
>> I find that I am often writing code in the following pattern:
>>
>> foo = MyContextManager(*args)
>> for bar in my_iter:
>>     with foo:
>>         # do stuff
>>
>> I think it would be much cleaner to be able to write:
>>
>> for bar in my_iter with MyContextManager(*args):
>>     # do stuff
>
> I'm not sure why you think putting the context manager way over on the
> right hand side is an improvement, but no, merging arbitrary
> statements is never going to happen (even the comprehension-inspired
> merging of for and if statements has been explicitly rejected many
> many times).
>
> As Carl notes, the ambiguity of the proposed syntax is also not good -
> it is unclear whether the context manager is recreated on each pass
> around the loop, reused on each pass, or applied once to cover the
> entire loop operation.
>
> Cheers,
> Nick.

From pyideas at rebertia.com  Wed Aug  3 06:32:01 2011
From: pyideas at rebertia.com (Chris Rebert)
Date: Tue, 2 Aug 2011 21:32:01 -0700
Subject: [Python-ideas] combine for/with statement
In-Reply-To:
References:
Message-ID:

On Tue, Aug 2, 2011 at 6:11 PM, Mark McDuff wrote:
> I find that I am often writing code in the following pattern:
>
> foo = MyContextManager(*args)
> for bar in my_iter:
>     with foo:
>         # do stuff
>
> I think it would be much cleaner to be able to write:
>
> for bar in my_iter with MyContextManager(*args):
>     # do stuff

Some have similarly suggested:

    for x in y if foo(x):
        # do stuff

Where would it end? What if someone wants:

    for bar in foo with context if baz:
        # stuff

? Even just the Cartesian product of all Python's control structures
with themselves quickly becomes unwieldy. Down this path lies Perl
(particularly its control-structures-as-statement-suffixes feature). The
simplicity and regularity gained is worth having to suffer 1 additional
level of indentation now and then; refactor your code if the number of
levels of indentation in it is becoming problematic.

Cheers,
Chris
--
http://rebertia.com

From bruce at leapyear.org  Wed Aug  3 07:10:49 2011
From: bruce at leapyear.org (Bruce Leban)
Date: Tue, 2 Aug 2011 22:10:49 -0700
Subject: [Python-ideas] combine for/with statement
In-Reply-To:
References:
Message-ID:

On Tue, Aug 2, 2011 at 6:11 PM, Mark McDuff wrote:

> I find that I am often writing code in the following pattern:
>
> foo = MyContextManager(*args)
> for bar in my_iter:
>     with foo:
>         # do stuff
>
> I think it would be much cleaner to be able to write:
>
> for bar in my_iter with MyContextManager(*args):
>     # do stuff

The parts of the for statement have *no connection at all* to the parts
of the with statement. They're just stuck together, which doesn't make
much sense to me. When I read the subject of the original mail I
immediately thought of this case:

    with open(foo) as _:
        for line in _:
            # stuff

which would at least make some sense if we could splice these together
as

    with line in open(foo):
        # stuff

But no matter how common this might be, I have to agree with:

On Tue, Aug 2, 2011 at 9:32 PM, Chris Rebert wrote:
>
> ... Down this path lies Perl ...

For every combination like this, there's another one just past it on the
road to Perl.

--- Bruce
Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com

From greg.ewing at canterbury.ac.nz  Wed Aug  3 10:11:25 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 03 Aug 2011 20:11:25 +1200
Subject: [Python-ideas] combine for/with statement
In-Reply-To:
References:
Message-ID: <4E3902AD.6090302@canterbury.ac.nz>

Chris Rebert wrote:

> Where would it end? What if someone wants:
> for bar in foo with context if baz:
>     # stuff

With a slight relaxation of the rules concerning statements on a single
line, one could write

    for bar in foo: with context: if baz:
        # stuff

Not that I'd really advocate that, but it might help to shut up the
people who keep requesting this sort of thing.

--
Greg
From julian at grayvines.com  Wed Aug  3 15:39:39 2011
From: julian at grayvines.com (Julian Berman)
Date: Wed, 3 Aug 2011 09:39:39 -0400
Subject: [Python-ideas] itertools.documentation.ncycles is a bit of a sore thumb
Message-ID: <60DCE8E0-2A0F-43D9-BC69-A38E2D576A26@grayvines.com>

The top of the recipe section in the itertools documentation says:

"The superior memory performance is kept by processing elements one at a
time rather than bringing the whole iterable into memory all at once."

but itertools.ncycles doesn't really. It's one line, so I'll inline it
here:

    def ncycles(iterable, n):
        return chain.from_iterable(repeat(tuple(iterable), n))

Somewhere along the line something was going to need to buffer that, but
it'd be nicer if the example didn't have the buffering done all at once
at the outset as it is now.

Perhaps more importantly, though, is there a specific objection to just
adding a count to itertools.cycle much like itertools.repeat has now for
this?

Thanks.

JB

From benjamin at python.org  Wed Aug  3 16:02:49 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Wed, 3 Aug 2011 14:02:49 +0000 (UTC)
Subject: [Python-ideas] itertools.documentation.ncycles is a bit of a sore thumb
References: <60DCE8E0-2A0F-43D9-BC69-A38E2D576A26@grayvines.com>
Message-ID:

Julian Berman writes:
>
> The top of the recipe section in the itertools documentation says:
>
> "The superior memory performance is kept by processing elements one at
> a time rather than bringing the whole iterable into memory all at
> once."
>
> but itertools.ncycles doesn't really. It's one line, so I'll inline it
> here:

In this case, I think it means that iterable*n is never held all in
memory.
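A lazier variant along the lines Julian is asking for is easy to write
as a plain generator; this is only a sketch (ncycles_lazy is an invented
name, not a documented recipe), buffering the elements as the first pass
is consumed rather than all at once up front:

    def ncycles_lazy(iterable, n):
        """Repeat the elements of iterable n times, filling the buffer
        only as the first pass is consumed."""
        if n <= 0:
            return
        buf = []
        for x in iterable:
            buf.append(x)
            yield x
        for _ in range(n - 1):
            for x in buf:
                yield x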
From ram at rachum.com  Wed Aug  3 16:09:26 2011
From: ram at rachum.com (Ram Rachum)
Date: Wed, 3 Aug 2011 10:09:26 -0400
Subject: [Python-ideas] Cookies insanity
Message-ID:

Hello folks,

About a week ago I was trying to do a tiny web project in Python. I
wanted to send a request to some website, get a cookie, and then send
that cookie on the next request. I could not believe how convoluted that
task has turned out to be.

Check out my StackOverflow question:
http://stackoverflow.com/questions/6878418/putting-a-cookie-in-a-cookiejar

Unless I'm missing something, this looks insane to me. Why is so much
code needed to do such a simple task? Why can't this be done with 3-4
lines of Python?

Ram.

From jeanpierreda at gmail.com  Wed Aug  3 16:30:15 2011
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Wed, 3 Aug 2011 10:30:15 -0400
Subject: [Python-ideas] Cookies insanity
In-Reply-To:
References:
Message-ID:

I understand your frustration, but this isn't the right place to get
help. Python-Ideas is for ideas intended to improve Python. Stack
Overflow was a good idea; there are also other mailing lists like
Python-Help. Also, there is an IRC channel, #python, on the Freenode
network (irc.freenode.org). I hope you can get an answer in one of these
places!

Devin Jeanpierre

On Wed, Aug 3, 2011 at 10:09 AM, Ram Rachum wrote:
> Hello folks,
> About a week ago I was trying to do a tiny web project in Python.
> I wanted to send a request to some website, get a cookie, and then send
> that cookie on the next request.
> I could not believe how convoluted that task has turned out to be.
> Check out my StackOverflow question:
> http://stackoverflow.com/questions/6878418/putting-a-cookie-in-a-cookiejar
> Unless I'm missing something, this looks insane to me. Why is so much
> code needed to do such a simple task? Why can't this be done with 3-4
> lines of Python?
>
> Ram.

From jimjjewett at gmail.com  Wed Aug  3 17:44:03 2011
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 3 Aug 2011 11:44:03 -0400
Subject: [Python-ideas] multiple intro statements [was: combine for/with statement]
Message-ID:

On Wed, Aug 3, 2011 at 4:11 AM, Greg Ewing wrote:
> With a slight relaxation of the rules concerning statements
> on a single line, one could write
>
>     for bar in foo: with context: if baz:
>         # stuff
>
> Not that I'd really advocate that, but it might help to
> shut up the people who keep requesting this sort of thing.

hmm... I actually sort of like that ...

    statement: statement: statement:
        suite

is equivalent to

    statement:
        statement:
            statement:
                suite

On its own, it is a pure win to say "This statement really only does one
thing; it is the subordinate statement that has a suite". That said, I'm
wondering if the colon is used for enough other things that it would end
up causing confusion in practice with slices or formatting or ...

-jJ

From ram at rachum.com  Wed Aug  3 17:48:10 2011
From: ram at rachum.com (Ram Rachum)
Date: Wed, 3 Aug 2011 11:48:10 -0400
Subject: [Python-ideas] Cookies insanity
In-Reply-To:
References:
Message-ID:

Devin, I'm not after help; I already had enough help doing the
convoluted things I need to do in order to handle cookies in Python. I
am talking on python-ideas because I think that it's a problem that
handling cookies is so technical in Python, and I think that Python
should provide simple cookie-handling modules.

On Wed, Aug 3, 2011 at 10:30 AM, Devin Jeanpierre wrote:

> I understand your frustration, but this isn't the right place to get
> help. Python-Ideas is for ideas intended to improve Python. Stack
> Overflow was a good idea; there are also other mailing lists like
> Python-Help. Also, there is an IRC channel, #python, on the Freenode
> network (irc.freenode.org). I hope you can get an answer in one of
> these places!
>
> Devin Jeanpierre
From raymond.hettinger at gmail.com  Wed Aug  3 17:54:25 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Wed, 3 Aug 2011 11:54:25 -0400
Subject: [Python-ideas] itertools.documentation.ncycles is a bit of a sore thumb
In-Reply-To: <60DCE8E0-2A0F-43D9-BC69-A38E2D576A26@grayvines.com>
References: <60DCE8E0-2A0F-43D9-BC69-A38E2D576A26@grayvines.com>
Message-ID: <01ABC3BE-DEAD-4CDE-8201-37BB418FAE41@gmail.com>

On Aug 3, 2011, at 9:39 AM, Julian Berman wrote:

> def ncycles(iterable, n):
>     return chain.from_iterable(repeat(tuple(iterable), n))
>
> Somewhere along the line something was going to need to buffer that,
> but it'd be nicer if the example didn't have the buffering done all at
> once at the outset as it is now.
>
> Perhaps more importantly, though, is there a specific objection to just
> adding a count to itertools.cycle much like itertools.repeat has now
> for this?

The optional argument has never been requested and I don't see ncycles()
being used enough in practice to warrant adding complexity to the API.
The recipe for ncycles() is included in the docs as a way of teaching
how itertools can be composed.

Raymond

See: http://www.google.com/codesearch#search/&q=ncycles%20lang:%5Epython

From dag.odenhall at gmail.com  Wed Aug  3 19:17:12 2011
From: dag.odenhall at gmail.com (dag.odenhall at gmail.com)
Date: Wed, 3 Aug 2011 19:17:12 +0200
Subject: [Python-ideas] anonymous object support
In-Reply-To:
References:
Message-ID:

> obj = object(foo=1, bar=lambda x: x)
> obj.foo
>>>> 1
> obj.bar(2)
>>>> 2

Problem? :)

>>> from argparse import Namespace
>>> obj = Namespace(foo=1, bar=lambda x: x)
>>> obj.foo
1
>>> obj.bar(2)
2

From ron3200 at gmail.com  Wed Aug  3 18:21:00 2011
From: ron3200 at gmail.com (ron3200)
Date: Wed, 03 Aug 2011 11:21:00 -0500
Subject: [Python-ideas] multiple intro statements [was: combine for/with statement]
In-Reply-To:
References:
Message-ID: <1312388460.7526.19.camel@Gutsy>

On Wed, 2011-08-03 at 11:44 -0400, Jim Jewett wrote:
> On Wed, Aug 3, 2011 at 4:11 AM, Greg Ewing wrote:
> > With a slight relaxation of the rules concerning statements
> > on a single line, one could write
> >
> >     for bar in foo: with context: if baz:
> >         # stuff
> >
> > Not that I'd really advocate that, but it might help to
> > shut up the people who keep requesting this sort of thing.
>
> hmm... I actually sort of like that ...
>
>     statement: statement: statement:
>         suite
>
> is equivalent to
>
>     statement:
>         statement:
>             statement:
>                 suite
>
> On its own, it is a pure win to say "This statement really only does
> one thing; it is the subordinate statement that has a suite". That
> said, I'm wondering if the colon is used for enough other things that
> it would end up causing confusion in practice with slices or
> formatting or ...
>
> -jJ

I think this is in the category of ... If it isn't broke, don't fix it.

-1

This tries to do two distinct things:

1. Put multiple statements on a single line.
2. Have them apply to a common block of code.

The main issues are with how those statements interact with the block.
Some statements are only done once, while others are meant to be done on
each iteration. So the order becomes something of importance, and it
also becomes something of a problem to parse mentally.

Cheers,
Ron
From jh at improva.dk  Wed Aug  3 20:41:09 2011
From: jh at improva.dk (Jacob Holm)
Date: Wed, 03 Aug 2011 20:41:09 +0200
Subject: [Python-ideas] itertools.documentation.ncycles is a bit of a sore thumb
In-Reply-To: <01ABC3BE-DEAD-4CDE-8201-37BB418FAE41@gmail.com>
References: <60DCE8E0-2A0F-43D9-BC69-A38E2D576A26@grayvines.com> <01ABC3BE-DEAD-4CDE-8201-37BB418FAE41@gmail.com>
Message-ID: <4E399645.9020909@improva.dk>

On 2011-08-03 17:54, Raymond Hettinger wrote:
> On Aug 3, 2011, at 9:39 AM, Julian Berman wrote:
>
>> def ncycles(iterable, n):
>>     return chain.from_iterable(repeat(tuple(iterable), n))
>>
>> Somewhere along the line something was going to need to buffer that,
>> but it'd be nicer if the example didn't have the buffering done all at
>> once at the outset as it is now.
>>
>> Perhaps more importantly, though, is there a specific objection to
>> just adding a count to itertools.cycle much like itertools.repeat has
>> now for this?
>
> The optional argument has never been requested and I don't see
> ncycles() being used enough in practice to warrant adding complexity to
> the API. The recipe for ncycles() is included in the docs as a way of
> teaching how itertools can be composed.

How about using this alternate recipe then:

    def ncycles(iterable, n):
        return chain.from_iterable(tee(iterable, n))

Or even:

    from copy import copy

    def ncycles(iterable, n):
        it, = tee(iterable, 1)
        return chain.from_iterable(copy(it) if i < n - 1 else it
                                   for i in range(n))

- Jacob

From janssen at parc.com  Wed Aug  3 22:45:29 2011
From: janssen at parc.com (Bill Janssen)
Date: Wed, 3 Aug 2011 13:45:29 PDT
Subject: [Python-ideas] Cookies insanity
In-Reply-To:
References:
Message-ID: <68715.1312404329@parc.com>

IMO, a lot of the stdlib Web modules are sadly out-of-date.

A promising project, httplib2, is working on some of these issues, and I
see that your note about cookies is one of their enhancement proposals,
http://code.google.com/p/httplib2/issues/detail?id=11. Unfortunately, no
one seems to be working on it. You might want to chime in and either add
support there, or at least mention your issue there.

Bill

From raymond.hettinger at gmail.com  Wed Aug  3 23:22:38 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Wed, 3 Aug 2011 17:22:38 -0400
Subject: [Python-ideas] itertools.documentation.ncycles is a bit of a sore thumb
In-Reply-To: <4E399645.9020909@improva.dk>
References: <60DCE8E0-2A0F-43D9-BC69-A38E2D576A26@grayvines.com> <01ABC3BE-DEAD-4CDE-8201-37BB418FAE41@gmail.com> <4E399645.9020909@improva.dk>
Message-ID: <063FB822-7C05-4BFD-B33B-0566D2AB63F3@gmail.com>

On Aug 3, 2011, at 2:41 PM, Jacob Holm wrote:

> How about using this alternate recipe then:
>
> def ncycles(iterable, n):
>     return chain.from_iterable(tee(iterable, n))

Really? Think about what that does for a large value of n.

Raymond
From steve at pearwood.info  Wed Aug  3 23:28:43 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 04 Aug 2011 07:28:43 +1000
Subject: [Python-ideas] Cookies insanity
In-Reply-To:
References:
Message-ID: <4E39BD8B.9030503@pearwood.info>

Ram Rachum wrote:
> Hello folks,
>
> About a week ago I was trying to do a tiny web project in Python.
>
> I wanted to send a request to some website, get a cookie, and then send
> that cookie on the next request.
>
> I could not believe how convoluted that task has turned out to be.
>
> Check out my StackOverflow question:
> http://stackoverflow.com/questions/6878418/putting-a-cookie-in-a-cookiejar
>
> Unless I'm missing something, this looks insane to me. Why is so much
> code needed to do such a simple task? Why can't this be done with 3-4
> lines of Python?

Do you have an actual idea to propose, or are you just looking for
sympathy?

--
Steven

From solipsis at pitrou.net  Wed Aug  3 23:41:35 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 3 Aug 2011 23:41:35 +0200
Subject: [Python-ideas] Cookies insanity
References: <68715.1312404329@parc.com>
Message-ID: <20110803234135.121679b4@pitrou.net>

On Wed, 3 Aug 2011 13:45:29 PDT Bill Janssen wrote:
> IMO, a lot of the stdlib Web modules are sadly out-of-date.
>
> A promising project, httplib2, is working on some of these issues, and
> I see that your note about cookies is one of their enhancement
> proposals, http://code.google.com/p/httplib2/issues/detail?id=11.
> Unfortunately, no one seems to be working on it. You might want to
> chime in and either add support there, or at least mention your issue
> there.

Well, why not suggest that these improvements be contributed to the
stdlib instead? That would make them accessible more immediately than
through a third-party lib.

Regards

Antoine.

From jh at improva.dk  Thu Aug  4 01:16:14 2011
From: jh at improva.dk (Jacob Holm)
Date: Thu, 04 Aug 2011 01:16:14 +0200
Subject: [Python-ideas] itertools.documentation.ncycles is a bit of a sore thumb
In-Reply-To: <063FB822-7C05-4BFD-B33B-0566D2AB63F3@gmail.com>
References: <60DCE8E0-2A0F-43D9-BC69-A38E2D576A26@grayvines.com> <01ABC3BE-DEAD-4CDE-8201-37BB418FAE41@gmail.com> <4E399645.9020909@improva.dk> <063FB822-7C05-4BFD-B33B-0566D2AB63F3@gmail.com>
Message-ID: <4E39D6BE.1050900@improva.dk>

On 2011-08-03 23:22, Raymond Hettinger wrote:
> On Aug 3, 2011, at 2:41 PM, Jacob Holm wrote:
>
>> How about using this alternate recipe then:
>>
>> def ncycles(iterable, n):
>>     return chain.from_iterable(tee(iterable, n))
>
> Really? Think about what that does for a large value of n.

It predictably creates an n-tuple of small objects (the cpython "tee"
implementation is really quite efficient), and a deque for storing the
values from the iterable as it gets them. For small values of n and
unknown (potentially large/expensive) iterables this is still an
improvement over the recipe in the documentation if you stop iterating
before reaching the end of the first repetition. The other recipe I
mentioned was specifically designed to handle large n as well.

Both my recipes have the nice property requested by the OP that they
don't compute values until they are needed. This is relevant if the
values are expensive to compute and there is a chance that you might not
need them all. In other words, exactly the case where you are most
likely to want to use itertools.

For iterables with say 50 items or less that are cheap to compute,
and/or where you know that you will exhaust the iterable anyway, and
where you don't mind "looking ahead", the current recipe in the
itertools documentation is just about perfect.

- Jacob

From ben+python at benfinney.id.au  Thu Aug  4 01:44:29 2011
From: ben+python at benfinney.id.au (Ben Finney)
Date: Thu, 04 Aug 2011 09:44:29 +1000
Subject: [Python-ideas] Cookies insanity
References: <68715.1312404329@parc.com> <20110803234135.121679b4@pitrou.net>
Message-ID: <87k4aujc0y.fsf@benfinney.id.au>

Antoine Pitrou writes:

> Bill Janssen wrote:
> > A promising project, httplib2, is working on some of these issues,
[...]
>
> Well, why not suggest that these improvements be contributed to the
> stdlib instead?

What do you mean by "instead"? A common path for inclusion in the
standard library is to first prove the code is viable as a third-party
implementation.

Are you suggesting that should be circumvented in this case? Why?

--
 \        "A thing moderately good is not so good as it ought to be.
  `\      Moderation in temper is always a virtue; but moderation in
_o__)     principle is always a vice." --Thomas Paine
Ben Finney
From solipsis at pitrou.net  Thu Aug  4 02:09:38 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 4 Aug 2011 02:09:38 +0200
Subject: [Python-ideas] Cookies insanity
References: <68715.1312404329@parc.com> <20110803234135.121679b4@pitrou.net> <87k4aujc0y.fsf@benfinney.id.au>
Message-ID: <20110804020938.1a37e737@pitrou.net>

On Thu, 04 Aug 2011 09:44:29 +1000 Ben Finney wrote:
> Antoine Pitrou writes:
>
> > Bill Janssen wrote:
> > > A promising project, httplib2, is working on some of these issues,
> [...]
>
> > Well, why not suggest that these improvements be contributed to the
> > stdlib instead?
>
> What do you mean by "instead"? A common path for inclusion in the
> standard library is to first prove the code is viable as a third-party
> implementation.

There's some confusion here. If you look at the NEWS file there are many
features added without going through such a "common path" first. The
"publish it outside first" rule applies mostly to whole modules, or
controversial additions. Not to incremental improvements.

Regards

Antoine.

From stephen at xemacs.org  Thu Aug  4 05:34:40 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 04 Aug 2011 12:34:40 +0900
Subject: [Python-ideas] multiple intro statements [was: combine for/with statement]
In-Reply-To: <1312388460.7526.19.camel@Gutsy>
References: <1312388460.7526.19.camel@Gutsy>
Message-ID: <87pqklhmsv.fsf@uwakimon.sk.tsukuba.ac.jp>

ron3200 writes:

> This tries to do two distinct things:
> 1. Put multiple statements on a single line.
> 2. Have them apply to a common block of code.

No. Each statement applies to the block following it, which starts on
the same line. There's no common block involved, rather several strictly
nested blocks.

> So the order becomes something of importance,

Sure, just like short-circuited logical expressions. However, it is
unambiguously expressed.

> and it also becomes something of a problem to parse mentally.

Agreed (and Greg at least agrees, too).

From herman at swebpage.com  Thu Aug  4 05:49:49 2011
From: herman at swebpage.com (Herman Sheremetyev)
Date: Thu, 4 Aug 2011 12:49:49 +0900
Subject: [Python-ideas] anonymous object support
In-Reply-To:
References:
Message-ID:

On Thu, Aug 4, 2011 at 2:17 AM, dag.odenhall at gmail.com wrote:
>> obj = object(foo=1, bar=lambda x: x)
>> obj.foo
>>>>> 1
>> obj.bar(2)
>>>>> 2
>
> Problem? :)
>
>>>> from argparse import Namespace
>>>> obj = Namespace(foo=1, bar=lambda x: x)
>>>> obj.foo
> 1
>>>> obj.bar(2)
> 2

Good find, who'd have ever thought to look in argparse for something
like Namespace.. After some more poking around in the stdlib for
__init__ methods that take keyword args I found:

    plistlib.Dict and plistlib.Plist

Both of them do the same thing as well as give access to the values
using [] notation. There are probably other examples in third party
libraries.

-Herman
From jeanpierreda at gmail.com  Thu Aug  4 06:39:43 2011
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Thu, 4 Aug 2011 00:39:43 -0400
Subject: [Python-ideas] anonymous object support
In-Reply-To:
References:
Message-ID:

On Wed, Aug 3, 2011 at 11:49 PM, Herman Sheremetyev wrote:
> Both of them do the same thing as well as give access to the values
> using [] notation. There are probably other examples in third party
> libraries.

Sure. Another stdlib-ish example is Sphinx, which does it in
sphinx.util.attrdict. And you can search google code for
'self.__dict__ = self', which reveals a particular(ly nasty) pattern for
it with many instances.

Devin

From greg.ewing at canterbury.ac.nz  Thu Aug  4 13:21:26 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 04 Aug 2011 23:21:26 +1200
Subject: [Python-ideas] Cookies insanity
In-Reply-To:
References:
Message-ID: <4E3A80B6.9040908@canterbury.ac.nz>

Ram Rachum wrote:

> I am talking on python-ideas because I think that it's a problem that
> handling cookies is so technical in Python and I think that Python
> should provide simple cookie-handling modules.

Do you have any suggestions as to what these cookie handling facilities
might look like?

Perhaps you could give us an example of the 3-4 lines you would like to
have been able to write for this task.

--
Greg

From jimjjewett at gmail.com  Thu Aug  4 15:18:46 2011
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 4 Aug 2011 09:18:46 -0400
Subject: [Python-ideas] Cookies insanity
In-Reply-To: <4E3A80B6.9040908@canterbury.ac.nz>
References: <4E3A80B6.9040908@canterbury.ac.nz>
Message-ID:

On Thu, Aug 4, 2011 at 7:21 AM, Greg Ewing wrote:
> Ram Rachum wrote:
>>
>> I am talking on python-ideas because I think that it's a problem that
>> handling cookies is so technical in Python and I think that Python
>> should provide simple cookie-handling modules.
>
> Do you have any suggestions as to what these cookie
> handling facilities might look like?
>
> Perhaps you could give us an example of the 3-4 lines
> you would like to have been able to write for this
> task.

Intentionally ignoring the current API, so as to avoid getting bogged
down in details...

    from httpclient import fetcher
    client = fetcher()  # support for cookies should be the default
    response1 = client.fetch(url1)
    response2 = client.fetch(url2)

-jJ
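For comparison, something close to that sketch can already be spelled
with the stdlib, though it is not the default behaviour; in Python 3
terms it looks roughly like this (url1 and url2 are placeholders):

    import http.cookiejar
    import urllib.request

    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    response1 = opener.open(url1)  # Set-Cookie headers are stored in jar
    response2 = opener.open(url2)  # ...and replayed automatically here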
From senthil at uthcode.com  Thu Aug  4 15:52:04 2011
From: senthil at uthcode.com (Senthil Kumaran)
Date: Thu, 4 Aug 2011 21:52:04 +0800
Subject: [Python-ideas] Cookies insanity
In-Reply-To:
References: <4E3A80B6.9040908@canterbury.ac.nz>
Message-ID: <20110804135204.GC2354@mathmagic>

On Thu, Aug 04, 2011 at 09:18:46AM -0400, Jim Jewett wrote:
> Intentionally ignoring the current API, so as to avoid getting bogged
> down in details...
>
> from httpclient import fetcher
> client = fetcher()  # support for cookies should be the default
> response1 = client.fetch(url1)
> response2 = client.fetch(url2)

That's a good feature request for the current module too. It would be a
good idea to explore the positives and negatives of this before deciding
upon. One thing to keep in mind is that the http and url handling
modules in stdlib are libraries, which provide a lot of facilities with
reasonable defaults.

--
Senthil

From dag.odenhall at gmail.com  Thu Aug  4 15:54:00 2011
From: dag.odenhall at gmail.com (dag.odenhall at gmail.com)
Date: Thu, 4 Aug 2011 15:54:00 +0200
Subject: [Python-ideas] anonymous object support
In-Reply-To:
References:
Message-ID:

On 4 August 2011 06:39, Devin Jeanpierre wrote:
> Sure. Another stdlib-ish example is Sphinx, which does it in
> sphinx.util.attrdict. And you can search google code for
> 'self.__dict__ = self', which reveals a particular(ly nasty) pattern
> for it with many instances.

I heard that particular pattern leaks memory?

From jeanpierreda at gmail.com  Thu Aug  4 16:05:13 2011
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Thu, 4 Aug 2011 10:05:13 -0400
Subject: [Python-ideas] anonymous object support
In-Reply-To:
References:
Message-ID:

> I heard that particular pattern leaks memory?

I don't believe it does anymore, because of the cyclic garbage
collector.

Devin
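For reference, the 'self.__dict__ = self' pattern under discussion is
the classic attrdict trick; the assignment makes the instance its own
attribute dictionary, which is exactly the reference cycle the cyclic
collector reclaims (a sketch; AttrDict is just an illustrative name):

    class AttrDict(dict):
        """Dict whose keys are readable and writable as attributes."""
        def __init__(self, *args, **kwargs):
            dict.__init__(self, *args, **kwargs)
            self.__dict__ = self  # the instance is its own __dict__: a cycle

    d = AttrDict(foo=1)
    print(d.foo)     # 1
    d.bar = 2
    print(d['bar'])  # 2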
From menno at freshfoo.com  Thu Aug  4 23:35:47 2011
From: menno at freshfoo.com (Menno Smits)
Date: Thu, 04 Aug 2011 22:35:47 +0100
Subject: [Python-ideas] New imaplib implementation for Python 3.2+ standard library
In-Reply-To:
References:
Message-ID: <4E3B10B3.5070302@freshfoo.com>

On 28/07/11 02:35, Maxim Khitrov wrote:
>> It requires too much effort on behalf of the caller. Your example.py
>> highlights how datetimes are returned as strings that need to be
>> converted to real datetimes and FETCH response keys need to be
>> uppercased to ensure consistency. The need to jump through the same
>> non-trivial hoops each time I used imaplib was one of the frustrations
>> that led to the creation of IMAPClient. Please consider having imaplib2
>> do a little more work so that every user doesn't have to.
>
> Part of this will be addressed by the higher-level interface that I'm
> currently working on. As for imaplib2, there are two reasons why I
> decided not to do any sort of automatic normalization of the responses
> (with the exception of CAPABILITY): ...
>
> So basically, I think that in a low-level library such as this, it
> should be the caller's decision whether an INTERNALDATE value is
> converted to Unix time (or some other format), or if the FETCH response
> keys are changed to upper case. I'm happy to provide additional utility
> functions for such conversions, but trying to handle these things
> automatically could be a source of many additional bugs. Think about
> the separation between zlib and gzip, or binascii and base64 modules.
> My library is the low-level interface and I'm working on something that
> will be easier to use at the cost of some control.

Fair enough. If you're planning a higher-level interface and helper
functions that mean less repeated work for each user of imaplib2, then
that's great.

>> Similarly, UID support could be better. IMAPClient has a boolean
>> attribute which lets you select whether you want UIDs to be
>> transparently used for future commands. Having to specify whether you
>> want UID support enabled on each call is a little clumsy. It's unlikely
>> that a user of imaplib2 would want to toggle between using UIDs and not
>> on every call.
>
> I have to disagree with you here. The application that I wrote this
> library for does depend on the ability to run UID and regular FETCH
> commands in the same connection. I was actually very surprised to see
> that IMAPClient requires you pick one or the other at creation time.

That's not quite right. UID selection can be set at creation time but
can also be changed at any point by using the use_uid attribute.

> In some applications you may need to discover and use the relationships
> between SNs and UIDs, or use a command like UID EXPUNGE (from the
> UIDPLUS extension) and a regular EXPUNGE in the same session. I think
> that you do have to let the user make this decision on a per-command
> basis.

I think that having to pass the flag on each call is a little awkward,
but that's a minor issue really. Maybe you could allow the user to
specify a default value to use if it's not specified for a given
command?

> My library does need more testing. Although I tried to follow the
> robustness principle (be conservative in what you send; be liberal in
> what you accept) when writing the command generator and response
> parser, there probably are some bugs remaining, but hopefully not many.
>
> Which IMAP servers do you test against and how did you go about getting
> the test accounts?

I regularly test against: Gmail, Fastmail.fm (a Cyrus variant), vanilla
Cyrus, Dovecot, Courier and MS Exchange. Gmail and Fastmail have free
accounts and I run the other test servers myself, except for the
Exchange server which is at my employer.

>> [1] - Are you aware there's already another project with the same
>> name? http://www.janeelix.com/piers/python/imaplib2.html
>
> Hmm... I probably should have tried searching before using that name.
> I'm happy to go with something else, since my library is not in
> wide-spread use right now. Would suggesting imaplib3 for stdlib be a
> bit confusing? :/

Possibly! I'm not sure of a better name though. IMAPClient was the best
I could come up with, and that conflicts with a Perl package of the same
name and functionality :)

All the best,
Menno

From ubershmekel at gmail.com  Fri Aug  5 10:36:49 2011
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Fri, 5 Aug 2011 11:36:49 +0300
Subject: [Python-ideas] shelve and sqlite
Message-ID:

Hi Pydeas,

I encountered exceptions like

  File "c:\python27\lib\site-packages\filecache-0.67-py2.7.egg\filecache\__init__.py", line 82, in function_with_cache
    rv = function.__db[key]
  File "c:\python27\lib\shelve.py", line 122, in __getitem__
    value = Unpickler(f).load()
cPickle.UnpicklingError: could not find MARK

with my filecache module, which I have a feeling are related to database
failures upon abrupt termination of the process.

If I write the class SqliteShelf, would it be of use to anyone? Would it
be a viable patch for shelve.py?

Note that I don't mean to affect the shelve.py "open" function, just to
add a reliable alternative class.

http://docs.python.org/dev/library/shelve.html

Cheers,

--Yuval Greenfield
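A minimal sqlite3-backed shelf along these lines might look like the
sketch below; everything here (the class name, the schema, committing
each write as its own transaction for crash safety) is illustrative
guesswork rather than a worked-out patch:

    import pickle
    import sqlite3
    from collections.abc import MutableMapping

    class SqliteShelf(MutableMapping):
        """A shelve-like persistent mapping backed by sqlite3."""

        def __init__(self, filename):
            self._db = sqlite3.connect(filename)
            self._db.execute(
                "CREATE TABLE IF NOT EXISTS shelf"
                " (key TEXT PRIMARY KEY, value BLOB)")

        def __getitem__(self, key):
            row = self._db.execute(
                "SELECT value FROM shelf WHERE key = ?", (key,)).fetchone()
            if row is None:
                raise KeyError(key)
            return pickle.loads(row[0])

        def __setitem__(self, key, value):
            with self._db:  # one transaction per write -> survives crashes
                self._db.execute(
                    "REPLACE INTO shelf (key, value) VALUES (?, ?)",
                    (key, pickle.dumps(value)))

        def __delitem__(self, key):
            with self._db:
                cur = self._db.execute(
                    "DELETE FROM shelf WHERE key = ?", (key,))
            if cur.rowcount == 0:
                raise KeyError(key)

        def __iter__(self):
            for (key,) in self._db.execute("SELECT key FROM shelf"):
                yield key

        def __len__(self):
            return self._db.execute(
                "SELECT COUNT(*) FROM shelf").fetchone()[0]

        def close(self):
            self._db.close()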
From phd at phdru.name  Fri Aug  5 12:47:59 2011
From: phd at phdru.name (Oleg Broytman)
Date: Fri, 5 Aug 2011 14:47:59 +0400
Subject: [Python-ideas] shelve and sqlite
In-Reply-To:
References:
Message-ID: <20110805104759.GB18565@iskra.aviel.ru>

On Fri, Aug 05, 2011 at 11:36:49AM +0300, Yuval Greenfield wrote:
> If I write the class SqliteShelf, would it be of use to anyone? Would
> it be a viable patch for shelve.py?
>
> Note that I don't mean to affect the shelve.py "open" function, just to
> add a reliable alternative class.

Something like this: http://bugs.python.org/issue3783 ?

Oleg.
--
Oleg Broytman    http://phdru.name/    phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From dag.odenhall at gmail.com  Fri Aug  5 22:00:23 2011
From: dag.odenhall at gmail.com (dag.odenhall at gmail.com)
Date: Fri, 5 Aug 2011 22:00:23 +0200
Subject: [Python-ideas] anonymous object support
In-Reply-To:
References:
Message-ID:

2011/8/4 Matej Lieskovský:
> Um, I'm kinda new round here but here goes:
> If I understood the problem, we are looking for a way of creating an
> object which has no specific class.
> My proposal:
> perhaps a "namespace" statement would do, behaving somewhat like this:
>
> namespace MyObject:
>     statements
>
> is equivalent to:
>
> class MyClass(object):
>     statements
> MyObject = MyClass()
> MyClass = None
>
> I chose "namespace" as that's what I think we are trying to create.
> Constructive criticism is welcome.

Not useful enough to warrant the introduction of a new keyword, IMO.
There's some overlap here with the proposed 'given' keyword, and as you
noted yourself it can already be done with the class construct. You
could use a metaclass/base class or a class decorator to disable
instantiation, if you really feel like it. Also note that you can read
attributes directly on a class.

Finally, you'll want to use the 'del' keyword on your last line. This:

    del MyClass

removes the reference to the class in the namespace, but doesn't
actually delete the class itself (until garbage collection and when
there's no remaining references - and MyObject is still "referencing"
it here).

From mikegraham at gmail.com  Fri Aug  5 22:49:55 2011
From: mikegraham at gmail.com (Mike Graham)
Date: Fri, 5 Aug 2011 16:49:55 -0400
Subject: [Python-ideas] anonymous object support
In-Reply-To:
References:
Message-ID:

2011/8/5 dag.odenhall at gmail.com:
> 2011/8/4 Matej Lieskovský:
>> ...
>> perhaps a "namespace" statement would do, behaving somewhat like this:
>>
>> namespace MyObject:
>>     statements
>>
>> is equivalent to:
>>
>> class MyClass(object):
>>     statements
>> MyObject = MyClass()
>> MyClass = None
>> ...

As a somewhat off-topic remark, this could already be accomplished with
metaclasses. We could write

    def namespace(name, bases, d):
        return type(name, bases, d)()

trivially, then use it like

    class MyObject(metaclass=namespace):
        statements

which is really quite slick conceptually. Syntactically, this is ugly
and the terminology we use seems to confuse people. If Python was to
introduce new syntax for this, it would ideally solve the broader
problem here.

Mike
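Concretely, the metaclass trick above yields a ready-made instance whose
attributes come straight from the class body (a small usage sketch;
'config' and its attributes are arbitrary example names):

    def namespace(name, bases, d):
        return type(name, bases, d)()

    class config(metaclass=namespace):
        host = "localhost"
        port = 8080

    print(config.host)   # 'localhost' -- config is an instance...
    print(type(config))  # <class 'config'> -- ...of the class built here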
From ubershmekel at gmail.com  Fri Aug  5 23:48:15 2011
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Sat, 6 Aug 2011 00:48:15 +0300
Subject: [Python-ideas] shelve and sqlite
In-Reply-To: <20110805104759.GB18565@iskra.aviel.ru>
References: <20110805104759.GB18565@iskra.aviel.ru>
Message-ID:

Actually I was originally thinking of just directly interacting with
sqlite3 in shelve.py, but if you think this is the better approach (it
does make sense to just use the dbm api), I'm willing to try and write
up the patch.

Some more shelve stack traces follow:

Traceback (most recent call last):
  File "import_chess.py", line 282, in <module>
    get_all_players()
  File "import_chess.py", line 182, in get_all_players
    get_player_tourneys(first, last)
  File "import_chess.py", line 172, in get_player_tourneys
    get_tournament(tournament_id)
  File "import_chess.py", line 139, in get_tournament
    tournament_table = get_tournament_by_type(tournament_id, 3)
  File "import_chess.py", line 90, in get_tournament_by_type
    tournament_html = get(url).decode('iso-8859-1')
  File "c:\python27\lib\site-packages\filecache-0.67-py2.7.egg\filecache\__init__.py", line 82, in function_with_cache
    rv = function.__db[key]
  File "c:\python27\lib\shelve.py", line 122, in __getitem__
    value = Unpickler(f).load()
cPickle.UnpicklingError: invalid load key, 'x'.

Traceback (most recent call last):
  File "import_chess.py", line 286, in <module>
    get_all_players()
  File "import_chess.py", line 186, in get_all_players
    get_player_tourneys(first, last)
  File "import_chess.py", line 174, in get_player_tourneys
    get_tournament(tournament_id)
  File "import_chess.py", line 137, in get_tournament
    tournament_table = get_tournament_by_type(tournament_id, 2)
  File "import_chess.py", line 90, in get_tournament_by_type
    tournament_html = get(url).decode('iso-8859-1')
  File "c:\python27\lib\site-packages\filecache-0.67-py2.7.egg\filecache\__init__.py", line 91, in function_with_cache
    function.__db[key] = __retval(_time.time(), _pickle.dumps(retval))
  File "c:\python27\lib\shelve.py", line 133, in __setitem__
    self.dict[key] = f.getvalue()
  File "c:\python27\lib\bsddb\__init__.py", line 279, in __setitem__
    _DeadlockWrap(wrapF)  # self.db[key] = value
  File "c:\python27\lib\bsddb\dbutils.py", line 68, in DeadlockWrap
    return function(*_args, **_kwargs)
  File "c:\python27\lib\bsddb\__init__.py", line 278, in wrapF
    self.db[key] = value
bsddb.db.DBRunRecoveryError: (-30974, 'DB_RUNRECOVERY: Fatal error, run
database recovery -- PANIC: Invalid argument')

--Yuval

From cmjohnson.mailinglist at gmail.com  Sat Aug  6 08:37:47 2011
From: cmjohnson.mailinglist at gmail.com (Carl Matthew Johnson)
Date: Fri, 5 Aug 2011 20:37:47 -1000
Subject: [Python-ideas] anonymous object support
In-Reply-To:
References:
Message-ID: <64FD0873-5ACB-4BD8-AB90-43EC9DE11215@gmail.com>

On Aug 5, 2011, at 10:49 AM, Mike Graham wrote:

> As a somewhat off-topic remark, this could already be accomplished
> with metaclasses.

I think a new keyword (possibly but not necessarily namespace) might be
good as a way of differentiating "real" classes from metaclass hacks.
For example, some ORMs work using a "declarational" syntax with the
class statement. That's fine, except that it looks like you can use the
resulting objects like classes, when really there's a lot of meta-glue
behind the scenes that means things will break if you try to use
inheritance or other features of "real" classes.

As another example, using metaclasses you can change namedtuple's syntax
from the somewhat clunky

    Point = namedtuple('Point', 'x y')

to the more elegant but confusing since it uses "class"

    class Point(NamedTuple):
        x
        y

I sort of like this syntax, but I think it would be crazy to use in
production, because who besides the original creator of the hack would
guess that class NamedTuple has a metaclass which uses the __prepare__
method to record the fields accessed during class creation? But with a
namespace keyword you signal to the reader of the code: "Here be
meta-dragons! Proceed with caution."

The counterargument can be made, do we really want to encourage people
to fiddle around with metaclasses any more than they already do? But my
response to that is that people are already making ORMs and whatnot
using the class keyword, so why not pave the cowpath? Also we could
throw in some built-in tools to make the metaclasses less confusing to
work with.

Also, hey, free switch statement ;-) :

    key = random.choice(["case1", "case2"])

    namespace dispatch(dict):
        def case1(): print("One")
        def case2(): print("Two")

    dispatch[key]()

-- Carl

From ericsnowcurrently at gmail.com  Sat Aug  6 08:53:43 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Sat, 6 Aug 2011 00:53:43 -0600
Subject: [Python-ideas] anonymous object support
In-Reply-To: <64FD0873-5ACB-4BD8-AB90-43EC9DE11215@gmail.com>
References: <64FD0873-5ACB-4BD8-AB90-43EC9DE11215@gmail.com>
Message-ID:

On Sat, Aug 6, 2011 at 12:37 AM, Carl Matthew Johnson wrote:
> As another example, using metaclasses you can change namedtuple's
> syntax from the somewhat clunky
>
> Point = namedtuple('Point', 'x y')
>
> to the more elegant but confusing since it uses "class"
>
> class Point(NamedTuple):
>     x
>     y

You don't even need to use __prepare__() if you make it look like this:

    @as_namedtuple
    class Point:
        x = ...
        y = ...

Since Ellipsis became valid as a normal-use object this works. And that
decorator is a snap to write.

-eric

From aquavitae69 at gmail.com  Sat Aug  6 10:10:06 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Sat, 6 Aug 2011 10:10:06 +0200
Subject: [Python-ideas] Access to function objects
Message-ID:

Has anyone else ever thought that it might be useful to access a
function object from within the call? I've come across this situation a
few times recently, and thought it would be very useful to be able to do
something like, for example:

    def a_function() as func:
        print(func.__doc__)

It would be useful, for example, to record state between calls, or if
the function wants to reuse its own properties (like in the above
example). Consider a function which should print the number of times it
has been called on every call:

    def counter(add) as func:
        if not hasattr(func, 'count'):
            func.count = 0
        func.count += 1
        print(func.count)

This could also be implemented using classes, i.e.

    class Counter:
        def __call__(self, add):
            if not hasattr(self, 'count'):
                self.count = 0
            self.count += 1
            print(self.count)

    counter = Counter()

But this is much more clumsy, results in an extra object (the Counter
class) and will be quite complicated if counter is a method rather than
a function.

The reason I've used "as" syntax is that it is consistent with other
python statements (e.g. "with" and "except"), wouldn't require a new
keyword and is backwardly compatible.

Any thoughts?

David

From pyideas at rebertia.com  Sat Aug  6 10:16:05 2011
From: pyideas at rebertia.com (Chris Rebert)
Date: Sat, 6 Aug 2011 01:16:05 -0700
Subject: [Python-ideas] Access to function objects
In-Reply-To:
References:
Message-ID:

On Sat, Aug 6, 2011 at 1:10 AM, David Townshend wrote:
> Has anyone else ever thought that it might be useful to access a
> function object from within the call?

Yes:
http://mail.python.org/pipermail/python-list/2009-May/1203977.html

Cheers,
Chris

From pyideas at rebertia.com  Sat Aug  6 10:19:58 2011
From: pyideas at rebertia.com (Chris Rebert)
Date: Sat, 6 Aug 2011 01:19:58 -0700
Subject: [Python-ideas] Access to function objects
In-Reply-To:
References:
Message-ID:

On Sat, Aug 6, 2011 at 1:16 AM, Chris Rebert wrote:
> On Sat, Aug 6, 2011 at 1:10 AM, David Townshend wrote:
>> Has anyone else ever thought that it might be useful to access a
>> function object from within the call?
>
> Yes:

And also:
[Rejected] PEP 3130: Access to Current Module/Class/Function
http://www.python.org/dev/peps/pep-3130/

Cheers,
Chris

From solipsis at pitrou.net  Sat Aug  6 14:13:41 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 6 Aug 2011 14:13:41 +0200
Subject: [Python-ideas] Access to function objects
References:
Message-ID: <20110806141341.711b184e@pitrou.net>

On Sat, 6 Aug 2011 01:19:58 -0700 Chris Rebert wrote:
> On Sat, Aug 6, 2011 at 1:16 AM, Chris Rebert wrote:
> > On Sat, Aug 6, 2011 at 1:10 AM, David Townshend wrote:
> >> Has anyone else ever thought that it might be useful to access a
> >> function object from within the call?
> >
> > Yes:
>
> And also:
> [Rejected] PEP 3130: Access to Current Module/Class/Function
> http://www.python.org/dev/peps/pep-3130/

The new magic super() uses a similar, hidden, compiler-activated hack to
work properly:

>>> class C:
...     def f(): super
...
That's fine, except that it looks like you can use the resulting objects
like classes, when really there's a lot of meta-glue behind the scenes that
means things will break if you try to use inheritance or other features of
"real" classes.

As another example, using metaclasses you can change namedtuple's syntax
from the somewhat clunky

Point = namedtuple('Point', 'x y')

to the more elegant but confusing since it uses "class"

class Point(NamedTuple):
    x
    y

I sort of like this syntax, but I think it would be crazy to use in
production, because who besides the original creator of the hack would guess
that class NamedTuple has a metaclass which uses the __prepare__ method to
record the fields accessed during class creation? But with a namespace
keyword you signal to the reader of the code: "Here be meta-dragons! Proceed
with caution." The counterargument can be made, do we really want to
encourage people to fiddle around with metaclasses any more than they
already do? But my response to that is that people are already making ORMs
and whatnot using the class keyword, so why not pave the cowpath? Also we
could throw in some built-in tools to make the metaclasses less confusing to
work with.

Also, hey free switch statement ;-) :

key = random.choice(["case1", "case2"])
namespace dispatch(dict):
    def case1(): print("One")
    def case2(): print("Two")

dispatch[key]()

-- Carl

From ericsnowcurrently at gmail.com  Sat Aug  6 08:53:43 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Sat, 6 Aug 2011 00:53:43 -0600
Subject: [Python-ideas] anonymous object support
In-Reply-To: <64FD0873-5ACB-4BD8-AB90-43EC9DE11215@gmail.com>
References: <64FD0873-5ACB-4BD8-AB90-43EC9DE11215@gmail.com>
Message-ID:

On Sat, Aug 6, 2011 at 12:37 AM, Carl Matthew Johnson wrote:
> As another example, using metaclasses you can change namedtuple's syntax from the somewhat clunky
>
> Point = namedtuple('Point', 'x y')
>
> to the more elegant but confusing since it uses "class"
>
> class Point(NamedTuple):
>     x
>     y
>

You don't even need to use __prepare__() if you make it look like this:

@as_namedtuple
class Point:
    x = ...
    y = ...

Since Ellipsis became valid as a normal-use object this works. And
that decorator is a snap to write.

-eric

> I sort of like this syntax, but I think it would be crazy to use in production, because who besides the original creator of the hack would guess that class NamedTuple has a metaclass which uses the __prepare__ method to record the fields accessed during class creation? But with a namespace keyword you signal to the reader of the code: "Here be meta-dragons! Proceed with caution." The counterargument can be made, do we really want to encourage people to fiddle around with metaclasses any more than they already do? But my response to that is that people are already making ORMs and whatnot using the class keyword, so why not pave the cowpath? Also we could throw in some built-in tools to make the metaclasses less confusing to work with.
>
>
>
>
>
> Also, hey free switch statement ;-) :
>
> key = random.choice(["case1", "case2"])
> namespace dispatch(dict):
>     def case1(): print("One")
>     def case2(): print("Two")
>
> dispatch[key]()
>
> -- Carl
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From aquavitae69 at gmail.com  Sat Aug  6 10:10:06 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Sat, 6 Aug 2011 10:10:06 +0200
Subject: [Python-ideas] Access to function objects
Message-ID:

Has anyone else ever thought that it might be useful to access a function
object from within the call? I've come across this situation a few times
recently, and thought it would be very useful to be able to do something
like, for example:

def a_function() as func:
    print(func.__doc__)

It would be useful, for example, to record state between calls, or if the
function wants to reuse its own properties (like in the above example).
Consider a function which should print the number of times it has been
called on every call

def counter(add) as func:
    if not hasattr(func, 'count'):
        func.count = 0
    func.count += 1
    print(func.count)

This could also be implemented using classes, i.e.

class Counter:
    def __call__(self, add):
        if not hasattr(self, 'count'):
            self.count = 0
        self.count += 1
        print(self.count)

counter = Counter()

But this is much more clumsy, results in an extra object (the Counter
class) and will be quite complicated if counter is a method rather than a
function.

The reason I've used "as" syntax is that it is consistent with other python
statements (e.g. "with" and "except"), wouldn't require a new keyword and
is backward compatible.

Any thoughts?

David
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From pyideas at rebertia.com  Sat Aug  6 10:16:05 2011
From: pyideas at rebertia.com (Chris Rebert)
Date: Sat, 6 Aug 2011 01:16:05 -0700
Subject: [Python-ideas] Access to function objects
In-Reply-To:
References:
Message-ID:

On Sat, Aug 6, 2011 at 1:10 AM, David Townshend wrote:
> Has anyone else ever thought that it might be useful to access a function
> object from within the call?

Yes:
http://mail.python.org/pipermail/python-list/2009-May/1203977.html

Cheers,
Chris

From pyideas at rebertia.com  Sat Aug  6 10:19:58 2011
From: pyideas at rebertia.com (Chris Rebert)
Date: Sat, 6 Aug 2011 01:19:58 -0700
Subject: [Python-ideas] Access to function objects
In-Reply-To:
References:
Message-ID:

On Sat, Aug 6, 2011 at 1:16 AM, Chris Rebert wrote:
> On Sat, Aug 6, 2011 at 1:10 AM, David Townshend wrote:
>> Has anyone else ever thought that it might be useful to access a function
>> object from within the call?
>
> Yes:

And also:
[Rejected] PEP 3130: Access to Current Module/Class/Function
http://www.python.org/dev/peps/pep-3130/

Cheers,
Chris

From solipsis at pitrou.net  Sat Aug  6 14:13:41 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 6 Aug 2011 14:13:41 +0200
Subject: [Python-ideas] Access to function objects
References:
Message-ID: <20110806141341.711b184e@pitrou.net>

On Sat, 6 Aug 2011 01:19:58 -0700
Chris Rebert wrote:
> On Sat, Aug 6, 2011 at 1:16 AM, Chris Rebert wrote:
> > On Sat, Aug 6, 2011 at 1:10 AM, David Townshend wrote:
> >> Has anyone else ever thought that it might be useful to access a function
> >> object from within the call?
> >
> > Yes:
>
> And also:
> [Rejected] PEP 3130: Access to Current Module/Class/Function
> http://www.python.org/dev/peps/pep-3130/

The new magic super() uses a similar, hidden, compiler-activated hack
to work properly:

>>> class C:
...     def f(): super
...
>>> C.f.__closure__
(<cell at 0x...: type object at 0x...>,)
>>> C.f.__closure__[0].cell_contents
<class '__main__.C'>
>>> sup = super
>>> class D:
...     def f(): sup
...
>>> D.f.__closure__
>>>

Regards

Antoine.

From rob.cliffe at btinternet.com  Sat Aug  6 14:41:55 2011
From: rob.cliffe at btinternet.com (Rob Cliffe)
Date: Sat, 06 Aug 2011 13:41:55 +0100
Subject: [Python-ideas] Access to function objects
In-Reply-To:
References:
Message-ID: <4E3D3693.6010903@btinternet.com>

On 06/08/2011 09:10, David Townshend wrote:
> Has anyone else ever thought that it might be useful to access a
> function object from within the call?
Also yes.
Rob Cliffe

From g.nius.ck at gmail.com  Sat Aug  6 20:26:15 2011
From: g.nius.ck at gmail.com (Christopher King)
Date: Sat, 6 Aug 2011 14:26:15 -0400
Subject: [Python-ideas] Access to function objects
In-Reply-To:
References:
Message-ID:

Whoops, sent it to the tutors

On Sat, Aug 6, 2011 at 2:24 PM, Christopher King wrote:
>
>
> On Sat, Aug 6, 2011 at 4:10 AM, David Townshend wrote:
>
>> def counter(add) as func:
>>     if not hasattr(func, 'count'):
>>         func.count = 0
>>     func.count += 1
>>     print(func.count)
>>
> You already can do that without an as statement.
>
> >>> def counter(add):
>         if not hasattr(counter, 'count'):
>             counter.count = 0
>         counter.count += 1
>         return counter.count
> >>> counter('You ever notice how this parameter is never used anyway?')
> Output: 1
> >>> counter('Oh well')
> Output: 2
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tjreedy at udel.edu  Sat Aug  6 21:50:52 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 06 Aug 2011 15:50:52 -0400
Subject: [Python-ideas] anonymous object support
In-Reply-To:
References: <64FD0873-5ACB-4BD8-AB90-43EC9DE11215@gmail.com>
Message-ID:

On 8/6/2011 2:53 AM, Eric Snow wrote:
> On Sat, Aug 6, 2011 at 12:37 AM, Carl Matthew Johnson
> wrote:
>> As another example, using metaclasses you can change namedtuple's syntax from the somewhat clunky
>>
>> Point = namedtuple('Point', 'x y')
>>
>> to the more elegant but confusing since it uses "class"
>>
>> class Point(NamedTuple):
>>     x
>>     y

If NamedTuple were documented as having a custom metaclass, and if this
were a common idiom, that would not be too confusing at all.

> You don't even need to use __prepare__() if you make it look like this:
>
> @as_namedtuple
> class Point:
>     x = ...
>     y = ...
>
> Since Ellipsis became valid as a normal-use object this works. And
> that decorator is a snap to write.

This looks ok to me too. New keywords and syntax should be *very* rare and
reserved for things that cannot be done so easily with what we have now.

--
Terry Jan Reedy

From tjreedy at udel.edu  Sat Aug  6 22:28:33 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 06 Aug 2011 16:28:33 -0400
Subject: [Python-ideas] Access to function objects
In-Reply-To:
References:
Message-ID:

On 8/6/2011 4:19 AM, Chris Rebert wrote:
> On Sat, Aug 6, 2011 at 1:16 AM, Chris Rebert wrote:
>> On Sat, Aug 6, 2011 at 1:10 AM, David Townshend wrote:
>>> Has anyone else ever thought that it might be useful to access a function
> [Rejected] PEP 3130: Access to Current Module/Class/Function
> http://www.python.org/dev/peps/pep-3130/

The first problem with this is that it is three proposals in one. Each
should have been considered separately on its own merits. The second
problem stems from the first: there is a blanket rejection of all three
with no reference to the relative merits of the three different proposals.
The __module__ proposal is particularly weak as the only use case given is
replacing
if __name__ == '__main__': ...
with
if __module__ is sys.main: ...
I would reject the collective proposal just to reject this.

The __class__ proposal seems to have been dealt with partly by revisions to
super. I have not read it enough to know if anything more is left, but
there is not much, if any, demand for more that I have seen.

The proposal for access to a function from within the function has two
important use cases, mentioned in the PEP. First is to make truly recursive
functions. Second is to dependably access function attributes from within
the function. (Without that, they are hardly used even a decade after their
introduction.) In both cases, the idea is to make the function operate as
desired independently of external namespace manipulations (even from
outside the module) that the function and its author literally have no
control over. Except purely for speed, function attributes could then
replace pseudo-parameters with default args.

This proposal/desire comes up constantly on python-list, generally with
support. I believe it was part of two recent threads. I would like to know
if the rejection of the idea so far is a rejection in principle (and if so,
why) or a rejection of specifics.

--
Terry Jan Reedy

From dag.odenhall at gmail.com  Sat Aug  6 23:15:06 2011
From: dag.odenhall at gmail.com (dag.odenhall at gmail.com)
Date: Sat, 6 Aug 2011 23:15:06 +0200
Subject: [Python-ideas] anonymous object support
In-Reply-To:
References: <64FD0873-5ACB-4BD8-AB90-43EC9DE11215@gmail.com>
Message-ID:

> If NamedTuple were documented as having a custom metaclass, and if this were
> a common idiom, that would not be too confusing at all.

Or make the presence of a metaclass explicit, i.e. have NamedTuple be the
metaclass. Working example in case anyone was curious how this is done:
http://paste.pocoo.org/show/453814/

I think this might be acceptable in such edge cases as named tuples and
enums, if the metaclass isn't hidden away in a base class. Might be
difficult to explain to a beginner, but on the other hand they might find
it easier to *read*.

From jeanpierreda at gmail.com  Sun Aug  7 00:53:21 2011
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Sat, 6 Aug 2011 18:53:21 -0400
Subject: [Python-ideas] anonymous object support
In-Reply-To:
References: <64FD0873-5ACB-4BD8-AB90-43EC9DE11215@gmail.com>
Message-ID:

My favorite declarative-namedtuple hack is
http://code.activestate.com/recipes/500261-named-tuples/#c16

Devin

On Sat, Aug 6, 2011 at 3:50 PM, Terry Reedy wrote:
> On 8/6/2011 2:53 AM, Eric Snow wrote:
>
>> On Sat, Aug 6, 2011 at 12:37 AM, Carl Matthew Johnson
>> >
>> wrote:
>>
>>> As another example, using metaclasses you can change namedtuple's syntax
>>> from the somewhat clunky
>>>
>>> Point = namedtuple('Point', 'x y')
>>>
>>> to the more elegant but confusing since it uses "class"
>>>
>>> class Point(NamedTuple):
>>>     x
>>>     y
>>>
>>
> If NamedTuple were documented as having a custom metaclass, and if this
> were a common idiom, that would not be too confusing at all.
>
>
> You don't even need to use __prepare__() if you make it look like this:
>>
>> @as_namedtuple
>> class Point:
>>     x = ...
>>     y = ...
>>
>> Since Ellipsis became valid as a normal-use object this works. And
>> that decorator is a snap to write.
>>
>
> This looks ok to me too. New keywords and syntax should be *very* rare and
> reserved for things that cannot be done so easily with what we have now.
>
> --
> Terry Jan Reedy
>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dag.odenhall at gmail.com  Sun Aug  7 01:21:59 2011
From: dag.odenhall at gmail.com (dag.odenhall at gmail.com)
Date: Sun, 7 Aug 2011 01:21:59 +0200
Subject: [Python-ideas] Change repr(Ellipsis) to '...'
Message-ID:

Now that it is valid in any expression, I'd argue the repr should
reflect the literal syntax. There are however some reasons this might
not be desirable: ellipsis is used to represent recursive objects, and
by reprlib when summarizing long reprs. Thus there would be ambiguity.
A counter-argument may be that a repr isn't intended to be completely
unambiguous, reversible or parseable - indeed many objects mimic the
literal syntax of builtin types even though they add special behavior.
I was going to give os.environ as an example here, and then learned
this is no longer the case in Python 3, so maybe it is after all seen
as undesirable. :)

Anyway: discuss!

From dag.odenhall at gmail.com  Sun Aug  7 01:34:07 2011
From: dag.odenhall at gmail.com (dag.odenhall at gmail.com)
Date: Sun, 7 Aug 2011 01:34:07 +0200
Subject: [Python-ideas] change NoneType, NotImplementedType, & ellipses to
 return the appropriate singleton
In-Reply-To:
References: <4E32E1CD.4020906@stoneleaf.us>
Message-ID:

On 29 July 2011 19:03, Guido van Rossum wrote:
> I think it is fine if type(None)() returns None instead of raising an exception.

+1, I've often wanted it and felt (lambda: None) was somewhat clunky.

From cmjohnson.mailinglist at gmail.com  Sun Aug  7 01:41:27 2011
From: cmjohnson.mailinglist at gmail.com (Carl Matthew Johnson)
Date: Sat, 6 Aug 2011 13:41:27 -1000
Subject: [Python-ideas] anonymous object support
In-Reply-To:
References: <64FD0873-5ACB-4BD8-AB90-43EC9DE11215@gmail.com>
Message-ID: <7B27CFDD-5C28-45E6-B124-D0C77EBA85C4@gmail.com>

On Aug 6, 2011, at 12:53 PM, Devin Jeanpierre wrote:

> My favorite declarative-namedtuple hack is http://code.activestate.com/recipes/500261-named-tuples/#c16
>
> Devin

For non-link followers:

def _namedtuple(func):
    return namedtuple(func.__name__, func.__code__.co_varnames)

@_namedtuple
def Point(x,y):
    pass

That is very clever, but it kind of illustrates my point about needing a
new keyword. When you see "def" don't you naturally think, "OK, what comes
out of this will be a function named Point." But what comes out of this is
not a function. It's a namedtuple, which is quite different.

A similar case can be made about

@sort_list_with_keyfunc(my_list)
def result(item):
    ...
    return normalized_item

It's a neat way of getting out of writing the keyfunc before the sort, but
it's a bad practice because you're def-ing a sorted list, not a function.

Also a Ruby-like each can be done through abuse of decorators

@each(my_list)
def squared_list(item):
    return item ** 2

Neat but it breaks the reader's expectations. (Also, a list comprehension
is shorter.)

From dag.odenhall at gmail.com  Sun Aug  7 01:50:09 2011
From: dag.odenhall at gmail.com (dag.odenhall at gmail.com)
Date: Sun, 7 Aug 2011 01:50:09 +0200
Subject: [Python-ideas] Add an identity function
Message-ID:

Yes, I know, it's merely a (lambda x: x), but I find a need for this
often enough that a dedicated, documented function would help
readability and encourage certain patterns. Whether it should go in
builtins or functools is debatable.
A name that is neither in conflict nor easily confused with the id()
builtin is another problem (the Haskell identity function is called 'id').

So what is the use case? A common example is the pattern of "key
functions" as used with sorting: the default is typically the
"identity function". Another example is gettext catalogs, which
effectively are defaultdicts of the identity function.

http://en.wikipedia.org/wiki/Identity_function

From jeanpierreda at gmail.com  Sun Aug  7 02:25:43 2011
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Sat, 6 Aug 2011 20:25:43 -0400
Subject: [Python-ideas] Make compile('1\n2\n', '', 'single') raise an
 exception instead of silently truncating?
In-Reply-To:
References:
Message-ID:

Hi, I really would like some feedback on this. Or should I send it to
the bug tracker? Was this the wrong place?

Devin

On Wed, Jul 20, 2011 at 10:47 AM, Devin Jeanpierre wrote:
>
> compile('1\n2\n', '','single') == compile('1\n', '','single').
>
> That is, it ignores the second statement ('2\n'),
> without offering a way for the caller to detect this.
>
> Considering that 'single' is primarily used to emulate the behaviour
> of the Python interpreter, most of the time, giving it multiple
> statements is an impossibility, and so that case doesn't matter and
> could raise an exception without affecting existing code. For example,
> the code module meets this description, as do debuggers and such.
>
> However, in cases where it _is_ possible to give the compiler multiple
> statements, the user should be warned that his input isn't valid,
> somehow. For example, the following doctest will mysteriously fail,
> because it was written incorrectly (doctest uses 'single'):
>
>     >>> import sys
>     ... sys.stdout.write('foo\n')
>     foo
>
> This is because the second statement in the doctest was silently
> discarded by compile(). It might not always be clear to users how to
> fix this, and I think this kind of non-obvious error would exist in
> any use of 'single' that can in theory involve multiple statements,
> through user error or program bug. So I'd appreciate it if compile()
> raised an exception in this case. Perhaps SyntaxError or ValueError.
>
> Devin

From jxo6948 at rit.edu  Sun Aug  7 02:50:03 2011
From: jxo6948 at rit.edu (John O'Connor)
Date: Sat, 6 Aug 2011 17:50:03 -0700
Subject: [Python-ideas] Add an identity function
In-Reply-To:
References:
Message-ID:

Personally I find lambda x: x to be very readable. It seems quite close to
f(x) = x

- John

On Sat, Aug 6, 2011 at 4:50 PM, dag.odenhall at gmail.com <
dag.odenhall at gmail.com> wrote:

> Yes, I know, it's merely a (lambda x: x), but I find a need for this
> often enough that a dedicated, documented function would help
> readability and encourage certain patterns. Whether it should go in
> builtins or functools is debatable. A name that is neither in conflict
> nor easily confused with the id() builtin is another problem (the
> Haskell identity function is called 'id').
>
> So what is the use case? A common example is the pattern of "key
> functions" as used with sorting: the default is typically the
> "identity function". Another example is gettext catalogs, which
> effectively are defaultdicts of the identity function.
>
> http://en.wikipedia.org/wiki/Identity_function
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
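For concreteness, a minimal sketch of the helper being discussed, plus the
gettext-style fallback (the IdentityDict name is illustrative, not an
existing API):

def identity(x):
    """Return the argument unchanged."""
    return x

# key functions: passing identity is equivalent to the default behaviour
data = [3, 1, 2]
assert sorted(data, key=identity) == sorted(data)

# the gettext analogy: missing keys "translate" to themselves; defaultdict
# factories take no arguments, so dict.__missing__ is the natural hook
class IdentityDict(dict):
    def __missing__(self, key):
        return key

catalog = IdentityDict({'yes': 'oui'})
assert catalog['yes'] == 'oui'
assert catalog['maybe'] == 'maybe'  # untranslated strings pass through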
From steve at pearwood.info  Sun Aug  7 03:16:42 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 07 Aug 2011 11:16:42 +1000
Subject: [Python-ideas] Make compile('1\n2\n', '', 'single') raise an
 exception instead of silently truncating?
In-Reply-To:
References:
Message-ID: <4E3DE77A.5030003@pearwood.info>

Devin Jeanpierre wrote:
> Hi, I really would like some feedback on this. Or should I send it to
> the bug tracker? Was this the wrong place?

I agree, the current behaviour seems wrong to me: errors should never pass
silently. If I were you, I would add it to the bug tracker.

--
Steven

From steve at pearwood.info  Sun Aug  7 03:28:36 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 07 Aug 2011 11:28:36 +1000
Subject: [Python-ideas] Add an identity function
In-Reply-To:
References:
Message-ID: <4E3DEA44.80903@pearwood.info>

dag.odenhall at gmail.com wrote:
> Yes, I know, it's merely a (lambda x: x), but I find a need for this
> often enough that a dedicated, documented function would help
> readability and encourage certain patterns. Whether it should go in
> builtins or functools is debatable. A name that is neither in conflict
> nor easily confused with the id() builtin is another problem (the
> Haskell identity function is called 'id').
>
> So what is the use case? A common example is the pattern of "key
> functions" as used with sorting: the default is typically the
> "identity function". Another example is gettext catalogs, which
> effectively are defaultdicts of the identity function.

I frequently find myself wanting an identity function. Here's a (grossly
simplified) example from one of my library functions:

def len_sum(iterable, transformation=lambda x:x):
    """Sum iterable, returning length and sum in one pass."""
    count = 0
    total = 0
    for x in iterable:
        count += 1
        total += transformation(x)
    return count, total

Except that the overhead of calling the identity function is significant.
So I end up repeating myself:

def len_sum(iterable, transformation=None):
    """Sum iterable, returning length and sum in one pass."""
    count = 0
    total = 0
    if transformation is None:
        for x in iterable:
            count += 1
            total += x
    else:
        ...  # you get the picture
    return count, total

So, while I want an identity function, I don't want an identity function
which requires actually calling a function at runtime. What I really want
is compiler black magic, so that I can write:

def len_sum(iterable, transformation=None):
    """Sum iterable, returning length and sum in one pass."""
    count = 0
    total = 0
    for x in iterable:
        count += 1
        total += transformation(x)
    return count, total

and the compiler is smart enough to do the Right Thing for me, without
either the need to repeat code, or the function call overhead. (And also a
pony.) Without that level of black magic, I don't think adding an identity
function to the standard library is worth the time or effort.

--
Steven

From jeanpierreda at gmail.com  Sun Aug  7 03:44:01 2011
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Sat, 6 Aug 2011 21:44:01 -0400
Subject: [Python-ideas] Make compile('1\n2\n', '', 'single') raise an
 exception instead of silently truncating?
In-Reply-To: <4E3DE77A.5030003@pearwood.info>
References: <4E3DE77A.5030003@pearwood.info>
Message-ID:

Done. Thanks!

Devin

On Sat, Aug 6, 2011 at 9:16 PM, Steven D'Aprano wrote:
> Devin Jeanpierre wrote:
>>
>> Hi, I really would like some feedback on this. Or should I send it to
>> the bug tracker? Was this the wrong place?
> > I agree, the current behaviour seems wrong to me: errors should never pass > silently. If I were you, I would add it to the bug tracker. > > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From jeanpierreda at gmail.com Sun Aug 7 03:53:32 2011 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Sat, 6 Aug 2011 21:53:32 -0400 Subject: [Python-ideas] Make compile('1\n2\n', '', 'single') raise an exception instead of silently truncating? In-Reply-To: References: <4E3DE77A.5030003@pearwood.info> Message-ID: It's been suggested that I link the ticket to Python-Ideas. Sorry for not doing so earlier: http://bugs.python.org/issue12705 Devin On Sat, Aug 6, 2011 at 9:44 PM, Devin Jeanpierre wrote: > Done. Thanks! > > Devin > > On Sat, Aug 6, 2011 at 9:16 PM, Steven D'Aprano wrote: >> Devin Jeanpierre wrote: >>> >>> Hi, I really would like some feedback on this. Or should I send it to >>> the bug tracker? Was this the wrong place? >> >> I agree, the current behaviour seems wrong to me: errors should never pass >> silently. If I were you, I would add it to the bug tracker. >> >> >> >> -- >> Steven >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> > From dag.odenhall at gmail.com Sun Aug 7 04:07:26 2011 From: dag.odenhall at gmail.com (dag.odenhall at gmail.com) Date: Sun, 7 Aug 2011 04:07:26 +0200 Subject: [Python-ideas] Add an identity function In-Reply-To: <4E3DEA44.80903@pearwood.info> References: <4E3DEA44.80903@pearwood.info> Message-ID: > So, while I want an identity function, I don't want an identity function > which requires actually calling a function at runtime. What I really want is > compiler black magic, so that I can write: > > def len_sum(iterable, transformation=None): > ? ?"""Sum iterable, returning length and sum in one pass.""" > ? ?count = 0 > ? ?total = 0 > ? ?for x in iterable: > ? ? ? ?count += 1 > ? ? ? ?total += transformation(x) > ? ?return count, total > > > and the compiler is smart enough to do the Right Thing for me, without > either the need to repeat code, or the function call overhead. (And also a > pony.) Without that level of black magic, I don't think adding an identity > function to the standard library is worth the time or effort. -1 on silently pretending that None is callable as the identity function. If you have an actual function, an optimizer could probably strip it away in cases like len_sum. From jeanpierreda at gmail.com Sun Aug 7 04:18:49 2011 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Sat, 6 Aug 2011 22:18:49 -0400 Subject: [Python-ideas] Add an identity function In-Reply-To: References: <4E3DEA44.80903@pearwood.info> Message-ID: Nonetheless, Steven does have a point: the identity function is very trivial to define in your own code (I do it lots), so there isn't much benefit adding it. The reason I would personally want it (and perhaps a few other helpful "base case" functions, like constant(x)(y) = x) is to encourage its use among people that haven't considered the concept. That isn't all that good a reason, though. +0? Devin On Sat, Aug 6, 2011 at 10:07 PM, dag.odenhall at gmail.com wrote: >> So, while I want an identity function, I don't want an identity function >> which requires actually calling a function at runtime. 
What I really want is >> compiler black magic, so that I can write: >> >> def len_sum(iterable, transformation=None): >> ? ?"""Sum iterable, returning length and sum in one pass.""" >> ? ?count = 0 >> ? ?total = 0 >> ? ?for x in iterable: >> ? ? ? ?count += 1 >> ? ? ? ?total += transformation(x) >> ? ?return count, total >> >> >> and the compiler is smart enough to do the Right Thing for me, without >> either the need to repeat code, or the function call overhead. (And also a >> pony.) Without that level of black magic, I don't think adding an identity >> function to the standard library is worth the time or effort. > > -1 on silently pretending that None is callable as the identity > function. If you have an actual function, an optimizer could probably > strip it away in cases like len_sum. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From ericsnowcurrently at gmail.com Sun Aug 7 06:32:06 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 6 Aug 2011 22:32:06 -0600 Subject: [Python-ideas] Access to function objects In-Reply-To: References: Message-ID: On Sat, Aug 6, 2011 at 2:28 PM, Terry Reedy wrote: > On 8/6/2011 4:19 AM, Chris Rebert wrote: >> >> On Sat, Aug 6, 2011 at 1:16 AM, Chris Rebert ?wrote: >>> >>> On Sat, Aug 6, 2011 at 1:10 AM, David Townshend >>> ?wrote: >>>> >>>> Has anyone else ever thought that it might be useful to access a >>>> function > >> [Rejected] PEP 3130: Access to Current Module/Class/Function >> http://www.python.org/dev/peps/pep-3130/ > > The first problem with this is that it is three proposals in one. Each > should have been considered separately on its own merits. The second problem > stems from the first: there is a blanket rejection of all three with no > reference to the relative merits of the three different proposals. > > The __module__ proposal is particularly weak as the only use case given is > replacing > if __name__ == '__main__': ... > with > if __module__ is sys.main: ... > I would reject the collective proposal just to reject this. > > The __class__ proposal seems to have been dealt with partly by revisions to > super. I have not read it enough to know if anything more is left, but there > is not much, if any, demand for more that I have seen. > > The proposal for access to a function from within the function has two > important use cases, mentioned in the PEP. First is to make truly recursive > functions. Second is to dependably access function attributes from within > the function. (Without that, there are hardly used even a decade after their > introduction.) In both cases, the idea is to make the function operate as > desired independently of external namespace manipulations (even from outside > the module) that the function and its author literally have no control over. > Except purely for speed, function attributes could then replace > pseudo-parameters with default args. > > This proposal/desire comes up constantly on python-list, generally with > support. I believe it was part of two recent threads. I would like to know > if the rejection of the idea so far is a rejection in principle (and if so, > why) or a rejection of specifics. +1 Maybe a more straightforward effort would be appropriate if the other ideas sank the function part. Here's my 2c on how to make it work. Of the three code blocks, functions are the only ones for whom the resulting object and the execution of the code block are separate. 
So a code object could be executing for the original function or a
different one that is sharing the code object.

Why not bind the called function-object to the frame locals, rather
than the one for which the code object was created, perhaps as
"__function__"? To finish things off, bind to every new code object
the function for which it was created, perhaps as "co_func". That way
you will always know what function object was called and which one the
code object came from originally.

No new syntax. One new attribute on code objects. One new implicit
name in locals(). Code to add the function object to the code object
at definition time. Code to add the [other] function object at
execution time.

-eric

>
> --
> Terry Jan Reedy
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From mikegraham at gmail.com  Sun Aug  7 06:33:43 2011
From: mikegraham at gmail.com (Mike Graham)
Date: Sun, 7 Aug 2011 00:33:43 -0400
Subject: [Python-ideas] Change repr(Ellipsis) to '...'
In-Reply-To:
References:
Message-ID:

On Sat, Aug 6, 2011 at 7:21 PM, dag.odenhall at gmail.com
wrote:
> Now that it is valid in any expression, I'd argue the repr should
> reflect the literal syntax. There are however some reasons this might
> not be desirable: ellipsis is used to represent recursive objects, and
> by reprlib when summarizing long reprs. Thus there would be ambiguity.
> A counter-argument may be that a repr isn't intended to be completely
> unambiguous, reversible or parseable - indeed many objects mimic the
> literal syntax of builtin types even though they add special behavior.
> I was going to give os.environ as an example here, and then learned
> this is no longer the case in Python 3, so maybe it is after all seen
> as undesirable. :)
>
> Anyway: discuss!

I think the current state is a lot more helpful in debugging, and that
making repr(...) be "..." would result in occasional confusion but no
positive effects.

Mike

From steve at pearwood.info  Sun Aug  7 07:36:05 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 07 Aug 2011 15:36:05 +1000
Subject: [Python-ideas] Access to function objects
In-Reply-To:
References:
Message-ID: <4E3E2445.2070403@pearwood.info>

Eric Snow wrote:

> Of the three code blocks, functions are the only ones for whom the

[grammar police]
"Whom" is for people. You want "which".
[/grammar police]

(I don't normally correct people's grammar, but to me, this error broke
the flow of the sentence like being hit in the face with a sock with half
a brick in it.)

> resulting object and the execution of the code block are separate. So
> a code object could be executing for the original function or a
> different one that is sharing the code object.

Fairly unusual, but I grant it could happen. I've done it myself :)

> Why not bind the called function-object to the frame locals, rather
> than the one for which the code object was created, perhaps as
> "__function__"?

I'm afraid I can't interpret this (which may be my ignorance rather than
your fault). The only guess I can make is based on what you say later:

"One new implicit name in locals()."

so I presume you mean that the function should see a local variable
(perhaps called "me", or "this"?) that is bound to itself.

Presumably if a function wants to use that same name as a local, nothing
bad will happen, since the local assignment will just override the
implicit assignment.
But what about code that expects to see a nonlocal or global with the same name? What happens when two functions, sharing the same code object, get called from two threads at the same time? Are their locals independent? For most uses, standard recursion via the name is good enough, it's only a few corner cases where self-reflection (as I call it) is needed. And I say that as somebody who does want a way for functions to know themselves. I don't think that use-case is so important that it should be implicitly added to every function, on the off-chance it is needed, rather than explicitly on demand. > To finish things off, bind to every new code object > the function for which it was created, perhaps as "co_func". That way > you will always know what function object was called and which one the > code object came from originally. What benefit will this give? Have you ever looked at a code object and said, "I need a way of knowing which function this is from?" If so, I'd love to know what problem you were trying to solve at the time! Code objects don't always get created as part of a function. They can be returned by compile. What should co_func be set to then? Finally, if the function has a reference to the code object, and the code object has a reference to the function, you have a reference cycle. That's not the end of the world now as it used to be, in early Python before the garbage collector was added, but still, there better be a really good use-case to justify it. (Perhaps a weak reference might be more appropriate?) -- Steven From tjreedy at udel.edu Sun Aug 7 08:17:22 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 07 Aug 2011 02:17:22 -0400 Subject: [Python-ideas] Access to function objects In-Reply-To: References: Message-ID: On 8/7/2011 12:32 AM, Eric Snow wrote: > On Sat, Aug 6, 2011 at 2:28 PM, Terry Reedy wrote: >> This proposal/desire comes up constantly on python-list, generally with >> support. I believe it was part of two recent threads. I would like to know >> if the rejection of the idea so far is a rejection in principle (and if so, >> why) or a rejection of specifics. > > +1 > Of the three code blocks, functions are the only ones for whom the > resulting object and the execution of the code block are separate. So > a code object could be executing for the original function or a > different one that is sharing the code object. Now I remember that the separation between code object and function object and the possibility of reusing code objects has been given as a reason to reject the idea. On the other hand, reusing code objects is so rare that I question the need to cater to it much. > Why not bind the called function-object to the frame locals, rather > than the one for which the code object was created, perhaps as > "__function__"? I do not understand this. > To finish things off, bind to every new code object > the function for which it was created, perhaps as "co_func". That way > you will always know what function object was called and which one the > code object came from originally. I don't see that solving the problem of how the function code accesses the object. Adding something to the default arg tuple might make more sense. 
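For comparison, the pseudo-parameter idiom that default args already allow
looks something like this (a sketch, not part of any proposal):

def counter(_state={'count': 0}):
    # _state is a pseudo-parameter: the mutable default is created once,
    # at def time, and shared across calls, so it keeps working even if
    # the name 'counter' is later rebound
    _state['count'] += 1
    return _state['count']

counter()  # -> 1
counter()  # -> 2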
-- Terry Jan Reedy From ericsnowcurrently at gmail.com Sun Aug 7 09:46:08 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sun, 7 Aug 2011 01:46:08 -0600 Subject: [Python-ideas] Access to function objects In-Reply-To: <4E3E2445.2070403@pearwood.info> References: <4E3E2445.2070403@pearwood.info> Message-ID: On Sat, Aug 6, 2011 at 11:36 PM, Steven D'Aprano wrote: > Eric Snow wrote: > >> Of the three code blocks, functions are the only ones for whom the > > [grammar police] > "Whom" is for people. You want "which". > [/grammar police] > > (I don't normally correct people's grammar, but to me, this error broke the > flow of the sentence like being hit in the face with a sock with half a > brick in it.) I aim to please! > > >> resulting object and the execution of the code block are separate. ?So >> a code object could be executing for the original function or a >> different one that is sharing the code object. > > Fairly unusual, but I grant it could happen. I've done it myself :) > > >> Why not bind the called function-object to the frame locals, rather >> than the one for which the code object was created, perhaps as >> "__function__"? > > I'm afraid I can't interpret this (which may be my ignorance rather than > your fault). No, the fault is likely mine. But it seems so clear to me. :) > The only guess I can make is based on what you say later: > > "One new implicit name in locals(). > > so I presume you mean that the function should see a local variable (perhaps > called "me", or "this"?) that is bound to itself. One called __function__ (or the like). A "dunder" name is used to indicate its special nature and limit conflict with existing code. The function that was called would be bound to that name at function execution time. Keep in mind that I am talking about the frame locals, not anything stored on the code object nor on the function object. Not to overdramatize it, but it would happen at the beginning of every call of every function. I don't know what that overhead would be. > > Presumably if a function wants to use that same name as a local, nothing bad > will happen, since the local assignment will just override the implicit > assignment. But what about code that expects to see a nonlocal or global > with the same name? > > What happens when two functions, sharing the same code object, get called > from two threads at the same time? Are their locals independent? I'm afraid I don't know. I expect that each would get executed in separate execution frames, and so have separate frame locals. > > For most uses, standard recursion via the name is good enough, it's only a > few corner cases where self-reflection (as I call it) is needed. And I say > that as somebody who does want a way for functions to know themselves. I > don't think that use-case is so important that it should be implicitly added > to every function, on the off-chance it is needed, rather than explicitly on > demand. For me the use case involves determining what function called my function. Currently you can tell in which execution frame a function was called, and thereby which code object, but reliably matching that to a function is not so simple. I recognize that my case is likely not a general one. > > >> To finish things off, bind to every new code object >> the function for which it was created, perhaps as "co_func". ?That way >> you will always know what function object was called and which one the >> code object came from originally. > > What benefit will this give? 
Have you ever looked at a code object and said, > "I need a way of knowing which function this is from?" If so, I'd love to > know what problem you were trying to solve at the time! You caught me! :) I don't already have a use case for this part. I had only considered that without this you could not determine where a code object came from, or if a function had borrowed another's code object. This is certainly only useful in the case that one function is using the code object of another, which we have all agreed is not that common. However, with a co_func I felt that all the bases would be covered. > > Code objects don't always get created as part of a function. They can be > returned by compile. What should co_func be set to then? None, since there was no function object created along with the code object. Same with generator expressions. > > Finally, if the function has a reference to the code object, and the code > object has a reference to the function, you have a reference cycle. That's > not the end of the world now as it used to be, in early Python before the > garbage collector was added, but still, there better be a really good > use-case to justify it. > > (Perhaps a weak reference might be more appropriate?) Good point. Mostly I am trying to look for an angle that works without a lot of trouble. Can't fault me for trying in my own incoherent way. :) -eric > > > > -- > Steven > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From g.brandl at gmx.net Sun Aug 7 10:12:04 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 07 Aug 2011 10:12:04 +0200 Subject: [Python-ideas] change NoneType, NotImplementedType, & ellipses to return the appropriate singleton In-Reply-To: References: <4E32E1CD.4020906@stoneleaf.us> Message-ID: Am 07.08.2011 01:34, schrieb dag.odenhall at gmail.com: > On 29 July 2011 19:03, Guido van Rossum wrote: >> I think it is fine if type(None)() returns None instead of raising an exception. > > +1, I've often wanted it and felt (lambda: None) was somewhat clunky. It makes its intent much clearer than type(None) though. Georg From rob.cliffe at btinternet.com Sun Aug 7 10:21:45 2011 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Sun, 07 Aug 2011 09:21:45 +0100 Subject: [Python-ideas] Add an identity function In-Reply-To: References: <4E3DEA44.80903@pearwood.info> Message-ID: <4E3E4B19.3040909@btinternet.com> On 07/08/2011 03:07, dag.odenhall at gmail.com wrote: >> So, while I want an identity function, I don't want an identity function >> which requires actually calling a function at runtime. What I really want is >> compiler black magic, so that I can write: >> >> def len_sum(iterable, transformation=None): >> """Sum iterable, returning length and sum in one pass.""" >> count = 0 >> total = 0 >> for x in iterable: >> count += 1 >> total += transformation(x) >> return count, total >> >> >> and the compiler is smart enough to do the Right Thing for me, without >> either the need to repeat code, or the function call overhead. (And also a >> pony.) Without that level of black magic, I don't think adding an identity >> function to the standard library is worth the time or effort. > -1 on silently pretending that None is callable as the identity > function. If you have an actual function, an optimizer could probably > strip it away in cases like len_sum. 
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
How about (for this use case at any rate)
total += x if transformation is None else transformation(x)
# avoids unnecessary function call, fairly readable IMO
Rob Cliffe

From dag.odenhall at gmail.com  Sun Aug  7 13:27:51 2011
From: dag.odenhall at gmail.com (dag.odenhall at gmail.com)
Date: Sun, 7 Aug 2011 13:27:51 +0200
Subject: [Python-ideas] Conventions for Function Annotations
Message-ID:

This isn't really a proposal for any sort of language/stdlib change,
rather I felt like discussing the potential for informal standard
conventions with annotations.

Probably for the better, there is no special syntax for annotating
raised exceptions or yields from a generator. In line with how the "->"
syntax sets the 'return' key, I suggest an informal standard of
representing keywords this way, for example in a decorator for
raise/yield annotations:

# third-party decorator
@raises(ValueError)
def foo(): pass

assert foo.__annotations__['raise'] == ValueError

This might be an obvious solution, but I just wanted to "put it out
there" up front, before inconsistent workarounds emerge. This
convention should work because it is primarily needed for control
structures that are reserved keywords anyway. The one exception I can
think of is the inverse of yield: generator.send() - "send" could
conflict with a function argument, and should therefore not be put in
__annotations__. (A hack could be to use a different but semantically
related keyword like 'import', or an otherwise invalid identifier like
'send()', but it might be best to simply not use __annotations__ for
this.)
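To make that concrete, a minimal sketch of such a decorator (nothing here
exists in the stdlib; it is just the convention under discussion):

def raises(exception):
    def decorate(func):
        # stash the annotation under the reserved word 'raise', which can
        # never collide with a real parameter name
        func.__annotations__['raise'] = exception
        return func
    return decorate

@raises(ValueError)
def foo(): pass

assert foo.__annotations__['raise'] == ValueError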
From ncoghlan at gmail.com  Sun Aug  7 14:01:41 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 7 Aug 2011 22:01:41 +1000
Subject: [Python-ideas] Change repr(Ellipsis) to '...'
In-Reply-To:
References:
Message-ID:

On Sun, Aug 7, 2011 at 2:33 PM, Mike Graham wrote:
> On Sat, Aug 6, 2011 at 7:21 PM, dag.odenhall at gmail.com
> wrote:
>> Now that it is valid in any expression, I'd argue the repr should
>> reflect the literal syntax. There are however some reasons this might
>> not be desirable: ellipsis is used to represent recursive objects, and
>> by reprlib when summarizing long reprs. Thus there would be ambiguity.
>> A counter-argument may be that a repr isn't intended to be completely
>> unambiguous, reversible or parseable - indeed many objects mimic the
>> literal syntax of builtin types even though they add special behavior.
>> I was going to give os.environ as an example here, and then learned
>> this is no longer the case in Python 3, so maybe it is after all seen
>> as undesirable. :)
>>
>> Anyway: discuss!
>
> I think the current state is a lot more helpful in debugging, and that
> making repr(...) be "..." would result in occasional confusion but no
> positive effects.

Interesting idea, but far too confusing in the interactive interpreter
and in doctests:

- '...' is also the default prompt for continuation lines (e.g. when
defining a function)
- '...' is used to mark 'match anything' sections in doctests

The situation might have been different if the syntax had always been
allowed everywhere, but there's no compelling reason to change it now.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From guido at python.org  Sun Aug  7 14:10:46 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 7 Aug 2011 08:10:46 -0400
Subject: [Python-ideas] Access to function objects
In-Reply-To:
References: <4E3E2445.2070403@pearwood.info>
Message-ID:

On Sun, Aug 7, 2011 at 3:46 AM, Eric Snow wrote:
> On Sat, Aug 6, 2011 at 11:36 PM, Steven D'Aprano wrote:
>> Eric Snow wrote:
>>> Why not bind the called function-object to the frame locals, rather
>>> than the one for which the code object was created, perhaps as
>>> "__function__"?
>>
>> I'm afraid I can't interpret this (which may be my ignorance rather than
>> your fault).
>
> No, the fault is likely mine. But it seems so clear to me. :)
>
>> The only guess I can make is based on what you say later:
>>
>> "One new implicit name in locals()."
>>
>> so I presume you mean that the function should see a local variable (perhaps
>> called "me", or "this"?) that is bound to itself.
>
> One called __function__ (or the like). A "dunder" name is used to
> indicate its special nature and limit conflict with existing code.

Without thinking too much about this I like it.

> The function that was called would be bound to that name at function
> execution time. Keep in mind that I am talking about the frame
> locals, not anything stored on the code object nor on the function
> object. Not to overdramatize it, but it would happen at the beginning
> of every call of every function. I don't know what that overhead
> would be.

It could be made into a "cell", which is the same way all locals are
normally represented. This is very fast. Further the code for it could
be triggered by the appearance of __function__ (if that's the keyword
we choose) in the function body. I don't really care what happens if
people use locals() -- that's inefficient and outmoded anyway. (Note
that!)

>> Presumably if a function wants to use that same name as a local, nothing bad
>> will happen, since the local assignment will just override the implicit
>> assignment.

That's why a __dunder__ name is used.

>> What happens when two functions, sharing the same code object, get called
>> from two threads at the same time? Are their locals independent?
>
> I'm afraid I don't know. I expect that each would get executed in
> separate execution frames, and so have separate frame locals.

The frames are completely independent. They all point to the same code
object and under the proposal they will all point to the same function
object. I see no problems here except self-inflicted, like using
__function__ to hold state that can't be accessed concurrently safely;
note that recursive invocations have the same issue. I see it as a
non-problem.

>> For most uses, standard recursion via the name is good enough, it's only a
>> few corner cases where self-reflection (as I call it) is needed.

Right. If it were expected that people would start writing recursive
calls using __function__ routinely, in situations where a name
reference works, I'd be very unhappy with the new feature. (And if
someone wants to make the argument that recursive calls using
__function__ are actually better in some way I am willing to
filibuster.)

>> And I say
>> that as somebody who does want a way for functions to know themselves. I
>> don't think that use-case is so important that it should be implicitly added
>> to every function, on the off-chance it is needed, rather than explicitly on
>> demand.
>
> For me the use case involves determining what function called my
> function. Currently you can tell in which execution frame a function
> was called, and thereby which code object, but reliably matching that
> to a function is not so simple. I recognize that my case is likely
> not a general one.

But it is a nice one. It solves some issues that pdb currently solves
by just using a file/line reference.

>>> To finish things off, bind to every new code object
>>> the function for which it was created, perhaps as "co_func". That way
>>> you will always know what function object was called and which one the
>>> code object came from originally.
>>
>> What benefit will this give? Have you ever looked at a code object and said,
>> "I need a way of knowing which function this is from?" If so, I'd love to
>> know what problem you were trying to solve at the time!
>
> You caught me! :) I don't already have a use case for this part. I
> had only considered that without this you could not determine where a
> code object came from, or if a function had borrowed another's code
> object. This is certainly only useful in the case that one function
> is using the code object of another, which we have all agreed is not
> that common. However, with a co_func I felt that all the bases would
> be covered.

Ah, but you can't do that! There are many situations where a single
code object is used to create many different function objects. E.g.
every time you have a nested function. Also the code object is
immutable. This part is all carefully considered and should be left
alone.

>> Code objects don't always get created as part of a function. They can be
>> returned by compile. What should co_func be set to then?
>
> None, since there was no function object created along with the code
> object. Same with generator expressions.

Just forget this part.

>> Finally, if the function has a reference to the code object, and the code
>> object has a reference to the function, you have a reference cycle. That's
>> not the end of the world now as it used to be, in early Python before the
>> garbage collector was added, but still, there better be a really good
>> use-case to justify it.
>>
>> (Perhaps a weak reference might be more appropriate?)
>
> Good point.
>
> Mostly I am trying to look for an angle that works without a lot of
> trouble. Can't fault me for trying in my own incoherent way. :)

On the rest of that rejected PEP:

- I'm not actually sure how easy it is to implement the setting of
__function__ when the frame is created. IIRC the frame creation is
rather far removed from the function object, as there are various
cases where there is no function object (class bodies, module-level
code) and in other cases the function is called via a bound method.
Someone should write a working patch to figure out if this is a
problem in practice.

- The primary use case for __function__ to me seems to access function
attributes, but I'm not sure what's wrong with referencing these via
the function name. Maybe it's when there's a method involved, since
then you'd have to write ...

- It seems that the "current class" question has already been solved
for super(). If more is needed I'd be okay with extending the
machinery used by super() so that you can access the magic "current
class" variable explicitly too.

- For "current module" I've encountered a number of use cases, mostly
having to do with wanting to define new names dynamically.
Somehow I have found:

globals()[x] = y  # Note that x is a variable, not a literal

cumbersome; I'd rather write:

setattr(__this_module__, x, y)

There are IIRC also some use cases where an API expects a module object
(or at least something whose attributes it can set and/or get) and
passing the current module is clumsy:

foo(sys.modules[__name__])

On the whole these use cases are all fairly weak though and I would
give it a +0 at best. But rather than a prolonged discussion of the
merits and use cases, I strongly recommend that somebody tries to come
up with a working implementation and we'll strengthen the PEP from
there.

--
--Guido van Rossum (python.org/~guido)

From dag.odenhall at gmail.com  Sun Aug  7 14:12:54 2011
From: dag.odenhall at gmail.com (dag.odenhall at gmail.com)
Date: Sun, 7 Aug 2011 14:12:54 +0200
Subject: [Python-ideas] Change repr(Ellipsis) to '...'
In-Reply-To:
References:
Message-ID:

On 7 August 2011 14:01, Nick Coghlan wrote:
> On Sun, Aug 7, 2011 at 2:33 PM, Mike Graham wrote:
>> On Sat, Aug 6, 2011 at 7:21 PM, dag.odenhall at gmail.com
>> wrote:
>>> Now that it is valid in any expression, I'd argue the repr should
>>> reflect the literal syntax. There are however some reasons this might
>>> not be desirable: ellipsis is used to represent recursive objects, and
>>> by reprlib when summarizing long reprs. Thus there would be ambiguity.
>>> A counter-argument may be that a repr isn't intended to be completely
>>> unambiguous, reversible or parseable - indeed many objects mimic the
>>> literal syntax of builtin types even though they add special behavior.
>>> I was going to give os.environ as an example here, and then learned
>>> this is no longer the case in Python 3, so maybe it is after all seen
>>> as undesirable. :)
>>>
>>> Anyway: discuss!
>>
>> I think the current state is a lot more helpful in debugging, and that
>> making repr(...) be "..." would result in occasional confusion but no
>> positive effects.
>
> Interesting idea, but far too confusing in the interactive interpreter
> and in doctests:
>
> - '...' is also the default prompt for continuation lines (e.g. when
> defining a function)
> - '...' is used to mark 'match anything' sections in doctests
>
> The situation might have been different if the syntax had always been
> allowed everywhere, but there's no compelling reason to change it now.

Strong, valid points; I'd say I'm +/-0 on this proposal myself at this point.

From ncoghlan at gmail.com  Sun Aug  7 14:14:18 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 7 Aug 2011 22:14:18 +1000
Subject: [Python-ideas] Access to function objects
In-Reply-To:
References:
Message-ID:

On Sun, Aug 7, 2011 at 4:17 PM, Terry Reedy wrote:
> On 8/7/2011 12:32 AM, Eric Snow wrote:
>> On Sat, Aug 6, 2011 at 2:28 PM, Terry Reedy wrote:
>>> This proposal/desire comes up constantly on python-list, generally with
>>> support. I believe it was part of two recent threads. I would like to know
>>> if the rejection of the idea so far is a rejection in principle (and if so,
>>> why) or a rejection of specifics.
>>
>> +1
>> Of the three code blocks, functions are the only ones for whom the
>> resulting object and the execution of the code block are separate. So
>> a code object could be executing for the original function or a
>> different one that is sharing the code object.
>
> Now I remember that the separation between code object and function object
> and the possibility of reusing code objects has been given as a reason to
> reject the idea. On the other hand, reusing code objects is so rare that I
> question the need to cater to it much.
>> Why not bind the called function-object to the frame locals, rather
>> than the one for which the code object was created, perhaps as
>> "__function__"?
>
> I do not understand this.

I'm fairly sure I do understand it, and I don't think it could be made to work in a sensible way (either thread safety problems or else thoroughly confusing interactions with the descriptor machinery).

I don't see any fundamental limitations preventing the use of a magic closure variable for __function__ along the lines of __class__ in PEP 3135 though.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From guido at python.org Sun Aug 7 14:26:42 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 7 Aug 2011 08:26:42 -0400
Subject: [Python-ideas] anonymous object support
In-Reply-To: <7B27CFDD-5C28-45E6-B124-D0C77EBA85C4@gmail.com>
References: <64FD0873-5ACB-4BD8-AB90-43EC9DE11215@gmail.com> <7B27CFDD-5C28-45E6-B124-D0C77EBA85C4@gmail.com>
Message-ID:

2011/8/6 Carl Matthew Johnson :
> On Aug 6, 2011, at 12:53 PM, Devin Jeanpierre wrote:
>> My favorite declarative-namedtuple hack is http://code.activestate.com/recipes/500261-named-tuples/#c16
>>
>> Devin
>
> For non-link followers:
>
> def _namedtuple(func):
>     return namedtuple(func.__name__, func.__code__.co_varnames)
>
> @_namedtuple
> def Point(x,y):
>     pass
>
> That is very clever, but it kind of illustrates my point about needing a new keyword. When you see "def" don't you naturally think, "OK, what comes out of this will be a function named Point." But what comes out of this is not a function. It's a namedtuple, which is quite different?

I'm not sure I find that much of an objection. There are plenty of situations where decorators are used to seriously pervert the type of the defined name. Plus, what's a function? A class can be called as well. What's the difference?

> A similar case can be made about
>
> @sort_list_with_keyfunc(my_list)
> def result(item):
>     ...
>     return normalized_item
>
> It's a neat way of getting out of writing the keyfunc before the sort, but it's a bad practice because you're def-ing a sorted list, not a function.

In this specific case I agree it's just confusing. (The difference is that 'Point' above still can be called with x and y arguments, returning a Point object.)

> Also a Ruby-like each can be done through abuse of decorators
>
> @each(my_list)
> def squared_list(item):
>     return item ** 2
>
> Neat but it breaks the reader's expectations. (Also, a list comprehension is shorter.)

Not to mention faster. The main reason why the argument against these doesn't provide an argument against @_namedtuple is that they don't create a callable thing at all.

--
--Guido van Rossum (python.org/~guido)

From guido at python.org Sun Aug 7 14:27:06 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 7 Aug 2011 08:27:06 -0400
Subject: [Python-ideas] Change repr(Ellipsis) to '...'
In-Reply-To: References: Message-ID:

-1 here. Shorter is not always more beautiful.
On Sun, Aug 7, 2011 at 8:12 AM, dag.odenhall at gmail.com wrote: > On 7 August 2011 14:01, Nick Coghlan wrote: >> On Sun, Aug 7, 2011 at 2:33 PM, Mike Graham wrote: >>> On Sat, Aug 6, 2011 at 7:21 PM, dag.odenhall at gmail.com >>> wrote: >>>> Now that it is valid in any expressions, I'd argue the repr should >>>> reflect the literal syntax. There are however some reasons this might >>>> not be desirable: ellipsis is used to represent recursive objects, and >>>> by reprlib when summarizing long reprs. Thus there would be ambiguity. >>>> A counter-argument may be that a repr isn't intended to be completely >>>> unambiguous, reversible or parseable - in deed many objects mimic the >>>> literal syntax of builtin types even though they add special behavior. >>>> I was going to give os.environ as an example here, and then learned >>>> this is no longer the case in Python 3, so maybe it is after all seen >>>> as undesirable. :) >>>> >>>> Anyway: discuss! >>> >>> I think the current state is a lot more helpful in debugging, and that >>> making repr(...) be "..." would result in occasional confusion but no >>> positive effects. >> >> Interesting idea, but far too confusing in the interactive interpreter >> and in doctests: >> >> - '...' is also the default prompt for continuation lines (e.g. when >> defining a function) >> - '...' is used to mark 'match anything' sections in doctests >> >> The situation might have been different if the syntax had always been >> allowed everywhere, but there's no compelling reason to change it now. > > Strong, valid points; I'd say I'm +/-0 on this proposal myself at this point. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Sun Aug 7 14:46:59 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 7 Aug 2011 22:46:59 +1000 Subject: [Python-ideas] Access to function objects In-Reply-To: References: <4E3E2445.2070403@pearwood.info> Message-ID: On Sun, Aug 7, 2011 at 10:10 PM, Guido van Rossum wrote: > On the rest of that rejected PEP: > > - I'm not actually sure how easy it is to implement the setting of > __function__ when the frame is created. IIRC the frame creation is > rather far removed from the function object, as there are various > cases where there is no function object (class bodies, module-level > code) and in other cases the function is called via a bound method. > Someone should write a working patch to figure out if this is a > problem in practice. With a PEP 3135 closure style solution, the cell reference would be filled in at function definition time, so that part shouldn't be an issue. > - The primary use case for __function__ to me seems to access function > attributes, but I'm not sure what's wrong with referencing these via > the function name. Maybe it's when there's a method involved, since > then you'd have to write ... That does raise an interesting question, though: in a function wrapped via decorators, should __function__ refer to the innermost function or the outermost one? Reference by name lazily accesses the outermost one, but doesn't care how the decorators are applied (i.e. as part of the def statement or via post decoration). A __class__ style cell reference to the result of the 'def' statement would behave differently in the post decoration case. 
While referencing the innermost function would likely be wrong in any case involving function attributes, having the function in a valid state during decoration will likely mandate filling in the cell reference before invoking any decorators. Perhaps the best solution would be to syntactically reference the innermost function, but provide a clean way in functools to shift the cell reference to a different function (with functools.wraps doing that automatically).

This does seem like an area ripe for subtle decoration related bugs though, especially by contrast with lazy name based lookup.

> - It seems that the "current class" question has already been solved
> for super(). If more is needed I'd be okay with extending the
> machinery used by super() so that you can access the magic "current
> class" variable explicitly too.

No need to extend it, that info is already available for explicit reference:

>>> class C:
...     def f(self):
...         print(__class__)
...
>>> C().f()
<class '__main__.C'>

(see postscript for more details)

> - For "current module" I've encountered a number of use cases, mostly
> having to do with wanting to define new names dynamically. Somehow I
> have found:
>
>   globals()[x] = y  # Note that x is a variable, not a literal
>
> cumbersome; I'd rather write:
>
>   setattr(__this_module__, x, y)
>
> There are IIRC also some use cases where an API expects a module
> object (or at least something whose attributes it can set and/or get)
> and passing the current module is clumsy:
>
>   foo(sys.modules[__name__])

While this may sound a little hypocritical coming from the author of PEPs 366 and 395, I'm wary of adding new implicit module globals for problems with relatively simple and robust alternatives. In this case, it's fairly easy to get access to the current module using the idiom Guido quoted:

    import sys
    _this = sys.modules[__name__]

(or using dict-style access on globals())

Cheers,
Nick.

P.S. More details on the magic __class__ closure reference:

>>> dis.show_code(C.f)
Name:              f
Filename:          <stdin>
Argument count:    1
Kw-only arguments: 0
Number of locals:  1
Stack size:        2
Flags:             OPTIMIZED, NEWLOCALS
Constants:
   0: None
Names:
   0: print
Variable names:
   0: self
Free variables:
   0: __class__
>>> C.f.__closure__[0].cell_contents
<class '__main__.C'>

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From jeanpierreda at gmail.com Sun Aug 7 14:59:36 2011
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Sun, 7 Aug 2011 08:59:36 -0400
Subject: [Python-ideas] Conventions for Function Annotations
In-Reply-To: References: Message-ID:

What if you want to say that it raises both ValueError and ArithmeticError?

Annotations never did support multiple annotations, it's a big drawback. This is a decorator though, you can do what you want. How about making __annotations__['raise'] a list?

Otherwise sounds fine to me. The use of keywords in __annotations__ is pretty clever. :)

Devin

On Sun, Aug 7, 2011 at 7:27 AM, dag.odenhall at gmail.com wrote:
> This isn't really a proposal for any sort of language/stdlib change,
> rather I felt like discussing the potential for informal standard
> conventions with annotations.
>
> Probably for the better, there is no special syntax for annotating
> raised exceptions or yields from a generator. In line with how the
> "->" syntax sets the 'return' key, I suggest an informal standard of
> representing keywords this way, for example in a decorator for
> raise/yield annotations:
>
> # third-party decorator
> @raises(ValueError)
> def foo():pass
>
> assert foo.__annotations__['raise'] == ValueError
>
> This might be an obvious solution, but I just wanted to "put it out
> there" up front, before inconsistent workarounds emerge. This
> convention should work because it is primarily needed for control
> structures that are reserved keywords anyway. The one exception I can
> think of is the inverse of yield: generator.send() - "send" could
> conflict with a function argument, and should therefore not be put in
> __annotations__. (A hack could be to use a different but semantically
> related keyword like 'import', or an otherwise invalid identifier like
> 'send()', but it might be best to simply not use __annotations__ for
> this.)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
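A minimal sketch of such a third-party decorator, storing a tuple when several exceptions are given (raises and foo are the names from the thread, not an existing library):

    def raises(*exceptions):
        def deco(func):
            # one exception is stored bare, several as a tuple
            func.__annotations__['raise'] = (
                exceptions[0] if len(exceptions) == 1 else exceptions)
            return func
        return deco

    @raises(ValueError, ArithmeticError)
    def foo():
        pass

    assert foo.__annotations__['raise'] == (ValueError, ArithmeticError)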
From guido at python.org Sun Aug 7 15:07:02 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 7 Aug 2011 09:07:02 -0400
Subject: [Python-ideas] Access to function objects
In-Reply-To: References: <4E3E2445.2070403@pearwood.info>
Message-ID:

On Sun, Aug 7, 2011 at 8:46 AM, Nick Coghlan wrote:
> On Sun, Aug 7, 2011 at 10:10 PM, Guido van Rossum wrote:
>> On the rest of that rejected PEP:
>>
>> - I'm not actually sure how easy it is to implement the setting of
>> __function__ when the frame is created. IIRC the frame creation is
>> rather far removed from the function object, as there are various
>> cases where there is no function object (class bodies, module-level
>> code) and in other cases the function is called via a bound method.
>> Someone should write a working patch to figure out if this is a
>> problem in practice.
>
> With a PEP 3135 closure style solution, the cell reference would be
> filled in at function definition time, so that part shouldn't be an
> issue.

Yes, I was thinking of something like that (though honestly I'd forgotten some of the details :-).

>> - The primary use case for __function__ to me seems to access function
>> attributes, but I'm not sure what's wrong with referencing these via
>> the function name. Maybe it's when there's a method involved, since
>> then you'd have to write ...
>
> That does raise an interesting question, though: in a function wrapped
> via decorators, should __function__ refer to the innermost function or
> the outermost one?

IMO there is no doubt that if __function__ were to exist it should reference the innermost function, i.e. the thing that was created by the 'def' statement before any decorators were applied.

> Reference by name lazily accesses the outermost one, but doesn't care
> how the decorators are applied (i.e. as part of the def statement or
> via post decoration).

What do you mean here by lazily?

> A __class__ style cell reference to the result
> of the 'def' statement would behave differently in the post decoration
> case.

Oh you were thinking of making it reference the result after decoration? Maybe I know too much about the implementation, but I would find that highly confusing. Do you even have a use case for that? If so, I think it should be a separate name, e.g. __decorated_function__.
> While referencing the innermost function would likely be wrong in any
> case involving function attributes, having the function in a valid
> state during decoration will likely mandate filling in the cell
> reference before invoking any decorators. Perhaps the best solution
> would be to syntactically reference the innermost function, but
> provide a clean way in functools to shift the cell reference to a
> different function (with functools.wraps doing that automatically).

Hm, making it dynamic sounds wrong. I think it makes more sense to just share the attribute dict (which is easily done through assignment to the wrapping function's __dict__).

> This does seem like an area ripe for subtle decoration related bugs
> though, especially by contrast with lazy name based lookup.

TBH, personally I am in most cases unhappy with the aggressive copying of docstring and other metadata from the wrapped function to the wrapper function, and wish the idiom had never been invented.

>> - It seems that the "current class" question has already been solved
>> for super(). If more is needed I'd be okay with extending the
>> machinery used by super() so that you can access the magic "current
>> class" variable explicitly too.
>
> No need to extend it, that info is already available for explicit reference:
>
>>>> class C:
> ...     def f(self):
> ...         print(__class__)
> ...
>>>> C().f()
> <class '__main__.C'>
>
> (see postscript for more details)

Awesome.

>> - For "current module" I've encountered a number of use cases, mostly
>> having to do with wanting to define new names dynamically. Somehow I
>> have found:
>>
>>   globals()[x] = y  # Note that x is a variable, not a literal
>>
>> cumbersome; I'd rather write:
>>
>>   setattr(__this_module__, x, y)
>>
>> There are IIRC also some use cases where an API expects a module
>> object (or at least something whose attributes it can set and/or get)
>> and passing the current module is clumsy:
>>
>>   foo(sys.modules[__name__])
>
> While this may sound a little hypocritical coming from the author of
> PEPs 366 and 395, I'm wary of adding new implicit module globals for
> problems with relatively simple and robust alternatives. In this case,
> it's fairly easy to get access to the current module using the idiom
> Guido quoted:
>
>     import sys
>     _this = sys.modules[__name__]
>
> (or using dict-style access on globals())

Yeah, well, in most cases I find having to reference sys.modules a distraction and an unwarranted jump into the implementation. It may not even work: there are some recipes that replace sys.modules[__name__] with some wrapper object. If __this_module__ existed it would of course refer to the "real" module object involved.

--
--Guido van Rossum (python.org/~guido)

From jeanpierreda at gmail.com Sun Aug 7 15:07:20 2011
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Sun, 7 Aug 2011 09:07:20 -0400
Subject: [Python-ideas] change NoneType, NotImplementedType, & ellipses to return the appropriate singleton
In-Reply-To: References: <4E32E1CD.4020906@stoneleaf.us>
Message-ID:

Does it? What else would you use type(None) for? An isinstance() check?

Devin

On Sun, Aug 7, 2011 at 4:12 AM, Georg Brandl wrote:
> Am 07.08.2011 01:34, schrieb dag.odenhall at gmail.com:
>> On 29 July 2011 19:03, Guido van Rossum wrote:
>>> I think it is fine if type(None)() returns None instead of raising an exception.
>>
>> +1, I've often wanted it and felt (lambda: None) was somewhat clunky.
>
> It makes its intent much clearer than type(None) though.
> > Georg > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From guido at python.org Sun Aug 7 15:15:27 2011 From: guido at python.org (Guido van Rossum) Date: Sun, 7 Aug 2011 09:15:27 -0400 Subject: [Python-ideas] change NoneType, NotImplementedType, & ellipses to return the appropriate singleton In-Reply-To: References: <4E32E1CD.4020906@stoneleaf.us> Message-ID: Frankly, type(None) only exists in the language because all objects have a type, and have to be introspectable in a uniform manner. This is occasionally useful to introspection code, but of no practical consequence for most users. Using type(None) as a cheap alternative to lambda:None strikes me as an invitation for unreadable code -- since type(None) doesn't have much of a practical purpose, users have no expectation of what it would do (even though you can reason it out). --Guido On Sun, Aug 7, 2011 at 9:07 AM, Devin Jeanpierre wrote: > Does it? What else would you use type(None) for? An isinstance() check? > > Devin > > On Sun, Aug 7, 2011 at 4:12 AM, Georg Brandl wrote: >> Am 07.08.2011 01:34, schrieb dag.odenhall at gmail.com: >>> On 29 July 2011 19:03, Guido van Rossum wrote: >>>> I think it is fine if type(None)() returns None instead of raising an exception. >>> >>> +1, I've often wanted it and felt (lambda: None) was somewhat clunky. >> >> It makes its intent much clearer than type(None) though. >> >> Georg >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (python.org/~guido) From g.brandl at gmx.net Sun Aug 7 15:38:56 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 07 Aug 2011 15:38:56 +0200 Subject: [Python-ideas] change NoneType, NotImplementedType, & ellipses to return the appropriate singleton In-Reply-To: References: <4E32E1CD.4020906@stoneleaf.us> Message-ID: Well, I wouldn't use it for anything. Georg Am 07.08.2011 15:07, schrieb Devin Jeanpierre: > Does it? What else would you use type(None) for? An isinstance() check? > > Devin > > On Sun, Aug 7, 2011 at 4:12 AM, Georg Brandl wrote: >> Am 07.08.2011 01:34, schrieb dag.odenhall at gmail.com: >>> On 29 July 2011 19:03, Guido van Rossum wrote: >>>> I think it is fine if type(None)() returns None instead of raising an exception. >>> >>> +1, I've often wanted it and felt (lambda: None) was somewhat clunky. >> >> It makes its intent much clearer than type(None) though. >> >> Georg >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> From dag.odenhall at gmail.com Sun Aug 7 16:18:39 2011 From: dag.odenhall at gmail.com (dag.odenhall at gmail.com) Date: Sun, 7 Aug 2011 16:18:39 +0200 Subject: [Python-ideas] Conventions for Function Annotations In-Reply-To: References: Message-ID: On 7 August 2011 14:59, Devin Jeanpierre wrote: > What if you want to say that it raises both ValueError and ArithmeticError? > > Annotations never did support multiple annotations, it's a big > drawback. This is a decorator though, you can do what you want. How > about making __annotations__['raise'] a list? > > Otherwise sounds fine to me. 
The use of keywords in __annotations__ is > pretty clever. :) Neither does return support multiple values, and yet we can do "return x, y". :) Annotations are bound to the evaluation of an expression and you're free to put anything in them. I expect a convention of special-casing tuples will emerge, especially as annotations are particularly suitable for different sorts of "type hints" where tuples are already used for issubclass/isinstance, in "except" clauses and for listing base classes with type() etc. I think annotations should be used sparingly: there's little need for any sort of interoperability between multiple unrelated uses of annotations. It would be rather complicated and unwieldy to say, both put type hints and argument documentation strings in annotations. Decorators are better suited for composability, as are docstrings for documentation. If you're doing something like mapping a view function to a template with the return annotation, it is unlikely that you need to annotate it with type hints as well, for example. From jeanpierreda at gmail.com Sun Aug 7 16:26:03 2011 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Sun, 7 Aug 2011 10:26:03 -0400 Subject: [Python-ideas] Conventions for Function Annotations In-Reply-To: References: Message-ID: > Neither does return support multiple values, and yet we can do "return x, y". :) That's neither here nor there. That is a single return value, a tuple. Not different kinds of return values, such as returning something that might be annotated as a "sequence" as well as, in other circumstances, returning something that might be annotated as a "dict". Since few to no sanely written functions return incompatible types, there isn't as much reason to complain about this. On the other hand, almost every single function in the stdlib can raise multiple different exceptions. A "raises" annotation is not very useful unless you can annotate multiple exceptions that may be raised. > I think annotations should be used sparingly: there's little need for > any sort of interoperability between multiple unrelated uses of > annotations. It would be rather complicated and unwieldy to say, both > put type hints and argument documentation strings in annotations. > Decorators are better suited for composability, as are docstrings for > documentation. If you're doing something like mapping a view function > to a template with the return annotation, it is unlikely that you need > to annotate it with type hints as well, for example. The decorator you have provided is _not_ composable. In fact, that was my complaint. I never said anything about docstrings or mixing and matching annotations. Devin On Sun, Aug 7, 2011 at 10:18 AM, dag.odenhall at gmail.com wrote: > On 7 August 2011 14:59, Devin Jeanpierre wrote: >> What if you want to say that it raises both ValueError and ArithmeticError? >> >> Annotations never did support multiple annotations, it's a big >> drawback. This is a decorator though, you can do what you want. How >> about making __annotations__['raise'] a list? >> >> Otherwise sounds fine to me. The use of keywords in __annotations__ is >> pretty clever. :) > > Neither does return support multiple values, and yet we can do "return x, y". :) > > Annotations are bound to the evaluation of an expression and you're > free to put anything in them. 
> I expect a convention of special-casing
> tuples will emerge, especially as annotations are particularly
> suitable for different sorts of "type hints" where tuples are already
> used for issubclass/isinstance, in "except" clauses and for listing
> base classes with type() etc.
>
> I think annotations should be used sparingly: there's little need for
> any sort of interoperability between multiple unrelated uses of
> annotations. It would be rather complicated and unwieldy to say, both
> put type hints and argument documentation strings in annotations.
> Decorators are better suited for composability, as are docstrings for
> documentation. If you're doing something like mapping a view function
> to a template with the return annotation, it is unlikely that you need
> to annotate it with type hints as well, for example.

From dag.odenhall at gmail.com Sun Aug 7 16:42:19 2011
From: dag.odenhall at gmail.com (dag.odenhall at gmail.com)
Date: Sun, 7 Aug 2011 16:42:19 +0200
Subject: [Python-ideas] Conventions for Function Annotations
In-Reply-To: References: Message-ID:

On 7 August 2011 16:26, Devin Jeanpierre wrote:
>> Neither does return support multiple values, and yet we can do "return x, y". :)
>
> That's neither here nor there. That is a single return value, a tuple.
> Not different kinds of return values, such as returning something that
> might be annotated as a "sequence" as well as, in other circumstances,
> returning something that might be annotated as a "dict".

That was exactly the point I was trying to demonstrate. It doesn't make sense to put any sort of special support for multiple values in annotations, because you're really just reimplementing conventional use of tuples.

>
> Since few to no sanely written functions return incompatible types,
> there isn't as much reason to complain about this.
>
> On the other hand, almost every single function in the stdlib can
> raise multiple different exceptions. A "raises" annotation is not very
> useful unless you can annotate multiple exceptions that may be raised.
>
>> I think annotations should be used sparingly: there's little need for
>> any sort of interoperability between multiple unrelated uses of
>> annotations. It would be rather complicated and unwieldy to say, both
>> put type hints and argument documentation strings in annotations.
>> Decorators are better suited for composability, as are docstrings for
>> documentation. If you're doing something like mapping a view function
>> to a template with the return annotation, it is unlikely that you need
>> to annotate it with type hints as well, for example.
>
> The decorator you have provided is _not_ composable. In fact, that was
> my complaint. I never said anything about docstrings or mixing and
> matching annotations.

Actually I (quite intentionally) didn't provide any decorator. If really desired, the fictional @raises decorator could check if 'raise' is already set and then combine them into a tuple. I'd argue that this would be inconsistent with syntactical annotations, and @raises should either override, fail, or silently do nothing, in case of the key already being set. More consistently, if you wanted to set a tuple, well then you pass a tuple to the decorator. This could perhaps work via unpacking as well: @raises(ValueError, ArithmeticError).

My intent with this thread was to consider conventions for mimicking syntactical annotations in the cases not covered by syntax.
If you want to do something more domain-specific using annotations, then by all means go right ahead, but consider that you might be better off using a custom function attribute as well.

>
> Devin
>
> On Sun, Aug 7, 2011 at 10:18 AM, dag.odenhall at gmail.com
> wrote:
>> On 7 August 2011 14:59, Devin Jeanpierre wrote:
>>> What if you want to say that it raises both ValueError and ArithmeticError?
>>>
>>> Annotations never did support multiple annotations, it's a big
>>> drawback. This is a decorator though, you can do what you want. How
>>> about making __annotations__['raise'] a list?
>>>
>>> Otherwise sounds fine to me. The use of keywords in __annotations__ is
>>> pretty clever. :)
>>
>> Neither does return support multiple values, and yet we can do "return x, y". :)
>>
>> Annotations are bound to the evaluation of an expression and you're
>> free to put anything in them. I expect a convention of special-casing
>> tuples will emerge, especially as annotations are particularly
>> suitable for different sorts of "type hints" where tuples are already
>> used for issubclass/isinstance, in "except" clauses and for listing
>> base classes with type() etc.
>>
>> I think annotations should be used sparingly: there's little need for
>> any sort of interoperability between multiple unrelated uses of
>> annotations. It would be rather complicated and unwieldy to say, both
>> put type hints and argument documentation strings in annotations.
>> Decorators are better suited for composability, as are docstrings for
>> documentation. If you're doing something like mapping a view function
>> to a template with the return annotation, it is unlikely that you need
>> to annotate it with type hints as well, for example.
>>

From dag.odenhall at gmail.com Sun Aug 7 16:50:18 2011
From: dag.odenhall at gmail.com (dag.odenhall at gmail.com)
Date: Sun, 7 Aug 2011 16:50:18 +0200
Subject: [Python-ideas] ChainMap as a context manager
Message-ID:

How about adding the context manager protocol to collections.ChainMap, as an in-place combination of new_child() and parents? The effect would simulate lexical scope using the with-statement.

Trivial to do by hand, but always nice when builtin or stdlib types support the context manager protocol. :)
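A rough sketch of one possible hand-rolled version (ScopingChainMap is a made-up name, and this is one guess at the intended semantics, not actual stdlib behaviour; ChainMap itself is new in the upcoming 3.3):

    from collections import ChainMap

    class ScopingChainMap(ChainMap):
        def __enter__(self):
            self.maps.insert(0, {})   # push a fresh child scope, in place
            return self
        def __exit__(self, *exc_info):
            del self.maps[0]          # pop back to the parent scope
            return False

    scope = ScopingChainMap({'x': 1})
    with scope:
        scope['x'] = 2                # writes land in the innermost scope
        assert scope['x'] == 2
    assert scope['x'] == 1            # the inner scope is gone on exit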
From mikegraham at gmail.com Sun Aug 7 17:02:15 2011
From: mikegraham at gmail.com (Mike Graham)
Date: Sun, 7 Aug 2011 11:02:15 -0400
Subject: [Python-ideas] ChainMap as a context manager
In-Reply-To: References: Message-ID:

On Sun, Aug 7, 2011 at 10:50 AM, dag.odenhall at gmail.com wrote:
> How about adding the context manager protocol to collections.ChainMap,
> as an in-place combination of new_child() and parents? The effect
> would simulate lexical scope using the with-statement.
>
> Trivial to do by hand, but always nice when builtin or stdlib types
> support the context manager protocol. :)

I don't understand the utility of such a thing. Can you post a use case or two to make it clearer to people like me?

Mike

From jeanpierreda at gmail.com Sun Aug 7 17:04:42 2011
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Sun, 7 Aug 2011 11:04:42 -0400
Subject: [Python-ideas] Conventions for Function Annotations
In-Reply-To: References: Message-ID:

You seem to believe that type-checking is OK when it's tuples. I don't agree with that philosophy, and there isn't a common ground for us to discuss this on. Sorry for intruding.

Devin

On Sun, Aug 7, 2011 at 10:42 AM, dag.odenhall at gmail.com wrote:
> On 7 August 2011 16:26, Devin Jeanpierre wrote:
>>> Neither does return support multiple values, and yet we can do "return x, y". :)
>>
>> That's neither here nor there. That is a single return value, a tuple.
>> Not different kinds of return values, such as returning something that
>> might be annotated as a "sequence" as well as, in other circumstances,
>> returning something that might be annotated as a "dict".
>
> That was exactly the point I was trying to demonstrate. It doesn't
> make sense to put any sort of special support for multiple values in
> annotations, because you're really just reimplementing conventional
> use of tuples.
>
>>
>> Since few to no sanely written functions return incompatible types,
>> there isn't as much reason to complain about this.
>>
>> On the other hand, almost every single function in the stdlib can
>> raise multiple different exceptions. A "raises" annotation is not very
>> useful unless you can annotate multiple exceptions that may be raised.
>>
>>> I think annotations should be used sparingly: there's little need for
>>> any sort of interoperability between multiple unrelated uses of
>>> annotations. It would be rather complicated and unwieldy to say, both
>>> put type hints and argument documentation strings in annotations.
>>> Decorators are better suited for composability, as are docstrings for
>>> documentation. If you're doing something like mapping a view function
>>> to a template with the return annotation, it is unlikely that you need
>>> to annotate it with type hints as well, for example.
>>
>> The decorator you have provided is _not_ composable. In fact, that was
>> my complaint. I never said anything about docstrings or mixing and
>> matching annotations.
>
> Actually I (quite intentionally) didn't provide any decorator. If
> really desired, the fictional @raises decorator could check if 'raise'
> is already set and then combine them into a tuple. I'd argue that this
> would be inconsistent with syntactical annotations, and @raises should
> either override, fail, or silently do nothing, in case of the key
> already being set. More consistently, if you wanted to set a tuple,
> well then you pass a tuple to the decorator. This could perhaps work
> via unpacking as well: @raises(ValueError, ArithmeticError).
>
> My intent with this thread was to consider conventions for mimicking
> syntactical annotations in the cases not covered by syntax. If you
> want to do something more domain-specific using annotations, then by
> all means go right ahead, but consider that you might be better off
> using a custom function attribute as well.
>
>>
>> Devin
>>
>> On Sun, Aug 7, 2011 at 10:18 AM, dag.odenhall at gmail.com
>> wrote:
>>> On 7 August 2011 14:59, Devin Jeanpierre wrote:
>>>> What if you want to say that it raises both ValueError and ArithmeticError?
>>>>
>>>> Annotations never did support multiple annotations, it's a big
>>>> drawback. This is a decorator though, you can do what you want. How
>>>> about making __annotations__['raise'] a list?
>>>>
>>>> Otherwise sounds fine to me. The use of keywords in __annotations__ is
>>>> pretty clever. :)
>>>
>>> Neither does return support multiple values, and yet we can do "return x, y". :)
>>>
>>> Annotations are bound to the evaluation of an expression and you're
>>> free to put anything in them.
>>> I expect a convention of special-casing
>>> tuples will emerge, especially as annotations are particularly
>>> suitable for different sorts of "type hints" where tuples are already
>>> used for issubclass/isinstance, in "except" clauses and for listing
>>> base classes with type() etc.
>>>
>>> I think annotations should be used sparingly: there's little need for
>>> any sort of interoperability between multiple unrelated uses of
>>> annotations. It would be rather complicated and unwieldy to say, both
>>> put type hints and argument documentation strings in annotations.
>>> Decorators are better suited for composability, as are docstrings for
>>> documentation. If you're doing something like mapping a view function
>>> to a template with the return annotation, it is unlikely that you need
>>> to annotate it with type hints as well, for example.
>>>
>>
>

From grosser.meister.morti at gmx.net Sun Aug 7 17:31:10 2011
From: grosser.meister.morti at gmx.net (Mathias Panzenböck)
Date: Sun, 07 Aug 2011 17:31:10 +0200
Subject: [Python-ideas] Conventions for Function Annotations
In-Reply-To: References: Message-ID: <4E3EAFBE.70909@gmx.net>

On 08/07/2011 01:27 PM, dag.odenhall at gmail.com wrote:
> This isn't really a proposal for any sort of language/stdlib change,
> rather I felt like discussing the potential for informal standard
> conventions with annotations.
>
> Probably for the better, there is no special syntax for annotating
> raised exceptions or yields from a generator. In line with how the
> "->" syntax sets the 'return' key, I suggest an informal standard of
> representing keywords this way, for example in a decorator for
> raise/yield annotations:
>
> # third-party decorator
> @raises(ValueError)
> def foo():pass
>
> assert foo.__annotations__['raise'] == ValueError
>

I wrote something using the foo.__annotations__ convention a while back, but it wasn't received well on this mailing list.
Here it is again now with an added `annotations` decorator:

"""
>>> @annotation
... def raises(*exceptions):
...     return exceptions
...
>>> @raises(TypeError)
... def foo():
...     pass
...
>>> getannot(foo, 'raises')
(<class 'TypeError'>,)
"""

from functools import wraps

def annotations(**annots):
    def deco(obj):
        if hasattr(obj, '__annotations__'):
            obj.__annotations__.update(annots)
        else:
            obj.__annotations__ = annots
        return obj
    return deco

_NONE = object()

def getannot(obj, key, default=_NONE):
    if hasattr(obj, '__annotations__'):
        if default is _NONE:
            return obj.__annotations__[key]
        else:
            return obj.__annotations__.get(key, default)
    elif default is _NONE:
        raise KeyError(key)
    else:
        return default

def setannot(obj, key, value):
    if hasattr(obj, '__annotations__'):
        obj.__annotations__[key] = value
    else:
        obj.__annotations__ = {key: value}

def hasannot(obj, key):
    if hasattr(obj, '__annotations__'):
        return key in obj.__annotations__
    else:
        return False

def annotation(annotfunc):
    if hasattr(annotfunc, '__name__'):
        key = annotfunc.__name__
    else:
        key = annotfunc.func_name
    @wraps(annotfunc)
    def params(*args, **kwargs):
        def deco(obj):
            setannot(obj, key, annotfunc(*args, **kwargs))
            return obj
        return deco
    return params

From guido at python.org Sun Aug 7 18:31:33 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 7 Aug 2011 12:31:33 -0400
Subject: [Python-ideas] Conventions for Function Annotations
In-Reply-To: References: Message-ID:

On Sun, Aug 7, 2011 at 7:27 AM, dag.odenhall at gmail.com wrote:
> This isn't really a proposal for any sort of language/stdlib change,
> rather I felt like discussing the potential for informal standard
> conventions with annotations.
>
> Probably for the better, there is no special syntax for annotating
> raised exceptions or yields from a generator. In line with how the
> "->" syntax sets the 'return' key, I suggest an informal standard of
> representing keywords this way, for example in a decorator for
> raise/yield annotations:
>
> # third-party decorator
> @raises(ValueError)
> def foo():pass
>
> assert foo.__annotations__['raise'] == ValueError
>
> This might be an obvious solution, but I just wanted to "put it out
> there" up front, before inconsistent workarounds emerge. This
> convention should work because it is primarily needed for control
> structures that are reserved keywords anyway. The one exception I can
> think of is the inverse of yield: generator.send() - "send" could
> conflict with a function argument, and should therefore not be put in
> __annotations__. (A hack could be to use a different but semantically
> related keyword like 'import', or an otherwise invalid identifier like
> 'send()', but it might be best to simply not use __annotations__ for
> this.)

Hi Dag,

Are you currently using annotations? Could you post some of the cool usages that you are making of annotations? The explicit plan with annotations (read PEP 3107) was that significant use should precede the creation of conventions for use. So please don't wait until a convention has been established -- go ahead and have fun with them, and let us know what you are doing with them!

Re: declaring raised exceptions, IIUC Java is pretty much the only language supporting such a feature, and even there the current view is that they have not lived up to the expectation when the feature was designed. So I would rather do nothing in that area.
--
--Guido van Rossum (python.org/~guido)

From aquavitae69 at gmail.com Sun Aug 7 18:37:27 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Sun, 7 Aug 2011 18:37:27 +0200
Subject: [Python-ideas] change NoneType, NotImplementedType, & ellipses to return the appropriate singleton
In-Reply-To: References: <4E32E1CD.4020906@stoneleaf.us>
Message-ID:

Is there any reason not to allow similar behaviour for True and False? i.e. True() == True

On Aug 7, 2011 3:39 PM, "Georg Brandl" wrote:
> Well, I wouldn't use it for anything.
>
> Georg
>
> Am 07.08.2011 15:07, schrieb Devin Jeanpierre:
>> Does it? What else would you use type(None) for? An isinstance() check?
>>
>> Devin
>>
>> On Sun, Aug 7, 2011 at 4:12 AM, Georg Brandl wrote:
>>> Am 07.08.2011 01:34, schrieb dag.odenhall at gmail.com:
>>>> On 29 July 2011 19:03, Guido van Rossum wrote:
>>>>> I think it is fine if type(None)() returns None instead of raising an exception.
>>>>
>>>> +1, I've often wanted it and felt (lambda: None) was somewhat clunky.
>>>
>>> It makes its intent much clearer than type(None) though.
>>>
>>> Georg
>>>
>>> _______________________________________________
>>> Python-ideas mailing list
>>> Python-ideas at python.org
>>> http://mail.python.org/mailman/listinfo/python-ideas

From masklinn at masklinn.net Sun Aug 7 18:46:53 2011
From: masklinn at masklinn.net (Masklinn)
Date: Sun, 7 Aug 2011 18:46:53 +0200
Subject: [Python-ideas] change NoneType, NotImplementedType, & ellipses to return the appropriate singleton
In-Reply-To: References: <4E32E1CD.4020906@stoneleaf.us>
Message-ID: <2EFA68BB-861E-4593-9CBF-989020D6C5A8@masklinn.net>

On 2011-08-07, at 18:37 , David Townshend wrote:
> Is there any reason not to allow similar behaviour for True and False? i.e.
> True() == True

As far as I understand, that's not the proposal for NoneType (or NotImplementedType or ellipsis) at all.

You're suggesting that the *value* could be callable (and return itself); the proposal is about *types*, and them becoming "normal" (callable) types: currently, if you get a hold of NoneType, NotImplementedType or ellipsis (generally via a call to `type`) you can not use it to get an instance of that class back:

>>> type(None)()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot create 'NoneType' instances
>>> type(NotImplemented)()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot create 'NotImplementedType' instances
>>> type(...)()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot create 'ellipsis' instances

This proposal is merely about removing this restriction.

Booleans don't have this issue: `bool` is already a normal (callable) type, and returns `False` when called without arguments.

From stefan_ml at behnel.de Sun Aug 7 18:51:41 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 07 Aug 2011 18:51:41 +0200
Subject: [Python-ideas] change NoneType, NotImplementedType, & ellipses to return the appropriate singleton
In-Reply-To: References: <4E32E1CD.4020906@stoneleaf.us>
Message-ID:

David Townshend, 07.08.2011 18:37:
> Is there any reason not to allow similar behaviour for True and False? i.e.
> True() == True

That's not similar at all.
But we already have

>>> type(True)()
False

which, written in a less surprising way, gives

>>> type(False)()
False

Stefan

From aquavitae69 at gmail.com Sun Aug 7 19:38:29 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Sun, 7 Aug 2011 19:38:29 +0200
Subject: [Python-ideas] change NoneType, NotImplementedType, & ellipses to return the appropriate singleton
In-Reply-To: <2EFA68BB-861E-4593-9CBF-989020D6C5A8@masklinn.net>
References: <4E32E1CD.4020906@stoneleaf.us> <2EFA68BB-861E-4593-9CBF-989020D6C5A8@masklinn.net>
Message-ID:

Ah, sorry - I didn't read it properly!

On Aug 7, 2011 6:46 PM, "Masklinn" wrote:
> On 2011-08-07, at 18:37 , David Townshend wrote:
>> Is there any reason not to allow similar behaviour for True and False? i.e.
>> True() == True
> As far as I understand, that's not the proposal for NoneType (or NotImplementedType or ellipsis) at all.
>
> You're suggesting that the *value* could be callable (and return itself); the proposal is about *types*, and them becoming "normal" (callable) types: currently, if you get a hold of NoneType, NotImplementedType or ellipsis (generally via a call to `type`) you can not use it to get an instance of that class back:
>
> >>> type(None)()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: cannot create 'NoneType' instances
> >>> type(NotImplemented)()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: cannot create 'NotImplementedType' instances
> >>> type(...)()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: cannot create 'ellipsis' instances
>
> This proposal is merely about removing this restriction.
>
> Booleans don't have this issue: `bool` is already a normal (callable) type, and returns `False` when called without arguments.

From jimjjewett at gmail.com Sun Aug 7 20:00:36 2011
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sun, 7 Aug 2011 14:00:36 -0400
Subject: [Python-ideas] Access to function objects
In-Reply-To: <4E3E2445.2070403@pearwood.info>
References: <4E3E2445.2070403@pearwood.info>
Message-ID:

On Sun, Aug 7, 2011 at 1:36 AM, Steven D'Aprano wrote:
> For most uses, standard recursion via the name is good enough, it's only
> a few corner cases where self-reflection (as I call it) is needed.

Self-reflection is a purity issue.
Are you worried about obscure bugs when something else hijacks the name of your function (or module or class), but your function is still somehow runnable?

Practicality Beats Purity, but purity still has some value. I will also note that names are particularly likely to get reused in some contexts (GUIs, security proxies) where the writer of the original code can't rely on anything about the runtime environment.

> I don't think that use-case is so important that it should be implicitly added
> to every function, on the off-chance it is needed, rather than explicitly on
> demand.

Agreed.

-jJ

From guido at python.org Sun Aug 7 20:15:02 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 7 Aug 2011 14:15:02 -0400
Subject: [Python-ideas] Access to function objects
In-Reply-To: References: <4E3E2445.2070403@pearwood.info>
Message-ID:

On Sun, Aug 7, 2011 at 2:00 PM, Jim Jewett wrote:
> On Sun, Aug 7, 2011 at 1:36 AM, Steven D'Aprano wrote:
>
>> For most uses, standard recursion via the name is good enough, it's only
>> a few corner cases where self-reflection (as I call it) is needed.
>
> Self-reflection is a purity issue. Are you worried about obscure bugs
> when something else hijacks the name of your function (or module or
> class), but your function is still somehow runnable?

If you worry about that every time you write a recursive function you'll go insane.

> Practicality Beats Purity, but purity still has some value. I will
> also note that names are particularly likely to get reused in some
> contexts (GUIs, security proxies) where the writer of the original
> code can't rely on anything about the runtime environment.

There are some standard situations where some standard recipes apply. But it's best to use those sparingly or your code will become unreadable.

>> I don't think that use-case is so important that it should be implicitly added
>> to every function, on the off-chance it is needed, rather than explicitly on
>> demand.

Right.

--
--Guido van Rossum (python.org/~guido)

From ericsnowcurrently at gmail.com Sun Aug 7 20:21:14 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Sun, 7 Aug 2011 12:21:14 -0600
Subject: [Python-ideas] Access to function objects
In-Reply-To: References: <4E3E2445.2070403@pearwood.info>
Message-ID:

On Aug 7, 2011 6:11 AM, "Guido van Rossum" wrote:
>
> On Sun, Aug 7, 2011 at 3:46 AM, Eric Snow wrote:
> > On Sat, Aug 6, 2011 at 11:36 PM, Steven D'Aprano wrote:
> >> Eric Snow wrote:
> >>> Why not bind the called function-object to the frame locals, rather
> >>> than the one for which the code object was created, perhaps as
> >>> "__function__"?
> >>
> >> I'm afraid I can't interpret this (which may be my ignorance rather than
> >> your fault).
> >
> > No, the fault is likely mine. But it seems so clear to me. :)
> >
> >> The only guess I can make is based on what you say later:
> >>
> >> "One new implicit name in locals().
> >>
> >> so I presume you mean that the function should see a local variable (perhaps
> >> called "me", or "this"?) that is bound to itself.
> >
> > One called __function__ (or the like). A "dunder" name is used to
> > indicate its special nature and limit conflict with existing code.
>
> Without thinking too much about this I like it.
>
> > The function that was called would be bound to that name at function
> > execution time. Keep in mind that I am talking about the frame
> > locals, not anything stored on the code object nor on the function
> > object. Not to overdramatize it, but it would happen at the beginning
> > of every call of every function. I don't know what that overhead
> > would be.
>
> It could be made into a "cell", which is the same way all locals are
> normally represented. This is very fast. Further the code for it could
> be triggered by the appearance of __function__ (if that's the keyword
> we choose) in the function body. I don't really care what happens if
> people use locals() -- that's inefficient and outmoded anyway. (Note
> that!)
>
> >> Presumably if a function wants to use that same name as a local, nothing bad
> >> will happen, since the local assignment will just override the implicit
> >> assignment. But what about code that expects to see a nonlocal or global
> >> with the same name?
>
> That's why a __dunder__ name is used.
>
> >> What happens when two functions, sharing the same code object, get called
> >> from two threads at the same time? Are their locals independent?
> >
> > I'm afraid I don't know. I expect that each would get executed in
> > separate execution frames, and so have separate frame locals.
>
> The frames are completely independent.
> They all point to the same code
> object and under the proposal they will all point to the same function
> object. I see no problems here except self-inflicted, like using
> __function__ to hold state that can't be accessed concurrently safely;
> note that recursive invocations have the same issue. I see it as a
> non-problem.
>
> >> For most uses, standard recursion via the name is good enough, it's only a
> >> few corner cases where self-reflection (as I call it) is needed.
>
> Right. If it were expected that people would start writing recursive
> calls using __function__ routinely, in situations where a name
> reference works, I'd be very unhappy with the new feature. (And if
> someone wants to make the argument that recursive calls using
> __function__ are actually better in some way I am willing to
> filibuster.)
>
> >> And I say
> >> that as somebody who does want a way for functions to know themselves. I
> >> don't think that use-case is so important that it should be implicitly added
> >> to every function, on the off-chance it is needed, rather than explicitly on
> >> demand.
> >
> > For me the use case involves determining what function called my
> > function. Currently you can tell in which execution frame a function
> > was called, and thereby which code object, but reliably matching that
> > to a function is not so simple. I recognize that my case is likely
> > not a general one.
>
> But it is a nice one. It solves some issues that pdb currently solves
> by just using a file/line reference.
>
> >>> To finish things off, bind to every new code object
> >>> the function for which it was created, perhaps as "co_func". That way
> >>> you will always know what function object was called and which one the
> >>> code object came from originally.
> >>
> >> What benefit will this give? Have you ever looked at a code object and said,
> >> "I need a way of knowing which function this is from?" If so, I'd love to
> >> know what problem you were trying to solve at the time!
> >
> > You caught me! :) I don't already have a use case for this part. I
> > had only considered that without this you could not determine where a
> > code object came from, or if a function had borrowed another's code
> > object. This is certainly only useful in the case that one function
> > is using the code object of another, which we have all agreed is not
> > that common. However, with a co_func I felt that all the bases would
> > be covered.
>
> Ah, but you can't do that! There are many situations where a single
> code object is used to create many different function objects. E.g.
> every time you have a nested function. Also the code object is
> immutable. This part is all carefully considered and should be left
> alone.

From all the responses it's apparent I have not communicated the idea well. The idea is for the *called* function object to be bound to __function__. Here's an example:

def g():
    def f():
        return
    return f

f1 = g()
f1()
f2 = g()
f2()

(Thanks for pointing out that f1 and f2 share a code object.)

In the call to g, g would be bound to __function__ in the function body. In the call to f1, __function__ would be f1 (in f_locals). And in the call to f2, __function__ would be f2.

Also, while __function__ is not used in this example, it would still be there in the frame locals. Otherwise a function called by f would be unable to use it (via inspect.stack() and the like).

Also, at definition time the original function would be bound as co_func. It would not be associated with __function__, except indirectly. This is so that you can tell the difference between the original function and one that is using the code object of the first.

-eric
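A small sketch of the limitation Eric describes (callee and g are made-up names; sys._getframe is CPython-specific):

    import sys

    def callee():
        # the caller's execution frame readily yields its *code object* ...
        return sys._getframe(1).f_code

    def g():
        return callee()

    # ... but nothing in the frame hands us the function object itself:
    print(g() is g.__code__)  # True -- we had to match it up by hand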
> >> Code objects don't always get created as part of a function. They can be
> >> returned by compile. What should co_func be set to then?
> >
> > None, since there was no function object created along with the code
> > object. Same with generator expressions.
>
> Just forget this part.
>
> >> Finally, if the function has a reference to the code object, and the code
> >> object has a reference to the function, you have a reference cycle. That's
> >> not the end of the world now as it used to be, in early Python before the
> >> garbage collector was added, but still, there better be a really good
> >> use-case to justify it.
> >>
> >> (Perhaps a weak reference might be more appropriate?)
> >
> > Good point.
> >
> > Mostly I am trying to look for an angle that works without a lot of
> > trouble. Can't fault me for trying in my own incoherent way. :)
>
> On the rest of that rejected PEP:
>
> - I'm not actually sure how easy it is to implement the setting of
> __function__ when the frame is created. IIRC the frame creation is
> rather far removed from the function object, as there are various
> cases where there is no function object (class bodies, module-level
> code) and in other cases the function is called via a bound method.
> Someone should write a working patch to figure out if this is a
> problem in practice.
>
> - The primary use case for __function__ to me seems to access function
> attributes, but I'm not sure what's wrong with referencing these via
> the function name. Maybe it's when there's a method involved, since
> then you'd have to write ...
>
> - It seems that the "current class" question has already been solved
> for super(). If more is needed I'd be okay with extending the
> machinery used by super() so that you can access the magic "current
> class" variable explicitly too.
>
> - For "current module" I've encountered a number of use cases, mostly
> having to do with wanting to define new names dynamically. Somehow I
> have found:
>
>   globals()[x] = y  # Note that x is a variable, not a literal
>
> cumbersome; I'd rather write:
>
>   setattr(__this_module__, x, y)
>
> There are IIRC also some use cases where an API expects a module
> object (or at least something whose attributes it can set and/or get)
> and passing the current module is clumsy:
>
>   foo(sys.modules[__name__])
>
> On the whole these use cases are all fairly weak though and I would
> give it a +0 at best. But rather than a prolonged discussion of the
> merits and use cases, I strongly recommend that somebody tries to come
> up with a working implementation and we'll strengthen the PEP from
> there.
>
> --
> --Guido van Rossum (python.org/~guido)

From jimjjewett at gmail.com Sun Aug 7 20:25:55 2011
From: jimjjewett at gmail.com (Jim Jewett)
Date: Sun, 7 Aug 2011 14:25:55 -0400
Subject: [Python-ideas] Access to function objects
In-Reply-To: References: <4E3E2445.2070403@pearwood.info>
Message-ID:

On Sun, Aug 7, 2011 at 8:10 AM, Guido van Rossum wrote:
> On Sun, Aug 7, 2011 at 3:46 AM, Eric Snow wrote:
>> On Sat, Aug 6, 2011 at 11:36 PM, Steven D'Aprano wrote:
>>> Eric Snow wrote:
> Right. If it were expected that people would start writing recursive
> calls using __function__ routinely, in situations where a name
> reference works, I'd be very unhappy with the new feature. (And if
> someone wants to make the argument that recursive calls using
> __function__ are actually better in some way I am willing to
> filibuster.)

They are better from a purity standpoint ... if your function is later renamed (or stored anonymously in an array of functions), and then the name is reused, then using the name allows silent bugs.
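A made-up illustration of that hazard (fact and funcs are arbitrary names):

    def fact(n):
        # the recursive call goes through the module-level name 'fact'
        return 1 if n < 2 else n * fact(n - 1)

    funcs = [fact]        # the function object gets stored somewhere ...
    fact = lambda n: -1   # ... and later the name is quietly reused

    print(funcs[0](5))    # -5: the stored function now recurses into the lambda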
If it were expected that people would start writing recursive > calls using __function__ routinely, in situations where a name > reference works, I'd be very unhappy with the new feature. (And if > someone wants to make the argument that recursive calls using > __function__ are actually better in some way I am willing to > filibuster.) They are better from a purity standpoint ... if your function is later renamed (or stored anonymously in an array of functions), and then the name is reused, then using the name allows silent bugs. As a style guide, I can certainly understand "Don't do that", but I prefer to program libraries defensively, because that sort of bug won't show up until the system is already pretty complex. Using __function__ is similar to using self instead of presuming that you know the class name, except that overridden class attributes (vs self) are far more common (and therefore also far easier to catch when debugging). > There are IIRC also some use cases where an API expects a module > object (or at least something whose attributes it can set and/or get) > and passing the current module is clumsy:
>
>     foo(sys.modules[__name__])
>
And it may not work, if the installation is complex enough that there are multiple modules with the same name. For me, this falls in the gray area of "Don't do that, but beware that others might." -jJ
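(A toy case makes the failure mode concrete; the names here are invented purely for illustration:)

    def fact(n):
        return 1 if n < 2 else n * fact(n - 1)   # recursion via the module-level name

    handlers = [fact]      # stored anonymously somewhere else
    fact = lambda n: -1    # ... and later the name is innocently reused

    print(handlers[0](3))  # -3, not 6: the recursive call found the new binding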
From solipsis at pitrou.net Sun Aug 7 21:35:47 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 7 Aug 2011 21:35:47 +0200
Subject: [Python-ideas] Access to function objects
References: <4E3E2445.2070403@pearwood.info>
Message-ID: <20110807213547.2a0f6ab7@pitrou.net>

On Sun, 7 Aug 2011 09:07:02 -0400 Guido van Rossum wrote: > > > > While this may sound a little hypocritical coming from the author of > > PEPs 366 and 395, I'm wary of adding new implicit module globals for > > problems with relatively simple and robust alternatives. In this case, > > it's fairly easy to get access to the current module using the idiom > > Guido quoted:
> >
> >     import sys
> >     _this = sys.modules[__name__]
> >
> > (or using dict-style access on globals()) > > Yeah, well, in most cases I find having to reference sys.modules a > distraction and an unwarranted jump into the implementation. It may > not even work: there are some recipes that replace > sys.modules[__name__] with some wrapper object. If __this_module__ > existed it would of course refer to the "real" module object involved. It would also add a ton of reference cycles and might create new juicy problems in the module shutdown procedure :) Regards Antoine.

From tjreedy at udel.edu Sun Aug 7 22:46:19 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 07 Aug 2011 16:46:19 -0400
Subject: [Python-ideas] anonymous object support
In-Reply-To: References: <64FD0873-5ACB-4BD8-AB90-43EC9DE11215@gmail.com> <7B27CFDD-5C28-45E6-B124-D0C77EBA85C4@gmail.com>
Message-ID:

On 8/7/2011 8:26 AM, Guido van Rossum wrote: > 2011/8/6 Carl Matthew Johnson:
>> def _namedtuple(func):
>>     return namedtuple(func.__name__, func.__code__.co_varnames)
>>
>> @_namedtuple
>> def Point(x,y):
>>     pass
>>
>> That is very clever, but it kind of illustrates my point about needing a new keyword. When you see "def" don't you naturally think, "OK, what comes out of this will be a function named Point." But what comes out of this is not a function. It's a namedtuple, which is quite different? > > I'm not sure I find that much of an objection. There are plenty of > situations where decorators are used to seriously pervert the type of > the defined name. Plus, what's a function? A class can be called as > well. What's the difference? Calling the decorator NamedTuple might hint that the decorated result is the type of function we call a class. -- Terry Jan Reedy

From tjreedy at udel.edu Mon Aug 8 00:01:23 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 07 Aug 2011 18:01:23 -0400
Subject: [Python-ideas] Access to function objects
In-Reply-To: References: <4E3E2445.2070403@pearwood.info>
Message-ID:

On 8/7/2011 8:10 AM, Guido van Rossum wrote: >>> For most uses, standard recursion via the name is good enough, it's only a >>> few corner cases where self-reflection (as I call it) is needed. > > Right. If it were expected that people would start writing recursive > calls using __function__ routinely, in situations where a name > reference works, I'd be very unhappy with the new feature. I am willing to separate the recursion and attribute access use cases and not advocate that. I agree that

    def fact(n):  # requires int n >= 0
        return n*__function__(n-1) if n else 1

is less readable than fact(n-1) > (And if > someone wants to make the argument that recursive calls using > __function__ are actually better in some way I am willing to > filibuster.) Such calls would be slightly faster by avoiding a name lookup, but avoiding *any* function call by using iteration should be faster yet. For linear recursion (at most one call per call), this is usually trivial. In my book, I am stipulating that namespace manipulations that change the recursiveness of a function as written are 'forbidden' for the purpose of interpreting the code presented. This should be assumed or stated in other similar contexts. This restricts the recursion use case to multiple recursion (possibly multiple calls per call) in production library code that is not easily converted to iteration, is not intended to be read by anyone other than maintainers, and that might or even is expected to be used in an adverse environment. -- Terry Jan Reedy
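(For the linear case the iterative rewrite really is mechanical; a minimal sketch:)

    def fact(n):  # requires int n >= 0
        result = 1
        while n > 1:
            result *= n
            n -= 1
        return result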
From ncoghlan at gmail.com Mon Aug 8 01:56:32 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 8 Aug 2011 09:56:32 +1000
Subject: [Python-ideas] Access to function objects
In-Reply-To: References: <4E3E2445.2070403@pearwood.info>
Message-ID:

On Sun, Aug 7, 2011 at 11:07 PM, Guido van Rossum wrote: > On Sun, Aug 7, 2011 at 8:46 AM, Nick Coghlan wrote: >> With a PEP 3135 closure style solution, the cell reference would be >> filled in at function definition time, so that part shouldn't be an >> issue. > > Yes, I was thinking of something like that (though honestly I'd > forgotten some of the details :-). I'd forgotten many of the details as well, but was tracking down some super() strangeness recently (to answer a question Michael Foord asked, IIRC) and had to look it up. > IMO there is no doubt that if __function__ were to exist it should > reference the innermost function, i.e. the thing that was created by > the 'def' statement before any decorators were applied. Yeah, I'd mostly realised that by the time I finished writing my last message, but figured I'd record the train of thought that got me there. >> Reference by name lazily accesses the outermost one, but doesn't care >> how the decorators are applied (i.e. as part of the def statement or >> via post decoration). > > What do you mean here by lazily? Just the fact that the reference isn't resolved until the function executes rather than being resolved when it gets defined. >> A __class__ style cell reference to the result >> of the 'def' statement would behave differently in the post decoration >> case. > > Oh you were thinking of making it reference the result after > decoration? Maybe I know too much about the implementation, but I > would find that highly confusing. Do you even have a use case for > that? If so, I think it should be a separate name, e.g. > __decorated_function__. The only reason I was thinking that way is that currently, if you do something like [1]:

    @lru_cache()
    def fib(n):
        if n < 2:
            return n
        return fib(n-1) + fib(n-2)

then, at call time, 'fib' will resolve to the caching wrapper rather than to the undecorated function. Using a reference to the undecorated function instead (as would have to happen for a sane implementation of __func__) would be actively harmful since the recursive calls would bypass the cache unless the lru_cache decorator took steps to change the way the reference evolved:

    @lru_cache()
    def fib(n):
        if n < 2:
            return n
        return __func__(n-1) + __func__(n-2)  # Not the same, unless lru_cache adjusts the reference

This semantic mismatch has actually shifted my opinion from +0 to -1 on the idea. Relying on normal name lookup can be occasionally inconvenient, but it is at least clear what we're referring to. The existence of wrapper functions means that "this function" isn't as clear and unambiguous a phrase as it first seems. (I think the reason we get away with it in the PEP 3135 case is that 'class wrappers' typically aren't handled via class decorators but via metaclasses, which do a better job of playing nicely with the implicit closure created to handle super() and __class__) >> While referencing the innermost function would likely be wrong in any >> case involving function attributes, having the function in a valid >> state during decoration will likely mandate filling in the cell >> reference before invoking any decorators. Perhaps the best solution >> would be to syntactically reference the innermost function, but >> provide a clean way in functools to shift the cell reference to a >> different function (with functools.wraps doing that automatically). > > Hm, making it dynamic sounds wrong. I think it makes more sense to > just share the attribute dict (which is easily done through assignment > to the wrapping function's __dict__). Huh, I hadn't even thought of that as a potential alternative to the update() based approach currently used in functools.wraps (I had to jump into the interactive interpreter to confirm that functions really do let you swap out their instance dict). It's interesting that, once again, the status quo deals with this according to ordinary name resolution rules: any wrapping of the function will be ignored, *unless* we store the wrapper back into the original location so the name resolution in the function body will see it. Since the idea of implicitly sharing state between currently independent wrapper functions scares me, this strikes me as another reason to switch to '-1'.
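(The cache-friendly behaviour of the name-based spelling is easy to confirm with lru_cache's own introspection; a quick sketch:)

    from functools import lru_cache

    @lru_cache()
    def fib(n):
        return n if n < 2 else fib(n-1) + fib(n-2)

    fib(10)
    print(fib.cache_info())  # non-zero hits: the recursive calls went through the wrapper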
>> This does seem like an area ripe for subtle decoration related bugs >> though, especially by contrast with lazy name based lookup. > > TBH, personally I am in most cases unhappy with the aggressive copying > of docstring and other metadata from the wrapped function to the > wrapper function, and wish the idiom had never been invented. IIRC, I was the one who actually committed the stdlib blessing of the idiom in the form of 'functools.wraps'. It was definitely a hack to deal with the increasing prevalence of wrapper functions as decorators became more popular - naive introspection was giving too many wrong answers and tweaking the recommended wrapping process so that 'f.__doc__' would work again seemed like a better option than defining a complex introspection protocol to handle wrapped functions. I still think it was a reasonable way forward (and better than leaving things as they were), but it's definitely an approach with quite a few flaws. >> While this may sound a little hypocritical coming from the author of >> PEPs 366 and 395, I'm wary of adding new implicit module globals for >> problems with relatively simple and robust alternatives. In this case, >> it's fairly easy to get access to the current module using the idiom >> Guido quoted:
>>
>>     import sys
>>     _this = sys.modules[__name__]
>>
>> (or using dict-style access on globals()) > Yeah, well, in most cases I find having to reference sys.modules a > distraction and an unwarranted jump into the implementation. It may > not even work: there are some recipes that replace > sys.modules[__name__] with some wrapper object. If __this_module__ > existed it would of course refer to the "real" module object involved. Some invocations of runpy.run_module also cause the 'sys.modules' based idioms to fail, so there may be a case to be made for this one. I suspect some folks would use it to avoid global declarations as well (i.e. by just writing '__module__.x = y'). It might cause the cyclic GC some grief, though, so the implementation consequences would need to be investigated if someone wanted to pursue it. Cheers, Nick. [1] http://docs.python.org/dev/library/functools.html#functools.lru_cache -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From dag.odenhall at gmail.com Mon Aug 8 03:03:33 2011
From: dag.odenhall at gmail.com (dag.odenhall at gmail.com)
Date: Mon, 8 Aug 2011 03:03:33 +0200
Subject: [Python-ideas] Conventions for Function Annotations
In-Reply-To: References:
Message-ID:

> Hi Dag, > > Are you currently using annotations? Could you post some of the cool > usages that you are making of annotations? The explicit plan with > annotations (read PEP 3107) was that significant use should precede > the creation of conventions for use. So please don't wait until a > convention has been established -- go ahead and have fun with them, > and let us know what you are doing with them! I'm toying with them for adaptation, interfaces and dependency injection from a component registry. Each use is about type constants, but not necessarily in the vein of static type checking, which I think stands to show their strengths and "Pythonicity". I do kinda think there's some need of informal conventions, exemplified by Mathias' post: his decorator sets the 'raises' key, effectively making "raises" a reserved keyword in function arguments using the decorator! > Re: declaring raised exceptions, IIUC Java is pretty much the only > language supporting such a feature, and even there the current view is > that they have not lived up to the expectation when the feature was > designed. So I would rather do nothing in that area. Probably true, though probably in part because it isn't optional in Java. In Pythonland it's probably more useful to make it part of the documentation à la the :raises: field of Sphinx.
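(A concrete sketch of the kind of annotation-driven dependency injection described above; the registry and all of the names are hypothetical:)

    registry = {'db': 'a database connection'}  # stand-in component registry

    def inject(func):
        def wrapper(*args, **kwargs):
            for name, key in func.__annotations__.items():
                if name != 'return' and name not in kwargs:
                    kwargs[name] = registry[key]
            return func(*args, **kwargs)
        return wrapper

    @inject
    def handle(request, db: 'db'):
        return request, db

    print(handle('some request'))  # ('some request', 'a database connection')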
Much of the benefits of exceptions are that they can be ... unexpected, and bubble up: doing something like reraising unexpected exceptions as a TypeError would not likely be very useful (well, maybe a little useful with the new __cause__/__context__) and neither do exceptions fit in well with my other uses (adaptation, dependency injection) of annotations. Some convention for annotating 'yield' may still be useful though, although one alternative convention could use some form of "parametrized types" and the return annotation: foo() -> Iterator[tuple]. Now we just need to add this to ABCMeta. *cough* ;) From ncoghlan at gmail.com Mon Aug 8 05:01:13 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 8 Aug 2011 13:01:13 +1000 Subject: [Python-ideas] Conventions for Function Annotations In-Reply-To: References: Message-ID: On Mon, Aug 8, 2011 at 11:03 AM, dag.odenhall at gmail.com wrote: >> Hi Dag, >> >> Are you currently using annotations? Could you post some of the cool >> usages that you are making of annotations? The explicit plan with >> annotations (read PEP 3107) was that significant use should precede >> the creation of conventions for use. So please don't wait until a >> convention has been established -- go ahead and have fun with them, >> and let us know what you are doing with them! > > I'm toying with them for adaptation, interfaces and dependency > injection from a component registry. Each use is about type constants, > but not necessarily in the vein of static type checking, which I think > stands to show their strengths and "Pythonicity". > > I do kinda think there's some need of informal conventions, > examplified by Mathias post: his decorator sets the 'raises' key, > effectively making "raises" a reserved keyword in function arguments > using the decorator! So far, the general approach has been for annotations to be paired with decorator APIs, such that there is a cleaner less repetitive syntax that relies on function annotations and a more general (but more verbose) approach that uses arguments to a decorator factory. That approach seems to work well, with the latter API used to handle cases where a developer wants to use more than one annotation based decorator on a single function. The general principle is that any decorator that can use argument annotations should have an alternate decorator factory based API that can be used when necessary (usually either because the annotations are being used for something else or because the function being decorated is an existing one from another library that doesn't have any relevant annotations at all). Using the annotation namespace to store arbitrary metadata doesn't seem like a good idea at all. It is better to use the function attribute namespace for that kind of thing - don't forget about the old tools just because there is a shiny new tool to play with. > Some convention for annotating 'yield' may still be useful though, > although one alternative convention could use some form of > "parametrized types" and the return annotation: foo() -> > Iterator[tuple]. Now we just need to add this to ABCMeta. *cough* ;) Why not just use the return field on the generator function as is? The return type of calling something for which 'inspect.isgeneratorfunction(x)' is true is always going to be 'generator', so the return annotation is unlikely to be referring directly to that. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From dag.odenhall at gmail.com Mon Aug 8 05:32:42 2011 From: dag.odenhall at gmail.com (dag.odenhall at gmail.com) Date: Mon, 8 Aug 2011 05:32:42 +0200 Subject: [Python-ideas] Conventions for Function Annotations In-Reply-To: References: Message-ID: On 8 August 2011 05:01, Nick Coghlan wrote: > On Mon, Aug 8, 2011 at 11:03 AM, dag.odenhall at gmail.com > wrote: >>> Hi Dag, >>> >>> Are you currently using annotations? Could you post some of the cool >>> usages that you are making of annotations? The explicit plan with >>> annotations (read PEP 3107) was that significant use should precede >>> the creation of conventions for use. So please don't wait until a >>> convention has been established -- go ahead and have fun with them, >>> and let us know what you are doing with them! >> >> I'm toying with them for adaptation, interfaces and dependency >> injection from a component registry. Each use is about type constants, >> but not necessarily in the vein of static type checking, which I think >> stands to show their strengths and "Pythonicity". >> >> I do kinda think there's some need of informal conventions, >> examplified by Mathias post: his decorator sets the 'raises' key, >> effectively making "raises" a reserved keyword in function arguments >> using the decorator! > > So far, the general approach has been for annotations to be paired > with decorator APIs, such that there is a cleaner less repetitive > syntax that relies on function annotations and a more general (but > more verbose) approach that uses arguments to a decorator factory. > That approach seems to work well, with the latter API used to handle > cases where a developer wants to use more than one annotation based > decorator on a single function. The general principle is that any > decorator that can use argument annotations should have an alternate > decorator factory based API that can be used when necessary (usually > either because the annotations are being used for something else or > because the function being decorated is an existing one from another > library that doesn't have any relevant annotations at all). Or, not even an explicit decorator: it's often useful to embed instructions without runtime side-effects, such as with venusian and its use in Pyramid for @view_config and config.scan(). I use annotations like that for functions as adapter factories without a global registry. Similarly an interface definition might treat all methods as abstract with type hints without requiring a decorator like @abstractmethod. > > Using the annotation namespace to store arbitrary metadata doesn't > seem like a good idea at all. It is better to use the function > attribute namespace for that kind of thing - don't forget about the > old tools just because there is a shiny new tool to play with. > >> Some convention for annotating 'yield' may still be useful though, >> although one alternative convention could use some form of >> "parametrized types" and the return annotation: foo() -> >> Iterator[tuple]. Now we just need to add this to ABCMeta. *cough* ;) > > Why not just use the return field on the generator function as is? The > return type of calling something for which > 'inspect.isgeneratorfunction(x)' is true is always going to be > 'generator', so the return annotation is unlikely to be referring > directly to that. 
I think Armin Ronacher said generator detection is unreliable, but in any case, it seems perhaps more Pythonic to rely only on the iterator/iterable protocol rather than specifically generators. "Iterator[tuple]" for example would match dict.items().

From dag.odenhall at gmail.com Mon Aug 8 05:43:44 2011
From: dag.odenhall at gmail.com (dag.odenhall at gmail.com)
Date: Mon, 8 Aug 2011 05:43:44 +0200
Subject: [Python-ideas] ChainMap as a context manager
In-Reply-To: References:
Message-ID:

On 7 August 2011 17:02, Mike Graham wrote: > On Sun, Aug 7, 2011 at 10:50 AM, dag.odenhall at gmail.com > wrote: >> How about adding the context manager protocol to collections.ChainMap, >> as an in-place combination of new_child() and parents()? The effect >> would simulate lexical scope using the with-statement. >> >> Trivial to do customly but always nice when builtin or stdlib types >> support the context manager protocol. :) > > I don't understand the utility of such a thing. Can you post a use > case or two to make it clearer to people like me? To be quite honest I'm not perfectly sure myself; it just seemed like such a natural context manager that I got curious if it was just overlooked or if I was missing something. If it turns out it would make sense it might be good to realize it early before the 3.3 release?

From ncoghlan at gmail.com Mon Aug 8 06:11:58 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 8 Aug 2011 14:11:58 +1000
Subject: [Python-ideas] ChainMap as a context manager
In-Reply-To: References:
Message-ID:

On Mon, Aug 8, 2011 at 1:43 PM, dag.odenhall at gmail.com wrote: > To be quite honest I'm not perfectly sure myself; it just seemed like > such a natural context manager that I got curious if it was just > overlooked or if I was missing something. If it turns out it would > make sense it might be good to realize it early before the 3.3 > release? While the enthusiasm is appreciated, it's useful to run through a quick sanity check before lobbing an email into the inboxes of all of the python-ideas subscribers: 1. Do I have a concrete use case in mind? 2. If yes, is that use case highly specific to my current problem domain, or is it more broadly applicable than that? 3. If no, is there a clear inconsistency or wart that presents a roadblock to learning the language that the idea would address? We're actually fairly conservative about what we add to the core language and the standard library, so we greatly prefer ideas that have been "battle tested" outside the standard library first. Py3k did include a couple of experiments that are still in the process of proving themselves (i.e. function annotations and new-style string formatting), but such changes are definitely the exception rather than the rule. In this case, it is trivial for someone to write a collections.ChainMap based decorator that works as you describe:

    @contextmanager
    def scope(chain=None):
        if scope is None:
            yield ChainMap({})
        else:
            yield chain.new_child()

    with scope() as outer:
        with scope(outer) as inner1:
            # Manipulate inner1 without affecting outer

However, that adds no real expressivity and is unlikely to be particularly useful in practice, since most lexical scoping problems can be handled using *actual* lexical scoping. ChainMap is useful for cases like multiple levels of configuration data where the scoping occurs at runtime rather than in the source code. Hiding a call to new_child() inside a context manager just to get an additional level of indentation is fairly pointless. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
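(For contrast, a sketch of the runtime scoping ChainMap is aimed at, using only the new_child() and parents API mentioned in the thread:)

    from collections import ChainMap

    defaults = {'color': 'red', 'user': 'guest'}
    config = ChainMap({}, defaults)   # writable layer over read-only defaults

    session = config.new_child()      # push a per-session scope
    session['user'] = 'dag'
    print(session['user'], session['color'])  # dag red
    print(session.parents['user'])            # guest: the override stays in the child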
From ncoghlan at gmail.com Mon Aug 8 06:13:44 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 8 Aug 2011 14:13:44 +1000
Subject: [Python-ideas] ChainMap as a context manager
In-Reply-To: References:
Message-ID:

On Mon, Aug 8, 2011 at 2:11 PM, Nick Coghlan wrote:

> @contextmanager
> def scope(chain=None):
>     if scope is None:
>         yield ChainMap({})
>     else:
>         yield chain.new_child()

s/scope/chain/ on the first line of the function body. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From dag.odenhall at gmail.com Mon Aug 8 06:18:53 2011
From: dag.odenhall at gmail.com (dag.odenhall at gmail.com)
Date: Mon, 8 Aug 2011 06:18:53 +0200
Subject: [Python-ideas] ChainMap as a context manager
In-Reply-To: References:
Message-ID:

On 8 August 2011 06:11, Nick Coghlan wrote: > On Mon, Aug 8, 2011 at 1:43 PM, dag.odenhall at gmail.com > wrote: >> To be quite honest I'm not perfectly sure myself; it just seemed like >> such a natural context manager that I got curious if it was just >> overlooked or if I was missing something. If it turns out it would >> make sense it might be good to realize it early before the 3.3 >> release? > > While the enthusiasm is appreciated, it's useful to run through a > quick sanity check before lobbing an email into the inboxes of all of > the python-ideas subscribers: > > 1. Do I have a concrete use case in mind? > 2. If yes, is that use case highly specific to my current problem > domain, or is it more broadly applicable than that? > 3. If no, is there a clear inconsistency or wart that presents a > roadblock to learning the language that the idea would address? > > We're actually fairly conservative about what we add to the core > language and the standard library, so we greatly prefer ideas that > have been "battle tested" outside the standard library first. Py3k did > include a couple of experiments that are still in the process of > proving themselves (i.e. function annotations and new-style string > formatting), but such changes are definitely the exception rather than > the rule. > > In this case, it is trivial for someone to write a > collections.ChainMap based decorator that works as you describe:
>
> @contextmanager
> def scope(chain=None):
>     if scope is None:
>         yield ChainMap({})
>     else:
>         yield chain.new_child()
>
> with scope() as outer:
>     with scope(outer) as inner1:
>         # Manipulate inner1 without affecting outer
>
> However, that adds no real expressivity and is unlikely to be > particularly useful in practice, since most lexical scoping problems > can be handled using *actual* lexical scoping. ChainMap is useful for > cases like multiple levels of configuration data where the scoping > occurs at runtime rather than in the source code. Hiding a call to > new_child() inside a context manager just to get an additional level > of indentation is fairly pointless.
I meant it in-place, so the scope of the context is more than mere indentation:

    scope = ChainMap()
    scope['foo'] = 1
    with scope:
        scope['foo'] = 2
        with scope:
            scope['foo'] = 3
        assert scope['foo'] == 2
    assert scope['foo'] == 1

From lyricconch at gmail.com Mon Aug 8 07:27:06 2011
From: lyricconch at gmail.com (=?UTF-8?B?5rW36Z+1?=)
Date: Mon, 8 Aug 2011 13:27:06 +0800
Subject: [Python-ideas] Access to function objects
In-Reply-To: References: <4E3E2445.2070403@pearwood.info>
Message-ID:

When a function body accesses the function's name, that requires a "name lookup", which is a "runtime" behavior. I would like Python to offer some "compile time" behavior just like this proposal - things (here, functions) declared by the "as clause" would be "runtime independent" and only visible in their own suite (which means you cannot use the "as declared" NAME outside its indent block). 2011/8/8 Nick Coghlan : > On Sun, Aug 7, 2011 at 11:07 PM, Guido van Rossum wrote: >> On Sun, Aug 7, 2011 at 8:46 AM, Nick Coghlan wrote: >>> With a PEP 3135 closure style solution, the cell reference would be >>> filled in at function definition time, so that part shouldn't be an >>> issue. >> >> Yes, I was thinking of something like that (though honestly I'd >> forgotten some of the details :-). > > I'd forgotten many of the details as well, but was tracking down some > super() strangeness recently (to answer a question Michael Foord > asked, IIRC) and had to look it up. > >> IMO there is no doubt that if __function__ were to exist it should >> reference the innermost function, i.e. the thing that was created by >> the 'def' statement before any decorators were applied. > > Yeah, I'd mostly realised that by the time I finished writing my last > message, but figured I'd record the train of thought that got me > there. > >>> Reference by name lazily accesses the outermost one, but doesn't care >>> how the decorators are applied (i.e. as part of the def statement or >>> via post decoration). >> >> What do you mean here by lazily? > > Just the fact that the reference isn't resolved until the function > executes rather than being resolved when it gets defined. > >>> A __class__ style cell reference to the result >>> of the 'def' statement would behave differently in the post decoration >>> case. >> >> Oh you were thinking of making it reference the result after >> decoration? Maybe I know too much about the implementation, but I >> would find that highly confusing. Do you even have a use case for >> that? If so, I think it should be a separate name, e.g. >> __decorated_function__. > > The only reason I was thinking that way is that currently, if you do > something like [1]:
>
> @lru_cache()
> def fib(n):
>     if n < 2:
>         return n
>     return fib(n-1) + fib(n-2)
>
> then, at call time, 'fib' will resolve to the caching wrapper rather > than to the undecorated function. Using a reference to the undecorated > function instead (as would have to happen for a sane implementation of > __func__) would be actively harmful since the recursive calls would > bypass the cache unless the lru_cache decorator took steps to change > the way the reference evolved:
>
> @lru_cache()
> def fib(n):
>     if n < 2:
>         return n
>     return __func__(n-1) + __func__(n-2)  # Not the same, unless
> lru_cache adjusts the reference
>
> This semantic mismatch has actually shifted my opinion from +0 to -1 > on the idea. Relying on normal name lookup can be occasionally > inconvenient, but it is at least clear what we're referring to.
The > existence of wrapper functions means that "this function" isn't as > clear and unambiguous a phrase as it first seems. > > (I think the reason we get away with it in the PEP 3135 case is that > 'class wrappers' typically aren't handled via class decorators but via > metaclasses, which do a better job of playing nicely with the implicit > closure created to handle super() and __class__) > >>> While referencing the innermost function would likely be wrong in any >>> case involving function attributes, having the function in a valid >>> state during decoration will likely mandate filling in the cell >>> reference before invoking any decorators. Perhaps the best solution >>> would be to syntactically reference the innermost function, but >>> provide a clean way in functools to shift the cell reference to a >>> different function (with functools.wraps doing that automatically). >> >> Hm, making it dynamic sounds wrong. I think it makes more sense to >> just share the attribute dict (which is easily done through assignment >> to the wrapping function's __dict__). > > Huh, I hadn't even thought of that as a potential alternative to the > update() based approach currently used in functools.wraps (I had to > jump into the interactive interpreter to confirm that functions really > do let you swap out their instance dict). > > It's interesting that, once again, the status quo deals with this > according to ordinary name resolution rules: any wrapping of the > function will be ignored, *unless* we store the wrapper back into the > original location so the name resolution in the function body will see > it. > > Since the idea of implicitly sharing state between currently > independent wrapper functions scares me, this strikes me as another > reason to switch to '-1'. > >>> This does seem like an area ripe for subtle decoration related bugs >>> though, especially by contrast with lazy name based lookup. >> >> TBH, personally I am in most cases unhappy with the aggressive copying >> of docstring and other metadata from the wrapped function to the >> wrapper function, and wish the idiom had never been invented. > > IIRC, I was the one who actually committed the stdlib blessing of the > idiom in the form of 'functools.wraps'. It was definitely a hack to > deal with the increasing prevalence of wrapper functions as decorators > became more popular - naive introspection was giving too many wrong > answers and tweaking the recommended wrapping process so that > 'f.__doc__' would work again seemed like a better option than defining > a complex introspection protocol to handle wrapped functions. > > I still think it was a reasonable way forward (and better than leaving > things as they were), but it's definitely an approach with quite a few > flaws. > >>> While this may sound a little hypocritical coming from the author of >>> PEPs 366 and 395, I'm wary of adding new implicit module globals for >>> problems with relatively simple and robust alternatives. In this case, >>> it's fairly easy to get access to the current module using the idiom >>> Guido quoted: >>> >>> ? ?import sys >>> ? ?_this = sys.modules[__name__] >>> >>> (or using dict-style access on globals()) >> >> Yeah, well, in most cases I find having to reference sys.modules a >> distraction and an unwarranted jump into the implementation. It may >> not even work: there are some recipes that replace >> sys.modules[__name__] with some wrapper object. If __this_module__ >> existed it would of course refer to the "real" module object involved. 
> > Some invocations of runpy.run_module also cause the 'sys.modules' > based idioms to fail, so there may be a case to be made for this one. > I suspect some folks would use it to avoid global declarations as well > (i.e. by just writing '__module__.x = y'). > > It might cause the cyclic GC some grief, though,so the implementation > consequences would need to be investigated if someone wanted to pursue > it. > > Cheers, > Nick. > > [1] http://docs.python.org/dev/library/functools.html#functools.lru_cache > > -- > Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From ncoghlan at gmail.com Mon Aug 8 09:20:50 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 8 Aug 2011 17:20:50 +1000 Subject: [Python-ideas] Access to function objects In-Reply-To: References: <4E3E2445.2070403@pearwood.info> Message-ID: On Mon, Aug 8, 2011 at 3:27 PM, ?? wrote: > when function body access function name, it requires "name lookup", > that is a "runtime" behavior. > i would like that python offer some "compile time" behavior just like > this proposal - things(here, it's function) declare by the "as clause" > is always "runtime independent" and only visible on it's own > suite(which means you can not use the "as declared" NAME outside its > indent block). While a language could certainly be defined around such behaviour, that language isn't Python (Py3k except clauses notwithstanding). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From guido at python.org Mon Aug 8 15:07:46 2011 From: guido at python.org (Guido van Rossum) Date: Mon, 8 Aug 2011 09:07:46 -0400 Subject: [Python-ideas] Access to function objects In-Reply-To: References: <4E3E2445.2070403@pearwood.info> Message-ID: On Sun, Aug 7, 2011 at 7:56 PM, Nick Coghlan wrote: > On Sun, Aug 7, 2011 at 11:07 PM, Guido van Rossum wrote: >> On Sun, Aug 7, 2011 at 8:46 AM, Nick Coghlan wrote: >>> With a PEP 3135 closure style solution, the cell reference would be >>> filled in at function definition time, so that part shouldn't be an >>> issue. >> >> Yes, I was thinking of something like that (though honestly I'd >> forgotten some of the details :-). > > I'd forgotten many of the details as well, but was tracking down some > super() strangeness recently (to answer a question Michael Foord > asked, IIRC) and had to look it up. > >> IMO there is no doubt that if __function__ were to exist it should >> reference the innermost function, i.e. the thing that was created by >> the 'def' statement before any decorators were applied. > > Yeah, I'd mostly realised that by the time I finished writing by last > message, but figured I'd record the train of thought that got me > there. > >>> Reference by name lazily accesses the outermost one, but doesn't care >>> how the decorators are applied (i.e. as part of the def statement or >>> via post decoration). >> >> What do you mean here by lazily? > > Just the fact that the reference isn't resolved until the function > executes rather than being resolved when it gets defined. > >>> A __class__ style cell reference to the result >>> of the 'def' statement would behave differently in the post decoration >>> case. >> >> Oh you were thinking of making it reference the result after >> decoration? Maybe I know too much about the implementation, but I >> would find that highly confusing. 
Do you even have a use case for >> that? If so, I think it should be a separate name, e.g. >> __decorated_function__. > > The only reason I was thinking that way is that currently, if you do > something like [1]: > > @lru_cache() > def fib(n): > ? ?if n < 2: > ? ? ? ?return n > ? ?return fib(n-1) + fib(n-2) > > then, at call time, 'fib' will resolve to the caching wrapper rather > than to the undecorated function. Using a reference to the undecorated > function instead (as would have to happen for a sane implementation of > __func__) would be actively harmful since the recursive calls would > bypass the cache unless the lru_cache decorator took steps to change > the way the reference evolved: > > @lru_cache() > def fib(n): > ? ?if n < 2: > ? ? ? ?return n > ? ?return __func__(n-1) + __func__(n-2) # Not the same, unless lru_cache adjusts the reference How would the the reference be adjusted? > This semantic mismatch has actually shifted my opinion from +0 to -1 > on the idea. Relying on normal name lookup can be occasionally > inconvenient, but it is at least clear what we're referring to. The > existence of wrapper functions means that "this function" isn't as > clear and unambiguous a phrase as it first seems. To me it just means that __func__ will remain esoteric, which is just fine with me. I wouldn't be surprised if there were use cases where it was *desirable* to have a way (from the inside) to access the undecorated function (somewhat similar to the thing with modules below). Also I really don't want the semantics of decorators to depart from the original "define the function, then apply this to it" thing. And I don't want to have to think about the possibility of __func__ being overridden by the wrapping decorator either (or by anything else). > (I think the reason we get away with it in the PEP 3135 case is that > 'class wrappers' typically aren't handled via class decorators but via > metaclasses, which do a better job of playing nicely with the implicit > closure created to handle super() and __class__) We didn't have class decorators then did we? Anyway I'm not sure what the semantics are, but I hope they will be such that __class__ references the undecorated, original class object used when the method was being defined. (If the class statement is executed repeatedly the __class__ should always refer to the "real" class actually involved in the method call.) >>> While referencing the innermost function would likely be wrong in any >>> case involving function attributes, having the function in a valid >>> state during decoration will likely mandate filling in the cell >>> reference before invoking any decorators. Perhaps the best solution >>> would be to syntactically reference the innermost function, but >>> provide a clean way in functools to shift the cell reference to a >>> different function (with functools.wraps doing that automatically). >> >> Hm, making it dynamic sounds wrong. I think it makes more sense to >> just share the attribute dict (which is easily done through assignment >> to the wrapping function's __dict__). > > Huh, I hadn't even thought of that as a potential alternative to the > update() based approach currently used in functools.wraps (I had to > jump into the interactive interpreter to confirm that functions really > do let you swap out their instance dict). Me too. :-) But I did remember that we might have made it that way, possibly for this very use case. 
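(The sharing being described really is a one-liner; a sketch with invented names:)

    def inner(x):
        return x * 2

    def wrapper(*args, **kwargs):
        return inner(*args, **kwargs)

    wrapper.__dict__ = inner.__dict__  # share one attribute dict rather than copying

    inner.tag = 'hello'
    print(wrapper.tag)                 # hello: attributes now show up on both objects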
> It's interesting that, once again, the status quo deals with this > according to ordinary name resolution rules: any wrapping of the > function will be ignored, *unless* we store the wrapper back into the > original location so the name resolution in the function body will see > it. This makes sense because it builds complex functionality out of simpler building blocks. Combining two things together doesn't add any extra magic -- it's the building blocks themselves that add the magic. > Since the idea of implicitly sharing state between currently > independent wrapper functions scares me, this strikes me as another > reason to switch to '-1'. I'm still wavering between -0 and +0; I see some merit but I think the high hopes of some folks for __func__ are unwarranted. Using the same cell-based mechanism as used for __class__ may or may not be the right implementation but I don't think that additional hacks based on mutating that cell should be considered. So it would really be a wash how it was done (at call time or at func def time). Are you aware of anything that mutates the __class__ cell? It would seem pretty tricky to do. FWIW I don't think I want __func__ to be available at all times, like someone (the OP?) mentioned. That seems an unnecessary slowdown of every call / increase of every frame. >>> This does seem like an area ripe for subtle decoration related bugs >>> though, especially by contrast with lazy name based lookup. >> >> TBH, personally I am in most cases unhappy with the aggressive copying >> of docstring and other metadata from the wrapped function to the >> wrapper function, and wish the idiom had never been invented. > > IIRC, I was the one who actually committed the stdlib blessing of the > idiom in the form of 'functools.wraps'. It was definitely a hack to > deal with the increasing prevalence of wrapper functions as decorators > became more popular - naive introspection was giving too many wrong > answers and tweaking the recommended wrapping process so that > 'f.__doc__' would work again seemed like a better option than defining > a complex introspection protocol to handle wrapped functions. I guess you rely more on interactive features like help() whereas I rely more on browsing the source code. :-) > I still think it was a reasonable way forward (and better than leaving > things as they were), but it's definitely an approach with quite a few > flaws. You are forgiven. :-) >>> While this may sound a little hypocritical coming from the author of >>> PEPs 366 and 395, I'm wary of adding new implicit module globals for >>> problems with relatively simple and robust alternatives. In this case, >>> it's fairly easy to get access to the current module using the idiom >>> Guido quoted: >>> >>> ? ?import sys >>> ? ?_this = sys.modules[__name__] >>> >>> (or using dict-style access on globals()) >> >> Yeah, well, in most cases I find having to reference sys.modules a >> distraction and an unwarranted jump into the implementation. It may >> not even work: there are some recipes that replace >> sys.modules[__name__] with some wrapper object. If __this_module__ >> existed it would of course refer to the "real" module object involved. > > Some invocations of runpy.run_module also cause the 'sys.modules' > based idioms to fail, so there may be a case to be made for this one. > I suspect some folks would use it to avoid global declarations as well > (i.e. by just writing '__module__.x = y'). +1. But what to call it? __module__ is a string in other places. 
> It might cause the cyclic GC some grief, though,so the implementation > consequences would need to be investigated if someone wanted to pursue > it. Modules are already involved in much cyclical GC grief, and most have an infinite lifetime anyway (sys.modules keeps them alive). I doubt it will get any worse. > Cheers, > Nick. > > [1] http://docs.python.org/dev/library/functools.html#functools.lru_cache -- --Guido van Rossum (python.org/~guido) From aquavitae69 at gmail.com Mon Aug 8 21:26:02 2011 From: aquavitae69 at gmail.com (David Townshend) Date: Mon, 8 Aug 2011 21:26:02 +0200 Subject: [Python-ideas] Fwd: Re: Access to function objects In-Reply-To: References: <4E3E2445.2070403@pearwood.info> Message-ID: Sorry, this should have been a reply to all! ---------- Forwarded message ---------- From: "David Townshend" Date: Aug 8, 2011 9:24 PM Subject: Re: [Python-ideas] Access to function objects To: "Guido van Rossum" On Aug 8, 2011 3:08 PM, "Guido van Rossum" wrote: > > On Sun, Aug 7, 2011 at 7:56 PM, Nick Coghlan wrote: > > On Sun, Aug 7, 2011 at 11:07 PM, Guido van Rossum wrote: > >> On Sun, Aug 7, 2011 at 8:46 AM, Nick Coghlan wrote: > >>> With a PEP 3135 closure style solution, the cell reference would be > >>> filled in at function definition time, so that part shouldn't be an > >>> issue. > >> > >> Yes, I was thinking of something like that (though honestly I'd > >> forgotten some of the details :-). > > > > I'd forgotten many of the details as well, but was tracking down some > > super() strangeness recently (to answer a question Michael Foord > > asked, IIRC) and had to look it up. > > > >> IMO there is no doubt that if __function__ were to exist it should > >> reference the innermost function, i.e. the thing that was created by > >> the 'def' statement before any decorators were applied. > > > > Yeah, I'd mostly realised that by the time I finished writing by last > > message, but figured I'd record the train of thought that got me > > there. > > > >>> Reference by name lazily accesses the outermost one, but doesn't care > >>> how the decorators are applied (i.e. as part of the def statement or > >>> via post decoration). > >> > >> What do you mean here by lazily? > > > > Just the fact that the reference isn't resolved until the function > > executes rather than being resolved when it gets defined. > > > >>> A __class__ style cell reference to the result > >>> of the 'def' statement would behave differently in the post decoration > >>> case. > >> > >> Oh you were thinking of making it reference the result after > >> decoration? Maybe I know too much about the implementation, but I > >> would find that highly confusing. Do you even have a use case for > >> that? If so, I think it should be a separate name, e.g. > >> __decorated_function__. > > > > The only reason I was thinking that way is that currently, if you do > > something like [1]: > > > > @lru_cache() > > def fib(n): > > if n < 2: > > return n > > return fib(n-1) + fib(n-2) > > > > then, at call time, 'fib' will resolve to the caching wrapper rather > > than to the undecorated function. 
Using a reference to the undecorated > > function instead (as would have to happen for a sane implementation of > > __func__) would be actively harmful since the recursive calls would > > bypass the cache unless the lru_cache decorator took steps to change > > the way the reference evolved: > > > > @lru_cache() > > def fib(n): > > if n < 2: > > return n > > return __func__(n-1) + __func__(n-2) # Not the same, unless lru_cache adjusts the reference > > How would the the reference be adjusted? > > > This semantic mismatch has actually shifted my opinion from +0 to -1 > > on the idea. Relying on normal name lookup can be occasionally > > inconvenient, but it is at least clear what we're referring to. The > > existence of wrapper functions means that "this function" isn't as > > clear and unambiguous a phrase as it first seems. > > To me it just means that __func__ will remain esoteric, which is just > fine with me. I wouldn't be surprised if there were use cases where it > was *desirable* to have a way (from the inside) to access the > undecorated function (somewhat similar to the thing with modules > below). > > Also I really don't want the semantics of decorators to depart from > the original "define the function, then apply this to it" thing. And I > don't want to have to think about the possibility of __func__ being > overridden by the wrapping decorator either (or by anything else). > > > (I think the reason we get away with it in the PEP 3135 case is that > > 'class wrappers' typically aren't handled via class decorators but via > > metaclasses, which do a better job of playing nicely with the implicit > > closure created to handle super() and __class__) > > We didn't have class decorators then did we? Anyway I'm not sure what > the semantics are, but I hope they will be such that __class__ > references the undecorated, original class object used when the method > was being defined. (If the class statement is executed repeatedly the > __class__ should always refer to the "real" class actually involved in > the method call.) > > >>> While referencing the innermost function would likely be wrong in any > >>> case involving function attributes, having the function in a valid > >>> state during decoration will likely mandate filling in the cell > >>> reference before invoking any decorators. Perhaps the best solution > >>> would be to syntactically reference the innermost function, but > >>> provide a clean way in functools to shift the cell reference to a > >>> different function (with functools.wraps doing that automatically). > >> > >> Hm, making it dynamic sounds wrong. I think it makes more sense to > >> just share the attribute dict (which is easily done through assignment > >> to the wrapping function's __dict__). > > > > Huh, I hadn't even thought of that as a potential alternative to the > > update() based approach currently used in functools.wraps (I had to > > jump into the interactive interpreter to confirm that functions really > > do let you swap out their instance dict). > > Me too. :-) > > But I did remember that we might have made it that way, possibly for > this very use case. > > > It's interesting that, once again, the status quo deals with this > > according to ordinary name resolution rules: any wrapping of the > > function will be ignored, *unless* we store the wrapper back into the > > original location so the name resolution in the function body will see > > it. > > This makes sense because it builds complex functionality out of > simpler building blocks. 
Combining two things together doesn't add any > extra magic -- it's the building blocks themselves that add the magic. > > > Since the idea of implicitly sharing state between currently > > independent wrapper functions scares me, this strikes me as another > > reason to switch to '-1'. > > I'm still wavering between -0 and +0; I see some merit but I think the > high hopes of some folks for __func__ are unwarranted. Using the same > cell-based mechanism as used for __class__ may or may not be the right > implementation but I don't think that additional hacks based on > mutating that cell should be considered. So it would really be a wash > how it was done (at call time or at func def time). Are you aware of > anything that mutates the __class__ cell? It would seem pretty tricky > to do. > > FWIW I don't think I want __func__ to be available at all times, like > someone (the OP?) mentioned. That seems an unnecessary slowdown of > every call / increase of every frame. > Yes, that was my idea, (hence the "as" syntax). However, this discussion is getting a bit out of my depth now and I don't really know the implications of my suggestion! > >>> This does seem like an area ripe for subtle decoration related bugs > >>> though, especially by contrast with lazy name based lookup. > >> > >> TBH, personally I am in most cases unhappy with the aggressive copying > >> of docstring and other metadata from the wrapped function to the > >> wrapper function, and wish the idiom had never been invented. > > > > IIRC, I was the one who actually committed the stdlib blessing of the > > idiom in the form of 'functools.wraps'. It was definitely a hack to > > deal with the increasing prevalence of wrapper functions as decorators > > became more popular - naive introspection was giving too many wrong > > answers and tweaking the recommended wrapping process so that > > 'f.__doc__' would work again seemed like a better option than defining > > a complex introspection protocol to handle wrapped functions. > > I guess you rely more on interactive features like help() whereas I > rely more on browsing the source code. :-) > > > I still think it was a reasonable way forward (and better than leaving > > things as they were), but it's definitely an approach with quite a few > > flaws. > > You are forgiven. :-) > > >>> While this may sound a little hypocritical coming from the author of > >>> PEPs 366 and 395, I'm wary of adding new implicit module globals for > >>> problems with relatively simple and robust alternatives. In this case, > >>> it's fairly easy to get access to the current module using the idiom > >>> Guido quoted: > >>> > >>> import sys > >>> _this = sys.modules[__name__] > >>> > >>> (or using dict-style access on globals()) > >> > >> Yeah, well, in most cases I find having to reference sys.modules a > >> distraction and an unwarranted jump into the implementation. It may > >> not even work: there are some recipes that replace > >> sys.modules[__name__] with some wrapper object. If __this_module__ > >> existed it would of course refer to the "real" module object involved. > > > > Some invocations of runpy.run_module also cause the 'sys.modules' > > based idioms to fail, so there may be a case to be made for this one. > > I suspect some folks would use it to avoid global declarations as well > > (i.e. by just writing '__module__.x = y'). > > +1. But what to call it? __module__ is a string in other places. 
> > > It might cause the cyclic GC some grief, though,so the implementation > > consequences would need to be investigated if someone wanted to pursue > > it. > > Modules are already involved in much cyclical GC grief, and most have > an infinite lifetime anyway (sys.modules keeps them alive). I doubt it > will get any worse. > > > Cheers, > > Nick. > > > > [1] http://docs.python.org/dev/library/functools.html#functools.lru_cache > > -- > --Guido van Rossum (python.org/~guido) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Aug 9 01:00:17 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 9 Aug 2011 09:00:17 +1000 Subject: [Python-ideas] Access to function objects In-Reply-To: References: <4E3E2445.2070403@pearwood.info> Message-ID: On Mon, Aug 8, 2011 at 11:07 PM, Guido van Rossum wrote: > On Sun, Aug 7, 2011 at 7:56 PM, Nick Coghlan wrote: >> then, at call time, 'fib' will resolve to the caching wrapper rather >> than to the undecorated function. Using a reference to the undecorated >> function instead (as would have to happen for a sane implementation of >> __func__) would be actively harmful since the recursive calls would >> bypass the cache unless the lru_cache decorator took steps to change >> the way the reference evolved: >> >> @lru_cache() >> def fib(n): >> ? ?if n < 2: >> ? ? ? ?return n >> ? ?return __func__(n-1) + __func__(n-2) # Not the same, unless lru_cache adjusts the reference > > How would the the reference be adjusted? I was thinking of Michael's blog post about modifying cell contents [1], but I had forgotten that the write operation required mucking about with ctypes (since the cell_contents attribute of the cell is read only at the Python level). That's actually a good thing, since it means a cell based __func__ reference would consistently refer to the innermost function, ignoring any wrapper functions. [1] http://www.voidspace.org.uk/python/weblog/arch_d7_2011_05_28.shtml#e1214 >> This semantic mismatch has actually shifted my opinion from +0 to -1 >> on the idea. Relying on normal name lookup can be occasionally >> inconvenient, but it is at least clear what we're referring to. The >> existence of wrapper functions means that "this function" isn't as >> clear and unambiguous a phrase as it first seems. > > To me it just means that __func__ will remain esoteric, which is just > fine with me. I wouldn't be surprised if there were use cases where it > was *desirable* to have a way (from the inside) to access the > undecorated function (somewhat similar to the thing with modules > below). Yeah, now that I remember that you have to use the C API in order to monkey with the contents of a cell reference, I'm significantly happier with the idea that __func__ could be given solid 'always refers to the innermost unwrapped function definition' semantics. Referring to the function by name would remain the way to access the potentially wrapped version that is stored in the containing namespace. That's enough to get me back to -0. The reason I remain slightly negative is that I'd like to see some concrete use cases where ignoring wrapper functions is the right thing to do - every case that comes to mind for me is like the lru_cache() Fibonacci example, where bypassing the wrapper functions is precisely the *wrong* thing to do. 
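(For the record, the ctypes incantation involved looks roughly like this; a sketch, and very much a consenting-adults hack that relies on CPython's C API:)

    import ctypes

    def outer():
        value = 1
        def probe():
            return value
        return probe

    probe = outer()
    cell = probe.__closure__[0]
    print(cell.cell_contents)  # 1, but the attribute itself is read-only from Python

    ctypes.pythonapi.PyCell_Set(ctypes.py_object(cell), ctypes.py_object(42))
    print(probe())             # 42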
>> (I think the reason we get away with it in the PEP 3135 case is that >> 'class wrappers' typically aren't handled via class decorators but via >> metaclasses, which do a better job of playing nicely with the implicit >> closure created to handle super() and __class__) > > We didn't have class decorators then did we? Anyway I'm not sure what > the semantics are, but I hope they will be such that __class__ > references the undecorated, original class object used when the method > was being defined. (If the class statement is executed repeatedly the > __class__ should always refer to the "real" class actually involved in > the method call.) Yeah, __class__ always refers to the original class object, as created by calling the metaclass. The idiom seems to be that people don't use class decorators to wrap classes anyway, as metaclasses are a better tool for that kind of thing - decorators are more used for things like registration or attribute modifications. >> Since the idea of implicitly sharing state between currently >> independent wrapper functions scares me, this strikes me as another >> reason to switch to '-1'. > > I'm still wavering between -0 and +0; I see some merit but I think the > high hopes of some folks for __func__ are unwarranted. Using the same > cell-based mechanism as used for __class__ may or may not be the right > implementation but I don't think that additional hacks based on > mutating that cell should be considered. So it would really be a wash > how it was done (at call time or at func def time). Are you aware of > anything that mutates the __class__ cell? It would seem pretty tricky > to do. No, I was misremembering how Michael's cell content modification trick worked, and that was throwing off my opinion of how easy it was to mess with the cell contents. Once people start using ctypes to access the C API all bets are off anyway. > FWIW I don't think I want __func__ to be available at all times, like > someone (the OP?) mentioned. That seems an unnecessary slowdown of > every call / increase of every frame. Yeah, I think at least that part of the PEP 3135 approach should be copied, even if the implementation ended up being different. So the short version of my current opinion would be: __func__: -0 - this discussion has pretty much sorted out what the semantics of such a reference would be - we know at least one way to implement it that works (cell based, modelled on PEP 3135's __class__ reference) - lacking concrete use cases where it is demonstrably superior to reference by name lookup in the containing scope (given that many use cases are forced into the use of name lookup in order to refer to a wrapped version of the function) __this_module__: +0 - the sys.modules[__name__] approach is obscure and distracting when reading code - there are cases where that approach is unreliable (e.g. involving runpy.run_module with sys module alteration disabled) - obvious naming (i.e. __module__) is problematic, since class and function __module__ attributes are strings - will need to be careful to avoid creating uncollectable garbage due to the cyclic reference between the module and its global namespace (but shouldn't be any worse than the cycle created by any module that imports the sys module) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia From ben+python at benfinney.id.au Tue Aug 9 02:09:23 2011 From: ben+python at benfinney.id.au (Ben Finney) Date: Tue, 09 Aug 2011 10:09:23 +1000 Subject: [Python-ideas] Access to function objects References: <4E3E2445.2070403@pearwood.info> Message-ID: <87d3gfjvik.fsf@benfinney.id.au> Guido van Rossum writes: > To me it just means that __func__ will remain esoteric, which is just > fine with me. I wouldn't be surprised if there were use cases where it > was *desirable* to have a way (from the inside) to access the > undecorated function (somewhat similar to the thing with modules > below). Perhaps I misunderstand the functionality, but: The ability to access the undecorated function would be of great help in our Django-based project. Django is very decorator-happy, and we are trying to introspect code and, given a particular URL, determine the innermost function (more precisely, what location in our code) that actually handles that URL. Is that a use-case of interest? -- \ ?Always code as if the guy who ends up maintaining your code | `\ will be a violent psychopath who knows where you live.? ?John | _o__) F. Woods | Ben Finney From steve at pearwood.info Tue Aug 9 02:42:47 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 09 Aug 2011 10:42:47 +1000 Subject: [Python-ideas] Access to function objects In-Reply-To: <87d3gfjvik.fsf@benfinney.id.au> References: <4E3E2445.2070403@pearwood.info> <87d3gfjvik.fsf@benfinney.id.au> Message-ID: <4E408287.1040104@pearwood.info> Ben Finney wrote: > The ability to access the undecorated function would be of great help in > our Django-based project. Django is very decorator-happy, and we are > trying to introspect code and, given a particular URL, determine the > innermost function (more precisely, what location in our code) that > actually handles that URL. > > Is that a use-case of interest? I shouldn't think the proposed __func__ local would be of any use here, because it is local to the function. As I understand your use-case, you need to inspect a function from the outside, where __func__ is not available, and peel back all the decorators to the innermost function. To do that, I would start by looking at: function.__closure__[0].cell_contents and recurse as necessary. -- Steven From guido at python.org Tue Aug 9 02:55:55 2011 From: guido at python.org (Guido van Rossum) Date: Mon, 8 Aug 2011 20:55:55 -0400 Subject: [Python-ideas] Access to function objects In-Reply-To: <4E408287.1040104@pearwood.info> References: <4E3E2445.2070403@pearwood.info> <87d3gfjvik.fsf@benfinney.id.au> <4E408287.1040104@pearwood.info> Message-ID: On Mon, Aug 8, 2011 at 8:42 PM, Steven D'Aprano wrote: > Ben Finney wrote: > >> The ability to access the undecorated function would be of great help in >> our Django-based project. Django is very decorator-happy, and we are >> trying to introspect code and, given a particular URL, determine the >> innermost function (more precisely, what location in our code) that >> actually handles that URL. >> >> Is that a use-case of interest? > > I shouldn't think the proposed __func__ local would be of any use here, > because it is local to the function. As I understand your use-case, you need > to inspect a function from the outside, where __func__ is not available, and > peel back all the decorators to the innermost function. > > To do that, I would start by looking at: > > function.__closure__[0].cell_contents > > and recurse as necessary. 
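For a wrapper that genuinely closes over the function it decorates, that lookup does find the original; a sketch, with invented names, where indexing cell 0 is an assumption about this particular wrapper's shape:

def deco(f):
    def wrapper(*args, **kwds):
        return f(*args, **kwds)
    return wrapper

@deco
def g():
    pass

# 'wrapper' closes over exactly one name, f, so cell 0 holds the undecorated g
print(g.__closure__[0].cell_contents.__name__)   # -> 'g'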
AFAIK there is no fool-proof way to peel back a decorator; decorators aren't required to have any specific implementation. If the use case was important it would reopen the question of always making this value available as a frame attribute. But I'm not sure why the line number indicated by the code object (which *is* accessible) is insufficient. -- --Guido van Rossum (python.org/~guido) From ben+python at benfinney.id.au Tue Aug 9 03:36:01 2011 From: ben+python at benfinney.id.au (Ben Finney) Date: Tue, 09 Aug 2011 11:36:01 +1000 Subject: [Python-ideas] Access to function objects References: <4E3E2445.2070403@pearwood.info> <87d3gfjvik.fsf@benfinney.id.au> <4E408287.1040104@pearwood.info> Message-ID: <87vcu7icxq.fsf@benfinney.id.au> Guido van Rossum writes: > AFAIK there is no fool-proof way to peel back a decorator; decorators > aren't required to have any specific implementation. > > If the use case was important it would reopen the question of always > making this value available as a frame attribute. But I'm not sure why > the line number indicated by the code object (which *is* accessible) > is insufficient. Which code object, though? If we can't get at the innermost function (which is the code location we're interested in) handling the URL, how will we get its line number? -- \ ?I bought some batteries, but they weren't included; so I had | `\ to buy them again.? ?Steven Wright | _o__) | Ben Finney From ncoghlan at gmail.com Tue Aug 9 05:25:16 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 9 Aug 2011 13:25:16 +1000 Subject: [Python-ideas] Access to function objects In-Reply-To: <87vcu7icxq.fsf@benfinney.id.au> References: <4E3E2445.2070403@pearwood.info> <87d3gfjvik.fsf@benfinney.id.au> <4E408287.1040104@pearwood.info> <87vcu7icxq.fsf@benfinney.id.au> Message-ID: On Tue, Aug 9, 2011 at 11:36 AM, Ben Finney wrote: > Which code object, though? If we can't get at the innermost function > (which is the code location we're interested in) handling the URL, how > will we get its line number? In general, you can't, since Python can't readily tell the difference between decorators that wrap a function and those that replace it completely. Even the '__wrapped__' convention adopted by recent incarnations of functools.wraps (primarily to expose the function underlying lru_cache) only works for genuine wrapper functions that use that decorator. >From outside, a cell referencing the current function (if it was implemented that way) wouldn't tell you anything new, since the following identity would hold: f is f.__closure__[f.__code__.co_freevars.index('__func__')].cell_contents On any given function, *if* the new cell existed at all, it would refer to that specific function, not an inner one. Cheers, Nick. P.S. An example with the existing PEP 3135 implicit cell creation: >>> class C: ... def f(self): ... print(__class__) ... >>> f = C.f >>> C is f.__closure__[f.__code__.co_freevars.index('__class__')].cell_contents True -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ben+python at benfinney.id.au Tue Aug 9 06:01:31 2011 From: ben+python at benfinney.id.au (Ben Finney) Date: Tue, 09 Aug 2011 14:01:31 +1000 Subject: [Python-ideas] Access to function objects References: <4E3E2445.2070403@pearwood.info> <87d3gfjvik.fsf@benfinney.id.au> <4E408287.1040104@pearwood.info> <87vcu7icxq.fsf@benfinney.id.au> Message-ID: <87r54vi678.fsf@benfinney.id.au> Nick Coghlan writes: > On Tue, Aug 9, 2011 at 11:36 AM, Ben Finney wrote: > > Which code object, though? 
If we can't get at the innermost function > > (which is the code location we're interested in) handling the URL, > > how will we get its line number? > > In general, you can't [?] Thanks. I don't know what Guido was referring to, then, with: > But I'm not sure why the line number indicated by the code object > (which *is* accessible) is insufficient. Since the inner function's code object isn't reliably available, it's insufficient for reliably getting to the undecorated function which handles a specific Django URL. -- \ ?? whoever claims any right that he is unwilling to accord to | `\ his fellow-men is dishonest and infamous.? ?Robert G. | _o__) Ingersoll, _The Liberty of Man, Woman and Child_, 1877 | Ben Finney From tjreedy at udel.edu Tue Aug 9 09:15:05 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 09 Aug 2011 03:15:05 -0400 Subject: [Python-ideas] Access to function objects In-Reply-To: <87r54vi678.fsf@benfinney.id.au> References: <4E3E2445.2070403@pearwood.info> <87d3gfjvik.fsf@benfinney.id.au> <4E408287.1040104@pearwood.info> <87vcu7icxq.fsf@benfinney.id.au> <87r54vi678.fsf@benfinney.id.au> Message-ID: On 8/9/2011 12:01 AM, Ben Finney wrote: > Since the inner function's code object isn't reliably available, it's > insufficient for reliably getting to the undecorated function which > handles a specific Django URL. I think you should try to keep an overt reference to the original function. -- Terry Jan Reedy From guido at python.org Tue Aug 9 13:44:27 2011 From: guido at python.org (Guido van Rossum) Date: Tue, 9 Aug 2011 07:44:27 -0400 Subject: [Python-ideas] Access to function objects In-Reply-To: <87r54vi678.fsf@benfinney.id.au> References: <4E3E2445.2070403@pearwood.info> <87d3gfjvik.fsf@benfinney.id.au> <4E408287.1040104@pearwood.info> <87vcu7icxq.fsf@benfinney.id.au> <87r54vi678.fsf@benfinney.id.au> Message-ID: I was thinking from inside the executing code. It seems that your problem is how to get to the function object underneath the stack of decorators given an object that is the fully-decorated function. Never mind then. On Tue, Aug 9, 2011 at 12:01 AM, Ben Finney wrote: > Nick Coghlan writes: > >> On Tue, Aug 9, 2011 at 11:36 AM, Ben Finney wrote: >> > Which code object, though? If we can't get at the innermost function >> > (which is the code location we're interested in) handling the URL, >> > how will we get its line number? >> >> In general, you can't [?] > > Thanks. I don't know what Guido was referring to, then, with: > >> But I'm not sure why the line number indicated by the code object >> (which *is* accessible) is insufficient. > > Since the inner function's code object isn't reliably available, it's > insufficient for reliably getting to the undecorated function which > handles a specific Django URL. > > -- > ?\ ? ? ? ?? whoever claims any right that he is unwilling to accord to | > ?`\ ? ? ? ? ? ? his fellow-men is dishonest and infamous.? ?Robert G. | > _o__) ? ? ? ? ? Ingersoll, _The Liberty of Man, Woman and Child_, 1877 | > Ben Finney > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (python.org/~guido) From benjamin at python.org Tue Aug 9 18:36:24 2011 From: benjamin at python.org (Benjamin Peterson) Date: Tue, 9 Aug 2011 16:36:24 +0000 (UTC) Subject: [Python-ideas] Add an identity function References: Message-ID: dag.odenhall at ... 
writes: > > Yes, I know, it's merely a (lambda x: x), but I find a need for this > often enough that dedicated, documented function would help > readability and encourage certain patterns. Whether it should go in > builtins or functools is debatable. http://mail.python.org/pipermail/python-ideas/2009-March/003646.html From ron3200 at gmail.com Tue Aug 9 19:46:58 2011 From: ron3200 at gmail.com (ron3200) Date: Tue, 09 Aug 2011 12:46:58 -0500 Subject: [Python-ideas] Access to function objects In-Reply-To: References: Message-ID: <1312912018.6042.51.camel@Gutsy> On Sun, 2011-08-07 at 22:14 +1000, Nick Coghlan wrote: > On Sun, Aug 7, 2011 at 4:17 PM, Terry Reedy wrote: > > On 8/7/2011 12:32 AM, Eric Snow wrote: > >> Of the three code blocks, functions are the only ones for whom the > >> resulting object and the execution of the code block are separate. So > >> a code object could be executing for the original function or a > >> different one that is sharing the code object. > > > > Now I remember that the separation between code object and function object > > and the possibility of reusing code objects has been given as a reason to > > reject the idea. On the other hand, reusing code objects is so rare that I > > question the need to cater to it much. > > Nested function objects and class definitions say 'Hi!' - they reuse > code blocks all the time. In the following code: > > def decorator(f): > @wraps(f) > def wrapper(*args, **kwds): > return f(*args, **kwds) > return wrapper > > All of the 'wrapper' instances share a single code object (stored as a > constant in the code object for 'decorator'). > > If a proposal suggests storing mutable state on a code object it's > time to stop and think of a new way (or drop the idea entirely). It seems to me, many of the attempts to improve functions involve making them more like objects. Since functions are a key building block of class's, I'd rather see functions simplified rather than made more complex. Which probably isn't possible at this time. As for a new way/direction ... Maybe a simplified function as a method may be possible? With that, it then becomes possible to have a Function class that people can alter by adding additional methods to. Then "def foo(): pass" could possibly be short for ... class f(Function): method __call__(self): #can this be more efficient over def? pass foo = f() Yes, functions are objects now, but modifying them is almost always hackish. Maybe moving in this direction can lead to some non-hackish alternatives for those use cases. Cheers, Ron From ben+python at benfinney.id.au Wed Aug 10 06:47:39 2011 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 10 Aug 2011 14:47:39 +1000 Subject: [Python-ideas] Access to function objects References: <4E3E2445.2070403@pearwood.info> <87d3gfjvik.fsf@benfinney.id.au> <4E408287.1040104@pearwood.info> <87vcu7icxq.fsf@benfinney.id.au> <87r54vi678.fsf@benfinney.id.au> Message-ID: <87k4alj2j8.fsf@benfinney.id.au> Guido van Rossum writes: > It seems that your problem is how to get to the function object > underneath the stack of decorators given an object that is the > fully-decorated function. Never mind then. Yes. I'd still like a solution to that. -- \ ?Broken promises don't upset me. I just think, why did they | `\ believe me?? 
?Jack Handey | _o__) | Ben Finney From ncoghlan at gmail.com Wed Aug 10 09:09:57 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 10 Aug 2011 17:09:57 +1000 Subject: [Python-ideas] Access to function objects In-Reply-To: <87k4alj2j8.fsf@benfinney.id.au> References: <4E3E2445.2070403@pearwood.info> <87d3gfjvik.fsf@benfinney.id.au> <4E408287.1040104@pearwood.info> <87vcu7icxq.fsf@benfinney.id.au> <87r54vi678.fsf@benfinney.id.au> <87k4alj2j8.fsf@benfinney.id.au> Message-ID: On Wed, Aug 10, 2011 at 2:47 PM, Ben Finney wrote: > Guido van Rossum writes: > >> It seems that your problem is how to get to the function object >> underneath the stack of decorators given an object that is the >> fully-decorated function. Never mind then. > > Yes. I'd still like a solution to that. The thing is that there isn't a general purpose way to resolve that question. Consider a wrapper like functools.lru_cache. It has 3 downcalls to other functions: tuple, sorted and the decorated user function. Without an explicit convention (like the __wrapped__ convention adopted by functools in 3.2), there's no way for external code to follow the chain downwards. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From python at mrabarnett.plus.com Thu Aug 11 15:02:42 2011 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 11 Aug 2011 14:02:42 +0100 Subject: [Python-ideas] allow line break at operators In-Reply-To: References: <1312951356.77394.YahooMailNeo@web121518.mail.ne1.yahoo.com> <4e424208$0$29965$c3e8da3$5496439d@news.astraweb.com> <1312981104.89312.YahooMailNeo@web121520.mail.ne1.yahoo.com> <1312982377.95657.YahooMailNeo@web121508.mail.ne1.yahoo.com> <1313031175.38817.YahooMailNeo@web121515.mail.ne1.yahoo.com> Message-ID: <4E43D2F2.1090004@mrabarnett.plus.com> On 11/08/2011 05:16, Chris Rebert wrote: > On Wed, Aug 10, 2011 at 7:52 PM, Yingjie Lan wrote: >> :And if we require {} then truly free indentation should be OK too! But >> >> :it wouldn't be Python any more. >> >> Of course, but not the case with ';'. Currently ';' is optional in Python, > > I think of it more as that Python deigns to permit semicolons. > >> But '{' is used for dicts. Clearly, ';' and '{' are different in magnitude. >> >> So the decision is: shall we change ';' from optional to mandatory >> to allow free line splitting? > > Hell no, considering that the sizable majority of lines *aren't* > split, which makes those semicolons completely redundant to their > accompanying newlines. We'd be practicing poor Huffman coding by > optimizing for the *un*common case. It would also add punctuational > noise to what is otherwise an amazingly clean and readable syntax. > Accidental semicolon omission is (IMO) the most irritating source of > syntax (and, inadvertently, sometimes other more serious) errors in > curly-braced programming languages. > +1 > Such a core syntax feature is not going to be changed lightly (or likely ever). > I'm glad to hear that. :-) Although Python's use of indentation has its downside, we gain much more then we lose, IMHO. 
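The __wrapped__ convention Nick described a couple of messages up can be sketched as follows; the decorator and function names are invented for the example, and it assumes the 3.2 functools.wraps:

import functools

def deco(f):
    @functools.wraps(f)   # in 3.2 this also sets wrapper.__wrapped__ = f
    def wrapper(*args, **kwds):
        return f(*args, **kwds)
    return wrapper

@deco
def handler():
    pass

# external code can follow the chain down without guessing:
print(handler.__wrapped__ is handler)   # False: it is the inner function
print(handler.__wrapped__.__name__)     # 'handler', the undecorated function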
From anacrolix at gmail.com Thu Aug 11 16:28:04 2011 From: anacrolix at gmail.com (Matt Joiner) Date: Fri, 12 Aug 2011 00:28:04 +1000 Subject: [Python-ideas] allow line break at operators In-Reply-To: <4E43D2F2.1090004@mrabarnett.plus.com> References: <1312951356.77394.YahooMailNeo@web121518.mail.ne1.yahoo.com> <4e424208$0$29965$c3e8da3$5496439d@news.astraweb.com> <1312981104.89312.YahooMailNeo@web121520.mail.ne1.yahoo.com> <1312982377.95657.YahooMailNeo@web121508.mail.ne1.yahoo.com> <1313031175.38817.YahooMailNeo@web121515.mail.ne1.yahoo.com> <4E43D2F2.1090004@mrabarnett.plus.com> Message-ID: +0.5 The "trailing \" workaround is nonobvious. Wrapping in () is noisy and already heavily used by other syntactical structures. Since a final ':' is needed anyway, i think this would be great. if a and b or c: do stuff() On Thu, Aug 11, 2011 at 11:02 PM, MRAB wrote: > On 11/08/2011 05:16, Chris Rebert wrote: >> >> On Wed, Aug 10, 2011 at 7:52 PM, Yingjie Lan ?wrote: >>> >>> :And if we require {} then truly free indentation should be OK too! But >>> >>> :it wouldn't be Python any more. >>> >>> Of course, but not the case with ';'. Currently ';' is optional in >>> Python, >> >> I think of it more as that Python deigns to permit semicolons. >> >>> But '{' is used for dicts. Clearly, ';' and '{' are different in >>> magnitude. >>> >>> So the decision is: shall we change ';' from optional to mandatory >>> to allow free line splitting? >> >> Hell no, considering that the sizable majority of lines *aren't* >> split, which makes those semicolons completely redundant to their >> accompanying newlines. We'd be practicing poor Huffman coding by >> optimizing for the *un*common case. It would also add punctuational >> noise to what is otherwise an amazingly clean and readable syntax. >> Accidental semicolon omission is (IMO) the most irritating source of >> syntax (and, inadvertently, sometimes other more serious) errors in >> curly-braced programming languages. >> > +1 >> >> Such a core syntax feature is not going to be changed lightly (or likely >> ever). >> > I'm glad to hear that. :-) > > Although Python's use of indentation has its downside, we gain much > more then we lose, IMHO. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From jkbbwr at gmail.com Thu Aug 11 17:42:38 2011 From: jkbbwr at gmail.com (Jakob Bowyer) Date: Thu, 11 Aug 2011 16:42:38 +0100 Subject: [Python-ideas] allow line break at operators In-Reply-To: References: <1312951356.77394.YahooMailNeo@web121518.mail.ne1.yahoo.com> <4e424208$0$29965$c3e8da3$5496439d@news.astraweb.com> <1312981104.89312.YahooMailNeo@web121520.mail.ne1.yahoo.com> <1312982377.95657.YahooMailNeo@web121508.mail.ne1.yahoo.com> <1313031175.38817.YahooMailNeo@web121515.mail.ne1.yahoo.com> <4E43D2F2.1090004@mrabarnett.plus.com> Message-ID: -1 This idea seems like it would remove the true readability of python. Personally it would create more confusion than it would remove. On Thu, Aug 11, 2011 at 3:28 PM, Matt Joiner wrote: > +0.5 > > The "trailing \" workaround is nonobvious. Wrapping in () is noisy and > already heavily used by other syntactical structures. Since a final > ':' is needed anyway, i think this would be great. 
> > if a > ?and b > ?or c: > ?do stuff() > > On Thu, Aug 11, 2011 at 11:02 PM, MRAB wrote: >> On 11/08/2011 05:16, Chris Rebert wrote: >>> >>> On Wed, Aug 10, 2011 at 7:52 PM, Yingjie Lan ?wrote: >>>> >>>> :And if we require {} then truly free indentation should be OK too! But >>>> >>>> :it wouldn't be Python any more. >>>> >>>> Of course, but not the case with ';'. Currently ';' is optional in >>>> Python, >>> >>> I think of it more as that Python deigns to permit semicolons. >>> >>>> But '{' is used for dicts. Clearly, ';' and '{' are different in >>>> magnitude. >>>> >>>> So the decision is: shall we change ';' from optional to mandatory >>>> to allow free line splitting? >>> >>> Hell no, considering that the sizable majority of lines *aren't* >>> split, which makes those semicolons completely redundant to their >>> accompanying newlines. We'd be practicing poor Huffman coding by >>> optimizing for the *un*common case. It would also add punctuational >>> noise to what is otherwise an amazingly clean and readable syntax. >>> Accidental semicolon omission is (IMO) the most irritating source of >>> syntax (and, inadvertently, sometimes other more serious) errors in >>> curly-braced programming languages. >>> >> +1 >>> >>> Such a core syntax feature is not going to be changed lightly (or likely >>> ever). >>> >> I'm glad to hear that. :-) >> >> Although Python's use of indentation has its downside, we gain much >> more then we lose, IMHO. >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From pydanny at gmail.com Thu Aug 11 19:04:24 2011 From: pydanny at gmail.com (Daniel Greenfeld) Date: Thu, 11 Aug 2011 10:04:24 -0700 Subject: [Python-ideas] allow line break at operators In-Reply-To: References: <1312951356.77394.YahooMailNeo@web121518.mail.ne1.yahoo.com> <4e424208$0$29965$c3e8da3$5496439d@news.astraweb.com> <1312981104.89312.YahooMailNeo@web121520.mail.ne1.yahoo.com> <1312982377.95657.YahooMailNeo@web121508.mail.ne1.yahoo.com> <1313031175.38817.YahooMailNeo@web121515.mail.ne1.yahoo.com> <4E43D2F2.1090004@mrabarnett.plus.com> Message-ID: Something like this already exists: a = 0 b = 1 if (True == True and False == False and a + 1 == b and b - 1 == a): print 'meh' So I've got no idea what this proposal is about except for the dropping of readability of Python. -1 Daniel Greenfeld On Thu, Aug 11, 2011 at 8:42 AM, Jakob Bowyer wrote: > -1 This idea seems like it would remove the true readability of > python. Personally it would create more confusion than it would > remove. > > On Thu, Aug 11, 2011 at 3:28 PM, Matt Joiner wrote: >> +0.5 >> >> The "trailing \" workaround is nonobvious. Wrapping in () is noisy and >> already heavily used by other syntactical structures. Since a final >> ':' is needed anyway, i think this would be great. >> >> if a >> ?and b >> ?or c: >> ?do stuff() >> >> On Thu, Aug 11, 2011 at 11:02 PM, MRAB wrote: >>> On 11/08/2011 05:16, Chris Rebert wrote: >>>> >>>> On Wed, Aug 10, 2011 at 7:52 PM, Yingjie Lan ?wrote: >>>>> >>>>> :And if we require {} then truly free indentation should be OK too! But >>>>> >>>>> :it wouldn't be Python any more. >>>>> >>>>> Of course, but not the case with ';'. 
Currently ';' is optional in >>>>> Python, >>>> >>>> I think of it more as that Python deigns to permit semicolons. >>>> >>>>> But '{' is used for dicts. Clearly, ';' and '{' are different in >>>>> magnitude. >>>>> >>>>> So the decision is: shall we change ';' from optional to mandatory >>>>> to allow free line splitting? >>>> >>>> Hell no, considering that the sizable majority of lines *aren't* >>>> split, which makes those semicolons completely redundant to their >>>> accompanying newlines. We'd be practicing poor Huffman coding by >>>> optimizing for the *un*common case. It would also add punctuational >>>> noise to what is otherwise an amazingly clean and readable syntax. >>>> Accidental semicolon omission is (IMO) the most irritating source of >>>> syntax (and, inadvertently, sometimes other more serious) errors in >>>> curly-braced programming languages. >>>> >>> +1 >>>> >>>> Such a core syntax feature is not going to be changed lightly (or likely >>>> ever). >>>> >>> I'm glad to hear that. :-) >>> >>> Although Python's use of indentation has its downside, we gain much >>> more then we lose, IMHO. >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> http://mail.python.org/mailman/listinfo/python-ideas >>> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From paul at colomiets.name Thu Aug 11 21:17:08 2011 From: paul at colomiets.name (Paul Colomiets) Date: Thu, 11 Aug 2011 22:17:08 +0300 Subject: [Python-ideas] allow line break at operators In-Reply-To: References: <1312951356.77394.YahooMailNeo@web121518.mail.ne1.yahoo.com> <4e424208$0$29965$c3e8da3$5496439d@news.astraweb.com> <1312981104.89312.YahooMailNeo@web121520.mail.ne1.yahoo.com> <1312982377.95657.YahooMailNeo@web121508.mail.ne1.yahoo.com> <1313031175.38817.YahooMailNeo@web121515.mail.ne1.yahoo.com> <4E43D2F2.1090004@mrabarnett.plus.com> Message-ID: Hi Matt, On Thu, Aug 11, 2011 at 5:28 PM, Matt Joiner wrote: > +0.5 > > The "trailing \" workaround is nonobvious. Wrapping in () is noisy and > already heavily used by other syntactical structures. Since a final > ':' is needed anyway, i think this would be great. > > if a > ?and b > ?or c: > ?do stuff() > If you really think so, try writing some coffeescript (remember to obey 79 chars limit). Coffeescript is amasing, but it lacks strictness of python. So you really don't know how to break line, and it really takes time to figure out right way each time you need it. -- Paul From jeanpierreda at gmail.com Thu Aug 11 21:24:34 2011 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Thu, 11 Aug 2011 15:24:34 -0400 Subject: [Python-ideas] allow line break at operators In-Reply-To: References: <1312951356.77394.YahooMailNeo@web121518.mail.ne1.yahoo.com> <4e424208$0$29965$c3e8da3$5496439d@news.astraweb.com> <1312981104.89312.YahooMailNeo@web121520.mail.ne1.yahoo.com> <1312982377.95657.YahooMailNeo@web121508.mail.ne1.yahoo.com> <1313031175.38817.YahooMailNeo@web121515.mail.ne1.yahoo.com> <4E43D2F2.1090004@mrabarnett.plus.com> Message-ID: Javascript also lets you break lines. 
For example, this does what you want: return 1 + 5 Whereas this does not return 1 + 5 Of course, Python would have no such problem, because you could make both cases unambiguous due to the indent. Devin On Thu, Aug 11, 2011 at 3:17 PM, Paul Colomiets wrote: > Hi Matt, > > On Thu, Aug 11, 2011 at 5:28 PM, Matt Joiner wrote: > > +0.5 > > > > The "trailing \" workaround is nonobvious. Wrapping in () is noisy and > > already heavily used by other syntactical structures. Since a final > > ':' is needed anyway, i think this would be great. > > > > if a > > and b > > or c: > > do stuff() > > > If you really think so, try writing some coffeescript (remember to > obey 79 chars limit). Coffeescript is amasing, but it lacks > strictness of python. So you really don't know how to break line, > and it really takes time to figure out right way each time you need > it. > > -- > Paul > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruce at leapyear.org Thu Aug 11 22:45:58 2011 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 11 Aug 2011 13:45:58 -0700 Subject: [Python-ideas] allow line break at operators In-Reply-To: References: <1312951356.77394.YahooMailNeo@web121518.mail.ne1.yahoo.com> <4e424208$0$29965$c3e8da3$5496439d@news.astraweb.com> <1312981104.89312.YahooMailNeo@web121520.mail.ne1.yahoo.com> <1312982377.95657.YahooMailNeo@web121508.mail.ne1.yahoo.com> <1313031175.38817.YahooMailNeo@web121515.mail.ne1.yahoo.com> <4E43D2F2.1090004@mrabarnett.plus.com> Message-ID: On Thu, Aug 11, 2011 at 12:24 PM, Devin Jeanpierre wrote: > Javascript also lets you break lines. For example, this does what you want: > > return 1 > + 5 > > Whereas this does not > > return > 1 + 5 > > Of course, Python would have no such problem, because you could make both > cases unambiguous due to the indent. > > Devin > > > Note that this is already valid and is not a continuation line: return 1 +5 Right now you do not need to indent continuation lines. So in order to disambiguate you would need to enforce indentation for continuations, but for backward compatibility that would only be required when not using parentheses or backslashes. Ick. Can blank lines or comment lines appear between a line and its continuation? That's allowed now as well. Now allowing line breaks *after* operators would be unambiguous and would not require new indentation rules. When a line ends with an operator, it's clearly incomplete (so no fear the reader will think the statement has ended unlike the above case) and it's a syntax error today: return 1 + 5 x = y > 0 and y < 10 This code is not valid today without parens or \ regardless of indentation. I'm +0 on this. I'd use it but does it really add enough convenience? --- Bruce Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jeanpierreda at gmail.com Thu Aug 11 23:06:59 2011 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Thu, 11 Aug 2011 17:06:59 -0400 Subject: [Python-ideas] allow line break at operators In-Reply-To: References: <1312951356.77394.YahooMailNeo@web121518.mail.ne1.yahoo.com> <4e424208$0$29965$c3e8da3$5496439d@news.astraweb.com> <1312981104.89312.YahooMailNeo@web121520.mail.ne1.yahoo.com> <1312982377.95657.YahooMailNeo@web121508.mail.ne1.yahoo.com> <1313031175.38817.YahooMailNeo@web121515.mail.ne1.yahoo.com> <4E43D2F2.1090004@mrabarnett.plus.com> Message-ID: > Right now you do not need to indent continuation lines. So in order to disambiguate you would need to enforce indentation for continuations, but for backward compatibility that would only be required when not using parentheses or backslashes. Ick. Can blank lines or comment lines appear between a line and its continuation? That's allowed now as well. Eek no. If I was suggesting anything, it would have been a third form of continuation: collapsing subsequent extra-indented lines. This is never ambiguous. (This could be done in such a way as to permit comments, namely, by doing it to the tokenstream rather than to the actual text) Devin On Thu, Aug 11, 2011 at 4:45 PM, Bruce Leban wrote: > > On Thu, Aug 11, 2011 at 12:24 PM, Devin Jeanpierre wrote: >> >> Javascript also lets you break lines. For example, this does what you want: >> >> ??? return 1 >> ??????? + 5 >> >> Whereas this does not >> >> ??? return >> ??????? 1 + 5 >> >> Of course, Python would have no such problem, because you could make both cases unambiguous due to the indent. >> >> Devin >> > Note that this is already valid and is not a continuation line: > > return 1 > +5 > > Right now you do not need to indent continuation lines. So in order to disambiguate you would need to enforce indentation for continuations, but for backward compatibility that would only be required when not using parentheses or backslashes. Ick. Can blank lines or comment lines appear between a line and its continuation? That's allowed now as well. > Now allowing line breaks *after* operators would be unambiguous and would not require new indentation rules. When a line ends with an operator, it's clearly incomplete (so no fear the reader will think the statement has ended unlike the above case) and it's a syntax error today: > > return 1 + > ? ? 5 > x = y > 0 and > ? ?y < 10 > > This code is not valid today without parens or \ regardless of indentation. I'm +0 on this. I'd use it but does it really add enough convenience? > --- Bruce > Follow me:?http://www.twitter.com/Vroo?http://www.vroospeak.com From bruce at leapyear.org Thu Aug 11 23:21:53 2011 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 11 Aug 2011 14:21:53 -0700 Subject: [Python-ideas] allow line break at operators In-Reply-To: References: <1312951356.77394.YahooMailNeo@web121518.mail.ne1.yahoo.com> <4e424208$0$29965$c3e8da3$5496439d@news.astraweb.com> <1312981104.89312.YahooMailNeo@web121520.mail.ne1.yahoo.com> <1312982377.95657.YahooMailNeo@web121508.mail.ne1.yahoo.com> <1313031175.38817.YahooMailNeo@web121515.mail.ne1.yahoo.com> <4E43D2F2.1090004@mrabarnett.plus.com> Message-ID: On Thu, Aug 11, 2011 at 2:06 PM, Devin Jeanpierre wrote: > > Eek no. If I was suggesting anything, it would have been a third form > of continuation: collapsing subsequent extra-indented lines. This is > never ambiguous. 
(This could be done in such a way as to permit > comments, namely, by doing it to the tokenstream rather than to the > actual text) So if I miss-indent this a = b (x, y) = z instead of getting "unexpected indent" I get "SyntaxError: can't assign to function call". I'm sure someone can come up with two valid statements that have a different meaning when spliced together. --- Bruce Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeanpierreda at gmail.com Thu Aug 11 23:29:36 2011 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Thu, 11 Aug 2011 17:29:36 -0400 Subject: [Python-ideas] allow line break at operators In-Reply-To: References: <1312951356.77394.YahooMailNeo@web121518.mail.ne1.yahoo.com> <4e424208$0$29965$c3e8da3$5496439d@news.astraweb.com> <1312981104.89312.YahooMailNeo@web121520.mail.ne1.yahoo.com> <1312982377.95657.YahooMailNeo@web121508.mail.ne1.yahoo.com> <1313031175.38817.YahooMailNeo@web121515.mail.ne1.yahoo.com> <4E43D2F2.1090004@mrabarnett.plus.com> Message-ID: a = b, c = d is a pair of such statements. Howeverm indentation errors have been extremely rare in my experience, so I'm not really compelled to think it's harmful. Especially since 3.x outlaws mixing tabs and spaces. I don't love it, but I guess I prefer it to throwing parentheses and especially \ everywhere. Parentheses can be awkward and don't quite work everywhere the way one might want, and \ has that trailing space ugliness. Devin On Thu, Aug 11, 2011 at 5:21 PM, Bruce Leban wrote: > > On Thu, Aug 11, 2011 at 2:06 PM, Devin Jeanpierre > wrote: >> >> Eek no. If I was suggesting anything, it would have been a third form >> of continuation: collapsing subsequent extra-indented lines. This is >> never ambiguous. (This could be done in such a way as to permit >> comments, namely, by doing it to the tokenstream rather than to the >> actual text) > > So if I miss-indent this > a = b > ? (x, y) = z > > instead of getting "unexpected indent" I get "SyntaxError: can't assign to > function call". I'm sure someone can come up with two valid statements that > have a different meaning when spliced together. > --- Bruce > Follow me:?http://www.twitter.com/Vroo?http://www.vroospeak.com > > > From jimjjewett at gmail.com Thu Aug 11 23:39:01 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Thu, 11 Aug 2011 17:39:01 -0400 Subject: [Python-ideas] allow line break at operators In-Reply-To: References: <1312951356.77394.YahooMailNeo@web121518.mail.ne1.yahoo.com> <4e424208$0$29965$c3e8da3$5496439d@news.astraweb.com> <1312981104.89312.YahooMailNeo@web121520.mail.ne1.yahoo.com> <1312982377.95657.YahooMailNeo@web121508.mail.ne1.yahoo.com> <1313031175.38817.YahooMailNeo@web121515.mail.ne1.yahoo.com> <4E43D2F2.1090004@mrabarnett.plus.com> Message-ID: On Thu, Aug 11, 2011 at 5:29 PM, Devin Jeanpierre wrote: > Howeverm indentation errors have been extremely rare in my experience, > so I'm not really compelled to think it's harmful. Especially since > 3.x outlaws mixing tabs and spaces. I normally get them when starting with code from somewhere else (which might well mixed tabs and spaces, or worse, if emailed or posted to the web) or when cutting and pasting at an interactive prompt. 
-jJ From jeanpierreda at gmail.com Fri Aug 12 00:29:37 2011 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Thu, 11 Aug 2011 18:29:37 -0400 Subject: [Python-ideas] allow line break at operators In-Reply-To: References: <1312951356.77394.YahooMailNeo@web121518.mail.ne1.yahoo.com> <4e424208$0$29965$c3e8da3$5496439d@news.astraweb.com> <1312981104.89312.YahooMailNeo@web121520.mail.ne1.yahoo.com> <1312982377.95657.YahooMailNeo@web121508.mail.ne1.yahoo.com> <1313031175.38817.YahooMailNeo@web121515.mail.ne1.yahoo.com> <4E43D2F2.1090004@mrabarnett.plus.com> Message-ID: Well the tabs&spaces issue is no longer an issue as far as I understand it (such a change to indent semantics could only go into 3.x), and cutting and pasting to the interpreter is obvious anyway just visually, regardless of the specific error message. The other issue sounds reasonable. Code that has indentation stripped or mangled due to the transition medium would be even harder to recompose. Devin On Thu, Aug 11, 2011 at 5:39 PM, Jim Jewett wrote: > On Thu, Aug 11, 2011 at 5:29 PM, Devin Jeanpierre > wrote: >> Howeverm indentation errors have been extremely rare in my experience, >> so I'm not really compelled to think it's harmful. Especially since >> 3.x outlaws mixing tabs and spaces. > > I normally get them when starting with code from somewhere else (which > might well mixed tabs and spaces, or worse, if emailed or posted to > the web) or when cutting and pasting at an interactive prompt. > > -jJ > From tjreedy at udel.edu Fri Aug 12 00:31:32 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 11 Aug 2011 18:31:32 -0400 Subject: [Python-ideas] allow line break at operators In-Reply-To: References: <1312951356.77394.YahooMailNeo@web121518.mail.ne1.yahoo.com> <4e424208$0$29965$c3e8da3$5496439d@news.astraweb.com> <1312981104.89312.YahooMailNeo@web121520.mail.ne1.yahoo.com> <1312982377.95657.YahooMailNeo@web121508.mail.ne1.yahoo.com> <1313031175.38817.YahooMailNeo@web121515.mail.ne1.yahoo.com> <4E43D2F2.1090004@mrabarnett.plus.com> Message-ID: On 8/11/2011 5:06 PM, Devin Jeanpierre wrote: >> Right now you do not need to indent continuation lines. So in order >> to disambiguate you would need to enforce indentation for >> continuations, but for backward compatibility that would only be >> required when not using parentheses or backslashes. Ick. Can blank >> lines or comment lines appear between a line and its continuation? >> That's allowed now as well. > > Eek no. If I was suggesting anything, it would have been a third > form of continuation: collapsing subsequent extra-indented lines. > This is never ambiguous. (This could be done in such a way as to > permit comments, namely, by doing it to the tokenstream rather than > to the actual text) One bit of reality for these types of proposals. Cpython is implemented with an auto-generated LL(1) parser and will remain that way. Hence, all grammar proposals must stay within the bounds of LL(1) context-free languages. I would not be surprised if some of the proposals have 'jumped the fence'. But it can be hard to tell without a concrete grammar revision that can be run thru the LL(1) parser generator. Aside from that, discuss away. 
-- Terry Jan Reedy From greg.ewing at canterbury.ac.nz Fri Aug 12 02:21:23 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 12 Aug 2011 12:21:23 +1200 Subject: [Python-ideas] allow line break at operators In-Reply-To: References: <1312951356.77394.YahooMailNeo@web121518.mail.ne1.yahoo.com> <1312981104.89312.YahooMailNeo@web121520.mail.ne1.yahoo.com> <1312982377.95657.YahooMailNeo@web121508.mail.ne1.yahoo.com> <1313031175.38817.YahooMailNeo@web121515.mail.ne1.yahoo.com> <4E43D2F2.1090004@mrabarnett.plus.com> Message-ID: <4E447203.1000303@canterbury.ac.nz> Terry Reedy wrote: > Hence, all > grammar proposals must stay within the bounds of LL(1) context-free > languages. I don't think any of the proposals so far go beyond LL(1). However, some of them might require rethinking the traditional form of interface between the tokeniser and the parser. For example, instead of treating newlines as a separate token, have a flag which says "this token occurred at the end of a line", that the parser can take notice of or not depending on the context. -- Greg From aquavitae69 at gmail.com Fri Aug 12 13:59:22 2011 From: aquavitae69 at gmail.com (David Townshend) Date: Fri, 12 Aug 2011 13:59:22 +0200 Subject: [Python-ideas] Implementation of shutil.move Message-ID: The shutil.move function uses os.rename to move files on the same file system. On unix, this function will overwrite an existing destination, so the obvious approach is if not os.path.exists(dst): shutil.move(src, dst) But this could result in race conditions if dst is created after os.path.exists and before shutil.move. From my research, it seems that this is a limitation in the unix c library, but it should be possible to avoid it through a workaround (pieced together from http://bytes.com/topic/python/answers/555794-safely-renaming-file-without-overwriting). This involves some fairly low-level work, so I propose adding a new move2 function to shutil, which raises an error if dst exists and locking it if it doesn't: def move2(src, dst): try: fd = os.open(dst, os.O_EXCL | os.O_CREAT) except OSError: raise Error('Destination exists') try: move(src, dst) finally: os.close(fd) This could be optimised by using shutil.move code rather than just calling it, but the idea is that an attempt is made to create dst with exclusive access. If this fails, then it means that the file exists, but if it passes, then dst is locked so no other process can create it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Fri Aug 12 14:29:21 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 12 Aug 2011 14:29:21 +0200 Subject: [Python-ideas] Implementation of shutil.move References: Message-ID: <20110812142921.18e1ad3d@pitrou.net> On Fri, 12 Aug 2011 13:59:22 +0200 David Townshend wrote: > The shutil.move function uses os.rename to move files on the same file > system. On unix, this function will overwrite an existing destination, so > the obvious approach is > > if not os.path.exists(dst): > shutil.move(src, dst) > > But this could result in race conditions if dst is created after > os.path.exists and before shutil.move. From my research, it seems that this > is a limitation in the unix c library, but it should be possible to avoid it > through a workaround (pieced together from > http://bytes.com/topic/python/answers/555794-safely-renaming-file-without-overwriting). 
> This involves some fairly low-level work, so I propose adding a new move2 > function to shutil, which raises an error if dst exists and locking it if it > doesn't: This is a reasonable request (although it could also be an optional argument to shutil.move() rather than a separate function). Could you open an issue with your proposal on http://bugs.python.org ? You are also welcome to submit a patch in the issue; please see http://docs.python.org/devguide/ Regards Antoine. From aquavitae69 at gmail.com Fri Aug 12 14:43:40 2011 From: aquavitae69 at gmail.com (David Townshend) Date: Fri, 12 Aug 2011 14:43:40 +0200 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: <20110812142921.18e1ad3d@pitrou.net> References: <20110812142921.18e1ad3d@pitrou.net> Message-ID: > > This is a reasonable request (although it could also be an optional > argument to shutil.move() rather than a separate function). > Could you open an issue with your proposal on http://bugs.python.org ? > You are also welcome to submit a patch in the issue; please see > http://docs.python.org/devguide/ > Issue created: http://bugs.python.org/issue12741 Adding an optional argument to move sounds like a better solution, so I'll work on a patch for that. David -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikegraham at gmail.com Fri Aug 12 15:20:50 2011 From: mikegraham at gmail.com (Mike Graham) Date: Fri, 12 Aug 2011 09:20:50 -0400 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: Message-ID: On Fri, Aug 12, 2011 at 7:59 AM, David Townshend wrote: > The shutil.move function uses os.rename to move files on the same file > system. On unix, this function will overwrite an existing destination, so > the obvious approach is > if not os.path.exists(dst): > ? ? shutil.move(src, dst) > But this could result in race conditions if dst is created after > os.path.exists and before shutil.move. ?From my research, it seems that this > is a limitation in the unix c library, but it should be possible to avoid it > through a workaround (pieced together > from?http://bytes.com/topic/python/answers/555794-safely-renaming-file-without-overwriting). > ?This involves some fairly low-level work, so I propose adding a new move2 > function to shutil, which raises an error if dst exists and locking it if it > doesn't: > def move2(src, dst): > ? ? try: > ? ? ? ? fd = os.open(dst, os.O_EXCL | os.O_CREAT) > ? ? except OSError: > ? ? ? ? raise Error('Destination exists') > ? ? try: > ? ? ? ? move(src, dst) > ? ? finally: > ? ? ? ? os.close(fd) > This could be optimised by using shutil.move code rather than just calling > it, but the idea is that an attempt is made to create dst with exclusive > access. If this fails, then it means that the file exists, but if it passes, > then dst is locked so no other process can create it. This type of problem comes up regularly and a lot of user code is riddled with this kind of race conditions. Many (most?) are avoidable now by writing code to using EAFP rather than LBYL, but many are not. I wonder if this broader problem could be addressed by a context manager. 
Something to the general effect of the usage try: with lockfile(dst): move(src, dst) except OSError as e: if e != errno.EEXIST: raise raise AppSpecificError("File already exists.") # or whatever and a definition like @contextlib.contextmanager def lockfile(path): fd = os.open(path, os.O_EXCL | os.O_CREAT) yield os.close(fd) The usage is still sort of ugly, but I'm not sure I can think of a general way that isn't. Mike From ironfroggy at gmail.com Fri Aug 12 15:45:41 2011 From: ironfroggy at gmail.com (Calvin Spealman) Date: Fri, 12 Aug 2011 09:45:41 -0400 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: Message-ID: On Fri, Aug 12, 2011 at 9:20 AM, Mike Graham wrote: > On Fri, Aug 12, 2011 at 7:59 AM, David Townshend wrote: >> The shutil.move function uses os.rename to move files on the same file >> system. On unix, this function will overwrite an existing destination, so >> the obvious approach is >> if not os.path.exists(dst): >> ? ? shutil.move(src, dst) >> But this could result in race conditions if dst is created after >> os.path.exists and before shutil.move. ?From my research, it seems that this >> is a limitation in the unix c library, but it should be possible to avoid it >> through a workaround (pieced together >> from?http://bytes.com/topic/python/answers/555794-safely-renaming-file-without-overwriting). >> ?This involves some fairly low-level work, so I propose adding a new move2 >> function to shutil, which raises an error if dst exists and locking it if it >> doesn't: >> def move2(src, dst): >> ? ? try: >> ? ? ? ? fd = os.open(dst, os.O_EXCL | os.O_CREAT) >> ? ? except OSError: >> ? ? ? ? raise Error('Destination exists') >> ? ? try: >> ? ? ? ? move(src, dst) >> ? ? finally: >> ? ? ? ? os.close(fd) >> This could be optimised by using shutil.move code rather than just calling >> it, but the idea is that an attempt is made to create dst with exclusive >> access. If this fails, then it means that the file exists, but if it passes, >> then dst is locked so no other process can create it. > > This type of problem comes up regularly and a lot of user code is > riddled with this kind of race conditions. Many (most?) are avoidable > now by writing code to using EAFP rather than LBYL, but many are not. > I wonder if this broader problem could be addressed by a context > manager. > > Something to the general effect of the usage > > try: > ? ?with lockfile(dst): > ? ? ? ?move(src, dst) > except OSError as e: > ? ?if e != errno.EEXIST: > ? ? ? ?raise > ? ?raise AppSpecificError("File already exists.") # or whatever > > and a definition like > > @contextlib.contextmanager > def lockfile(path): > ? ?fd = os.open(path, os.O_EXCL | os.O_CREAT) > ? ?yield > ? ?os.close(fd) > > > The usage is still sort of ugly, but I'm not sure I can think of a > general way that isn't. lockfile is not a good name for this, but the manager itself is very simple and useful. +1 from me for getting this somewhere into stdlib. > Mike > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- Read my blog! I depend on your acceptance of my opinion! I am interesting! 
http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy From jeanpierreda at gmail.com Fri Aug 12 15:53:56 2011 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Fri, 12 Aug 2011 09:53:56 -0400 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: Message-ID: This doesn't completely solve the race condition: `os.open(path, O_EXCL | O_CREAT)` opens an fd for a file location. If a file was already at that location, it fails, if not, it succeeds with an fd. So if os.move() used this fd to actually write the new data, there would be no race condition in terms of file creation/deletion. This is, as far as I can tell, what was suggested by your linked thread. However, in your suggestion it does not do that: it does opens a new fd to a new file, and then does precisely what it did before. During the time in-between the os.open and the shutil.move(), somebody can delete the created file, and write a new one, or whatever. If they do that, then any such changes they make will be lost because shutil.move will steamroller them and overwrite the file. Devin On Fri, Aug 12, 2011 at 7:59 AM, David Townshend wrote: > The shutil.move function uses os.rename to move files on the same file > system. On unix, this function will overwrite an existing destination, so > the obvious approach is > if not os.path.exists(dst): > ? ? shutil.move(src, dst) > But this could result in race conditions if dst is created after > os.path.exists and before shutil.move. ?From my research, it seems that this > is a limitation in the unix c library, but it should be possible to avoid it > through a workaround (pieced together > from?http://bytes.com/topic/python/answers/555794-safely-renaming-file-without-overwriting). > ?This involves some fairly low-level work, so I propose adding a new move2 > function to shutil, which raises an error if dst exists and locking it if it > doesn't: > def move2(src, dst): > ? ? try: > ? ? ? ? fd = os.open(dst, os.O_EXCL | os.O_CREAT) > ? ? except OSError: > ? ? ? ? raise Error('Destination exists') > ? ? try: > ? ? ? ? move(src, dst) > ? ? finally: > ? ? ? ? os.close(fd) > This could be optimised by using shutil.move code rather than just calling > it, but the idea is that an attempt is made to create dst with exclusive > access. If this fails, then it means that the file exists, but if it passes, > then dst is locked so no other process can create it. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > From lists at cheimes.de Fri Aug 12 16:46:27 2011 From: lists at cheimes.de (Christian Heimes) Date: Fri, 12 Aug 2011 16:46:27 +0200 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: Message-ID: Am 12.08.2011 15:53, schrieb Devin Jeanpierre: > This doesn't completely solve the race condition: > > `os.open(path, O_EXCL | O_CREAT)` opens an fd for a file location. If > a file was already at that location, it fails, if not, it succeeds > with an fd. > So if os.move() used this fd to actually write the new data, there > would be no race condition in terms of file creation/deletion. This > is, as far as I can tell, what was suggested by your linked thread. > > However, in your suggestion it does not do that: it does opens a new > fd to a new file, and then does precisely what it did before. 
During > the time in-between the os.open and the shutil.move(), somebody can > delete the created file, and write a new one, or whatever. If they do > that, then any such changes they make will be lost because shutil.move > will steamroller them and overwrite the file. I couldn't figure out how the open fd should fix the race condition until I read your answer. You are right! The trick doesn't help at all. Even a lock file won't help since POSIX flock()s are only advisory locks. Contrary to shutil.copy(), os.move() doesn't write any data. It only changes some metadata by changing the name of the file. This makes the move op on a single file system an atomic operation. The fd doesn't help here at all. We could implement move with a combination of link() and unlink(). link() sets errno to EEXIST if the destination already exists. shutil.move() will no longer be an atomic operation but that's fine with me. We still have os.move(). Christian From aquavitae69 at gmail.com Fri Aug 12 16:48:49 2011 From: aquavitae69 at gmail.com (David Townshend) Date: Fri, 12 Aug 2011 16:48:49 +0200 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: Message-ID: My understanding of os.O_EXCL is that it locks the file from changes by any other process. It appears from a quick test, though, that this is not the case. Perhaps the second suggestion in the linked thread (using link/unlink) would work better, since this situation only arises on unix. I like the idea of a context manager for locking, but I'm not sure how that would work in this case... On Fri, Aug 12, 2011 at 3:53 PM, Devin Jeanpierre wrote: > This doesn't completely solve the race condition: > > `os.open(path, O_EXCL | O_CREAT)` opens an fd for a file location. If > a file was already at that location, it fails, if not, it succeeds > with an fd. > So if os.move() used this fd to actually write the new data, there > would be no race condition in terms of file creation/deletion. This > is, as far as I can tell, what was suggested by your linked thread. > > However, in your suggestion it does not do that: it does opens a new > fd to a new file, and then does precisely what it did before. During > the time in-between the os.open and the shutil.move(), somebody can > delete the created file, and write a new one, or whatever. If they do > that, then any such changes they make will be lost because shutil.move > will steamroller them and overwrite the file. > > Devin > > On Fri, Aug 12, 2011 at 7:59 AM, David Townshend > wrote: > > The shutil.move function uses os.rename to move files on the same file > > system. On unix, this function will overwrite an existing destination, so > > the obvious approach is > > if not os.path.exists(dst): > > shutil.move(src, dst) > > But this could result in race conditions if dst is created after > > os.path.exists and before shutil.move. From my research, it seems that > this > > is a limitation in the unix c library, but it should be possible to avoid > it > > through a workaround (pieced together > > from > http://bytes.com/topic/python/answers/555794-safely-renaming-file-without-overwriting > ). 
> > This involves some fairly low-level work, so I propose adding a new > move2 > > function to shutil, which raises an error if dst exists and locking it if > it > > doesn't: > > def move2(src, dst): > > try: > > fd = os.open(dst, os.O_EXCL | os.O_CREAT) > > except OSError: > > raise Error('Destination exists') > > try: > > move(src, dst) > > finally: > > os.close(fd) > > This could be optimised by using shutil.move code rather than just > calling > > it, but the idea is that an attempt is made to create dst with exclusive > > access. If this fails, then it means that the file exists, but if it > passes, > > then dst is locked so no other process can create it. > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Fri Aug 12 16:55:03 2011 From: masklinn at masklinn.net (Masklinn) Date: Fri, 12 Aug 2011 16:55:03 +0200 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: Message-ID: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> On 2011-08-12, at 16:48 , David Townshend wrote: > My understanding of os.O_EXCL is that it locks the file from changes by any > other process. That, *could* be O_EXLOCK, but I'm not too sure. O_EXCL does not do anything in and of itself, it fails the file opening if combined with O_CREAT. That's it (from man 2 open): O_EXCL error if O_CREAT and the file exists From lists at cheimes.de Fri Aug 12 17:03:05 2011 From: lists at cheimes.de (Christian Heimes) Date: Fri, 12 Aug 2011 17:03:05 +0200 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> Message-ID: Am 12.08.2011 16:55, schrieb Masklinn: > On 2011-08-12, at 16:48 , David Townshend wrote: >> My understanding of os.O_EXCL is that it locks the file from changes by any >> other process. > That, *could* be O_EXLOCK, but I'm not too sure. > > O_EXCL does not do anything in and of itself, it fails the file opening if combined with O_CREAT. That's it (from man 2 open): > > O_EXCL error if O_CREAT and the file exists The man page open(2) doesn't mention O_EXLOCK. It must belong to another low level function. Christian From g.brandl at gmx.net Fri Aug 12 17:15:24 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 12 Aug 2011 17:15:24 +0200 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> Message-ID: Am 12.08.2011 17:03, schrieb Christian Heimes: > Am 12.08.2011 16:55, schrieb Masklinn: >> On 2011-08-12, at 16:48 , David Townshend wrote: >>> My understanding of os.O_EXCL is that it locks the file from changes by any >>> other process. >> That, *could* be O_EXLOCK, but I'm not too sure. >> >> O_EXCL does not do anything in and of itself, it fails the file opening if combined with O_CREAT. That's it (from man 2 open): >> >> O_EXCL error if O_CREAT and the file exists > > The man page open(2) doesn't mention O_EXLOCK. It must belong to another > low level function. 
Or it isn't supported on the flavour of Unix you happen to be using :) Georg From mwm at mired.org Fri Aug 12 18:30:29 2011 From: mwm at mired.org (Mike Meyer) Date: Fri, 12 Aug 2011 09:30:29 -0700 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> Message-ID: <6e179d43-1eb9-486c-aaf1-18d3c1650d60@email.android.com> Georg Brandl wrote: >Am 12.08.2011 17:03, schrieb Christian Heimes: >> Am 12.08.2011 16:55, schrieb Masklinn: >>> On 2011-08-12, at 16:48 , David Townshend wrote: >>>> My understanding of os.O_EXCL is that it locks the file from >changes by any >>>> other process. >>> That, *could* be O_EXLOCK, but I'm not too sure. >>> >>> O_EXCL does not do anything in and of itself, it fails the file >opening if combined with O_CREAT. That's it (from man 2 open): >>> >>> O_EXCL error if O_CREAT and the file exists >> >> The man page open(2) doesn't mention O_EXLOCK. It must belong to >another >> low level function. > >Or it isn't supported on the flavour of Unix you happen to be using :) BSD systems have it, linux doesn't. One of the few times developing for linux on a mac has bitten me. References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> Message-ID: On 2011-08-12, at 17:03 , Christian Heimes wrote: > Am 12.08.2011 16:55, schrieb Masklinn: >> On 2011-08-12, at 16:48 , David Townshend wrote: >>> My understanding of os.O_EXCL is that it locks the file from changes by any >>> other process. >> That, *could* be O_EXLOCK, but I'm not too sure. >> >> O_EXCL does not do anything in and of itself, it fails the file opening if combined with O_CREAT. That's it (from man 2 open): >> >> O_EXCL error if O_CREAT and the file exists > > The man page open(2) doesn't mention O_EXLOCK. It must belong to another > low level function. Nope, got it from open(2), but it's apparently a bsd extension: http://lkml.indiana.edu/hypermail/linux/kernel/0005.1/1288.html This information (that it's an extension) is present in the OpenBSD open(2) page, but I checked it on an OSX machine where it's not specified. Sorry. It's an atomic `flock(fd, LOCK_EX)` (or LOCK_SH) built into open(2) to avoid the unlocked open hole between the open(2) and flock(2) calls. http://www.openbsd.org/cgi-bin/man.cgi?query=open&apropos=0&sektion=0&manpath=OpenBSD+Current&arch=i386&format=html http://www.freebsd.org/cgi/man.cgi?query=open&apropos=0&sektion=2&manpath=FreeBSD+8.2-RELEASE&format=html http://netbsd.gw.com/cgi-bin/man-cgi?open+2+NetBSD-current http://leaf.dragonflybsd.org/cgi/web-man?command=open§ion=2 https://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man2/open.2.html From lists at cheimes.de Fri Aug 12 18:48:39 2011 From: lists at cheimes.de (Christian Heimes) Date: Fri, 12 Aug 2011 18:48:39 +0200 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> Message-ID: <4E455967.8050305@cheimes.de> Am 12.08.2011 18:37, schrieb Masklinn: > On 2011-08-12, at 17:03 , Christian Heimes wrote: >> Am 12.08.2011 16:55, schrieb Masklinn: >>> On 2011-08-12, at 16:48 , David Townshend wrote: >>>> My understanding of os.O_EXCL is that it locks the file from changes by any >>>> other process. >>> That, *could* be O_EXLOCK, but I'm not too sure. >>> >>> O_EXCL does not do anything in and of itself, it fails the file opening if combined with O_CREAT. 
That's it (from man 2 open): >>> >>> O_EXCL error if O_CREAT and the file exists >> >> The man page open(2) doesn't mention O_EXLOCK. It must belong to another >> low level function. > Nope, got it from open(2), but it's apparently a bsd extension: http://lkml.indiana.edu/hypermail/linux/kernel/0005.1/1288.html > > This information (that it's an extension) is present in the OpenBSD open(2) page, but I checked it on an OSX machine where it's not specified. Sorry. > > It's an atomic `flock(fd, LOCK_EX)` (or LOCK_SH) built into open(2) to avoid the unlocked open hole between the open(2) and flock(2) calls. Ah, I've checked the open(2) man page on Linux. Should have mentioned the OS ... sorry, too. Anyway flock()s are only advisory (cooperative) locks and not mandatory locks. Although some Unices have support for mandatory locks, they can be circumvented with unlink(). Christian From aquavitae69 at gmail.com Fri Aug 12 19:35:41 2011 From: aquavitae69 at gmail.com (David Townshend) Date: Fri, 12 Aug 2011 19:35:41 +0200 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: <4E455967.8050305@cheimes.de> References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> Message-ID: If the kernel doesn't allow file locking then I don't see any way that a locking context manager will be possible, but it should still be possible to safely move files using link and unlink. Maybe the other problem functions can be dealt with individually in a similar way? On Aug 12, 2011 6:57 PM, "Christian Heimes" wrote: > Am 12.08.2011 18:37, schrieb Masklinn: >> On 2011-08-12, at 17:03 , Christian Heimes wrote: >>> Am 12.08.2011 16:55, schrieb Masklinn: >>>> On 2011-08-12, at 16:48 , David Townshend wrote: >>>>> My understanding of os.O_EXCL is that it locks the file from changes by any >>>>> other process. >>>> That, *could* be O_EXLOCK, but I'm not too sure. >>>> >>>> O_EXCL does not do anything in and of itself, it fails the file opening if combined with O_CREAT. That's it (from man 2 open): >>>> >>>> O_EXCL error if O_CREAT and the file exists >>> >>> The man page open(2) doesn't mention O_EXLOCK. It must belong to another >>> low level function. >> Nope, got it from open(2), but it's apparently a bsd extension: http://lkml.indiana.edu/hypermail/linux/kernel/0005.1/1288.html >> >> This information (that it's an extension) is present in the OpenBSD open(2) page, but I checked it on an OSX machine where it's not specified. Sorry. >> >> It's an atomic `flock(fd, LOCK_EX)` (or LOCK_SH) built into open(2) to avoid the unlocked open hole between the open(2) and flock(2) calls. > > Ah, I've checked the open(2) man page on Linux. Should have mentioned > the OS ... sorry, too. > > Anyway flock()s are only advisory (cooperative) locks and not mandatory > locks. Although some Unices have support for mandatory locks, they can > be circumvented with unlink(). > > Christian > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From peet at altlinux.ru Fri Aug 12 22:49:14 2011 From: peet at altlinux.ru (Peter V. Saveliev) Date: Sat, 13 Aug 2011 00:49:14 +0400 Subject: [Python-ideas] multiple objects handling Message-ID: <4E4591CA.10302@altlinux.ru> ? Hello! I have a stream of objects, each of dictionary type. Actually, they represent events and are emitted by an asynchronous thread. 
Sample event:

{ 'type': 'neigh',
  'action': 'add',
  'index': 2,
  'dest': '10.1.0.1',
  'lladdr': '0:2:b3:39:2e:4c',
  'timestamp': 'Wed Aug 10 17:20:28 2011',
  'probes': None,
  'cacheinfo': None }

I should take an action depending on the event's 'type' (neigh[bor], link,
address, route etc.) and 'action' (add, del) items.

Now I see two ways:
* build a tree of if/elif statements within one function
* create a dictionary that points to different methods

The issue is that a method call is slower than a jump within one function.
But a dictionary lookup is faster than if/elif statements (if I have a
lot of variants and do not want to rebuild a B-like tree of if
statements by hand).

Is there a way to combine these methods? Can I create a dictionary of
jump instructions?

So far I see that JUMP_ABSOLUTE and JUMP_FORWARD take an integer as a
parameter. It is reasonable, 'cause it is faster for if/elif/for etc.
statements. And thus I can not use a dictionary to store offsets and
then use them for one JUMP instruction.

But can we have another separate opcode, e.g. something like
JUMP_BY_TOS, that takes an offset (or absolute address) from the top of the
stack?

Yes, I understand that it seems like a `goto` statement.

Anyway, it can significantly speed up any stream parsing.

Thanks.

--
Peter V. Saveliev

From bruce at leapyear.org  Fri Aug 12 23:45:01 2011
From: bruce at leapyear.org (Bruce Leban)
Date: Fri, 12 Aug 2011 14:45:01 -0700
Subject: [Python-ideas] multiple objects handling
In-Reply-To: <4E4591CA.10302@altlinux.ru>
References: <4E4591CA.10302@altlinux.ru>
Message-ID: 

On Fri, Aug 12, 2011 at 1:49 PM, Peter V. Saveliev wrote:

> I have a stream of objects, each of dictionary type....
>
> I should take an action depending on the event's 'type' (neigh[bor], link,
> address, route etc.) and 'action' (add, del) items.
>
> Now I see two ways:
> * build a tree of if/elif statements within one function
> * create a dictionary that points to different methods
>
> The issue is that a method call is slower than a jump within one function.
>
Is the overhead of a method call the biggest problem in your code?

> But a dictionary lookup is faster than if/elif statements (if I have a
> lot of variants and do not want to rebuild a B-like tree of if
> statements by hand).
>
> Is there a way to combine these methods? Can I create a dictionary of
> jump instructions?
>
How would this be useful to someone writing a python program? Or put
another way, if the Python interpreter had such a feature, what language
feature would use it? Since there is no switch statement
http://www.python.org/dev/peps/pep-3103/ this seems like premature
optimization.

--- Bruce
Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From scott+python-ideas at scottdial.com  Fri Aug 12 23:47:40 2011
From: scott+python-ideas at scottdial.com (Scott Dial)
Date: Fri, 12 Aug 2011 17:47:40 -0400
Subject: [Python-ideas] multiple objects handling
In-Reply-To: <4E4591CA.10302@altlinux.ru>
References: <4E4591CA.10302@altlinux.ru>
Message-ID: <4E459F7C.9050706@scottdial.com>

On 8/12/2011 4:49 PM, Peter V. Saveliev wrote:
> I have a stream of objects, each of dictionary type. Actually, they
> represent events and are emitted by an asynchronous thread.
>
> The issue is that a method call is slower than a jump within one function.
> But a dictionary lookup is faster than if/elif statements (if I have a
> lot of variants and do not want to rebuild a B-like tree of if
> statements by hand).
>
> Is there a way to combine these methods? Can I create a dictionary of
> jump instructions?
>
> So far I see that JUMP_ABSOLUTE and JUMP_FORWARD take an integer as a
> parameter. It is reasonable, 'cause it is faster for if/elif/for etc.
> statements. And thus I can not use a dictionary to store offsets and
> then use them for one JUMP instruction.
>
> But can we have another separate opcode, e.g. something like
> JUMP_BY_TOS, that takes an offset (or absolute address) from the top of the
> stack?
>
> Anyway, it can significantly speed up any stream parsing.
>

Really? Your runtime is dominated by a *single* dict lookup and a
*single* method call? What do you do with these objects that takes so
little time that such a micro-optimization will have a "significant
speed up" to your program?

Have you actually profiled the performance of your program? I would
guess that the time spent dispatching through a dictionary is dwarfed by
the time spent constructing those event objects and the ultimate
processing of them.

> But can we have another separate opcode, e.g. something like
> JUMP_BY_TOS, that takes an offset (or absolute address) from the top of the
> stack?

How would you use such an opcode? There is no syntax available that would
allow you to get the offset (or absolute address) of any given line in a
function. To have a "goto" statement, you have to have labels, and we don't
have those either.

Or, are you proposing that CPython optimize large if/elif blocks into
dispatch tables? That sounds more interesting, but the performance win
would have to be worth the additional code complexity.

--
Scott Dial
scott at scottdial.com

From peet at altlinux.ru  Sat Aug 13 00:18:37 2011
From: peet at altlinux.ru (Peter V. Saveliev)
Date: Sat, 13 Aug 2011 02:18:37 +0400
Subject: [Python-ideas] multiple objects handling
In-Reply-To: <4E459F7C.9050706@scottdial.com>
References: <4E4591CA.10302@altlinux.ru> <4E459F7C.9050706@scottdial.com>
Message-ID: <4E45A6BD.4020108@altlinux.ru>

On 13.08.2011 01:47, Scott Dial wrote:
> Really? Your runtime is dominated by a *single* dict lookup and a
> *single* method call?

No, surely :))) But it is one of the problems.

> What do you do with these objects that takes so
> little time that such a micro-optimization will have a "significant
> speed up" to your program?
>
> Have you actually profiled the performance of your program? I would
> guess that the time spent dispatching through a dictionary is dwarfed by
> the time spent constructing those event objects and the ultimate
> processing of them.

You're right, but having a really huge stream of packets I try to minimize
any overhead - using C modules, the ctypes library and so on. But I hope to
keep the high-level logic in Python, and before rejecting a chance to speed
up parsing, I asked for some alternatives and counter-arguments. I believe
that method calls here are unnecessary, while the if/elif statement tree
makes the code hard to support.

>
>> But can we have another separate opcode, e.g. something like
>> JUMP_BY_TOS, that takes an offset (or absolute address) from the top of the
>> stack?
>
> How would you use such an opcode? There is no syntax available that would
> allow you to get the offset (or absolute address) of any given line in a
> function. To have a "goto" statement, you have to have labels, and we don't
> have those either.
>
> Or, are you proposing that CPython optimize large if/elif blocks into
> dispatch tables? That sounds more interesting, but the performance win
> would have to be worth the additional code complexity.
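For concreteness, the pure-Python dispatch table being weighed here is just
a dict of callables keyed on the event fields; a minimal sketch, with
illustrative handler names that are not from Peter's actual code:

    def handle_neigh_add(event):
        print('new neighbour %s' % event['dest'])

    def handle_neigh_del(event):
        print('lost neighbour %s' % event['dest'])

    DISPATCH = {
        ('neigh', 'add'): handle_neigh_add,
        ('neigh', 'del'): handle_neigh_del,
    }

    def process(event):
        # One dict lookup plus one call replaces the whole if/elif tree;
        # the open question in the thread is whether the call overhead
        # of the handler is worth worrying about at all.
        handler = DISPATCH.get((event['type'], event['action']))
        if handler is None:
            raise ValueError('unhandled event: %r' % event)
        handler(event)

    process({'type': 'neigh', 'action': 'add', 'dest': '10.1.0.1'})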
>

'Cause there can be any expression used in an if statement, I see no way
to create a pre-computed hash :( Maybe something like switch would work,
as Bruce says, but it is rejected for now.

--
Peter V. Saveliev

From ncoghlan at gmail.com  Sat Aug 13 02:54:21 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 13 Aug 2011 10:54:21 +1000
Subject: [Python-ideas] multiple objects handling
In-Reply-To: <4E45A6BD.4020108@altlinux.ru>
References: <4E4591CA.10302@altlinux.ru> <4E459F7C.9050706@scottdial.com> <4E45A6BD.4020108@altlinux.ru>
Message-ID: 

On Sat, Aug 13, 2011 at 8:18 AM, Peter V. Saveliev wrote:
>> Have you actually profiled the performance of your program? I would
>> guess that the time spent dispatching through a dictionary is dwarfed by
>> the time spent constructing those event objects and the ultimate
>> processing of them.
>
> You're right, but having a really huge stream of packets I try to minimize
> any overhead - using C modules, the ctypes library and so on. But I hope to
> keep the high-level logic in Python, and before rejecting a chance to speed
> up parsing, I asked for some alternatives and counter-arguments. I believe
> that method calls here are unnecessary, while the if/elif statement tree
> makes the code hard to support.

This kind of logic micro-optimisation is best handled by a JIT compiler
rather than messing with the language definition and asking people to do it
by hand. I suggest running your application on PyPy and seeing what kind of
speed up you get.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From aquavitae69 at gmail.com  Sat Aug 13 16:59:53 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Sat, 13 Aug 2011 16:59:53 +0200
Subject: [Python-ideas] Implementation of shutil.move
In-Reply-To: <4E455967.8050305@cheimes.de>
References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de>
Message-ID: 

There was a suggestion on the issue tracker that it might be better to add
an optional argument to shutil.move rather than create a new function. Does
anyone have any comments or suggestions about this?

On Aug 12, 2011 7:35 PM, "David Townshend" wrote:
> If the kernel doesn't allow file locking then I don't see any way that a
> locking context manager will be possible, but it should still be possible to
> safely move files using link and unlink. Maybe the other problem functions
> can be dealt with individually in a similar way?
> On Aug 12, 2011 6:57 PM, "Christian Heimes" wrote:
>> Am 12.08.2011 18:37, schrieb Masklinn:
>>> On 2011-08-12, at 17:03 , Christian Heimes wrote:
>>>> Am 12.08.2011 16:55, schrieb Masklinn:
>>>>> On 2011-08-12, at 16:48 , David Townshend wrote:
>>>>>> My understanding of os.O_EXCL is that it locks the file from changes
> by any
>>>>>> other process.
>>>>> That, *could* be O_EXLOCK, but I'm not too sure.
>>>>>
>>>>> O_EXCL does not do anything in and of itself, it fails the file opening
> if combined with O_CREAT. That's it (from man 2 open):
>>>>>
>>>>> O_EXCL error if O_CREAT and the file exists
>>>>
>>>> The man page open(2) doesn't mention O_EXLOCK. It must belong to another
>>>> low level function.
>>> Nope, got it from open(2), but it's apparently a bsd extension:
> http://lkml.indiana.edu/hypermail/linux/kernel/0005.1/1288.html
>>>
>>> This information (that it's an extension) is present in the OpenBSD
> open(2) page, but I checked it on an OSX machine where it's not specified.
> Sorry.
>>>
>>> It's an atomic `flock(fd, LOCK_EX)` (or LOCK_SH) built into open(2) to
> avoid the unlocked open hole between the open(2) and flock(2) calls.
>>
>> Ah, I've checked the open(2) man page on Linux. Should have mentioned
>> the OS ... sorry, too.
>>
>> Anyway flock()s are only advisory (cooperative) locks and not mandatory
>> locks. Although some Unices have support for mandatory locks, they can
>> be circumvented with unlink().
>>
>> Christian
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From python at mrabarnett.plus.com  Sat Aug 13 19:08:57 2011
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 13 Aug 2011 18:08:57 +0100
Subject: [Python-ideas] Implementation of shutil.move
In-Reply-To: 
References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de>
Message-ID: <4E46AFA9.8080900@mrabarnett.plus.com>

On 13/08/2011 15:59, David Townshend wrote:
> There was a suggestion on the issue tracker that it might be better to
> add an optional argument to shutil.move rather than create a new
> function. Does anyone have any comments or suggestions about this?
>
[For reference, it's issue 12741 ("Add function similar to shutil.move
that does not overwrite".)]

My preference is for adding an optional argument because the difference
in behaviour seems too small to justify a different function.

From masklinn at masklinn.net  Sat Aug 13 19:23:08 2011
From: masklinn at masklinn.net (Masklinn)
Date: Sat, 13 Aug 2011 19:23:08 +0200
Subject: [Python-ideas] Implementation of shutil.move
In-Reply-To: <4E46AFA9.8080900@mrabarnett.plus.com>
References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com>
Message-ID: 

On 13 août 2011, at 19:08, MRAB wrote:
> On 13/08/2011 15:59, David Townshend wrote:
>> There was a suggestion on the issue tracker that it might be better to
>> add an optional argument to shutil.move rather than create a new
>> function. Does anyone have any comments or suggestions about this?
>>
> [For reference, it's issue 12741 ("Add function similar to shutil.move
> that does not overwrite".)]
>
> My preference is for adding an optional argument because the difference
> in behaviour seems too small to justify a different function.
Likewise, and even more so because the current behaviour is
platform-dependent and some platforms already behave as proposed, so the
proposal really unifies the behaviour across platforms.

From steve at pearwood.info  Sun Aug 14 06:05:27 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 14 Aug 2011 14:05:27 +1000
Subject: [Python-ideas] Implementation of shutil.move
In-Reply-To: 
References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com>
Message-ID: <4E474987.4050701@pearwood.info>

Masklinn wrote:
> On 13 août 2011, at 19:08, MRAB wrote:
>> On 13/08/2011 15:59, David Townshend wrote:
>>> There was a suggestion on the issue tracker that it might be better to
>>> add an optional argument to shutil.move rather than create a new
>>> function. Does anyone have any comments or suggestions about this?
>>>
>> [For reference, it's issue 12741 ("Add function similar to shutil.move
>> that does not overwrite".)]
>>
>> My preference is for adding an optional argument because the difference
>> in behaviour seems too small to justify a different function.
> Likewise, and even more so because the current behaviour is
> platform-dependent and some platforms already behave as proposed, so the
> proposal really unifies the behaviour across platforms.

I dispute this. I don't think the suggested function will work as described,
at least not as given in the bug tracker, and certainly not in a
platform-independent way.

I don't think you can make it completely platform independent, because the
underlying file system operations are too different. E.g. file renames are
atomic on POSIX systems, but not on Windows. Glossing over these real
differences will merely give people a false sense of security. It is
important for people to understand the limitations of what is possible on
their system, and not be given wrong ideas.

(That's not to say that the proposed function is useless. It may be the
closest thing that some users will get to the desired behaviour.)

For documentation purposes alone, having two separate functions is valuable.

The suggested implementation on the tracker

http://bugs.python.org/issue12741

wraps the existing shutil.move function. The most natural implementation is
a separate function that calls the first function. To have one function
implement both behaviours requires an unnatural implementation, such as:

def move(src, dst, flag=False):
    if flag:
        try:
            fd = os.open(dst, os.O_EXCL | os.O_CREAT)
        except OSError:
            raise Error('Destination exists')
        try:
            move(src, dst, False)  # recursive call
        finally:
            os.close(fd)
    else:
        # Insert current implementation here...

Slightly less ugly might be to rename the current function move to _move,
then have this:

def move(src, dst, flag=False):
    if flag:
        try:
            fd = os.open(dst, os.O_EXCL | os.O_CREAT)
        except OSError:
            raise Error('Destination exists')
        try:
            _move(src, dst)
        finally:
            os.close(fd)
    else:
        _move(src, dst)

Either case is a prime example of why Guido's rule of thumb that functions
shouldn't take arguments which select between two (or more) alternative
behaviours is a good design principle. Better to keep two separate
functions.

-1 on a flag argument to shutil.move.

+1 on a "safe" version of shutil.move.

+0 on the given implementation. I'm not convinced that the proposed function
actually is any safer.

On the tracker, the OP says:

"If this fails, then it means that the file exists, but if it passes, then
dst is locked so no other process can create it."

But that's not what actually happens. On my Linux box, I can do this in one
terminal:

>>> dst = 'spam'
>>> fd = os.open(dst, os.O_EXCL | os.O_CREAT)
>>> os.path.exists(dst)
True
>>> open(dst).read()  # Confirm file is empty.
''

I then switch to another terminal, and do this:

[steve at sylar ~]$ cat spam  # File exists, and is empty.
[steve at sylar ~]$ echo "Writing to a locked file" > spam
[steve at sylar ~]$

And back to the first one:

>>> open(dst).read()
'Writing to a locked file\n'

So despite the allegedly exclusive lock, another process was able to
overwrite the file after the lock was taken.
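What os.O_EXCL does guarantee is narrower: the creation itself is atomic,
so of several processes racing to create the same name, exactly one wins.
A minimal sketch of that guarantee (the helper name is illustrative, not
from the tracker patch):

    import errno
    import os

    def try_create(path):
        # Atomically create path; return a file descriptor if we won
        # the race, or None if the file already existed.
        try:
            return os.open(path, os.O_EXCL | os.O_CREAT)
        except OSError as e:
            if e.errno == errno.EEXIST:
                return None
            raise

Nothing here stops another process from opening and rewriting the file once
it exists, which is exactly the hole demonstrated above.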
-- Steven From aquavitae69 at gmail.com Sun Aug 14 06:06:14 2011 From: aquavitae69 at gmail.com (David Townshend) Date: Sun, 14 Aug 2011 06:06:14 +0200 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> Message-ID: The behaviour is platform dependant only because it uses os.rename. The implementation is independent of platform so any change would affect all platforms. This might not matter if os.link behaves the same across them all though. Initially I thought it would be better to use an argument, but having had a look at the code I'm not so sure any more, since it could end up as two different implementations in the same function. Also, I'm not so sure its a good idea to have an optional argument (never_overwrite) that defaults to False for backward compatibility, but should usually be set to True for safe behaviour. On Aug 13, 2011 7:23 PM, "Masklinn" wrote: > On 13 ao?t 2011, at 19:08, MRAB wrote: >> On 13/08/2011 15:59, David Townshend wrote: >>> There was a suggestion on the issue tracker that it might be better to >>> add an optional argument to shutil.move rather than create a new >>> function. Does anyone have any comments or suggestions about this? >>> >> [For reference, it's issue 12741 ("Add function similar to shutil.move >> that does not overwrite".)] >> >> My preference is for adding an optional argument because the difference >> in behaviour seems too small to justify a different function. > Likewise, and even more so because the current behaviour is platform-dependent and dome platforms already behave as proposed, so the proposal really unifies the behaviour across platforms. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeanpierreda at gmail.com Sun Aug 14 06:34:35 2011 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Sun, 14 Aug 2011 00:34:35 -0400 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: <4E474987.4050701@pearwood.info> References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> <4E474987.4050701@pearwood.info> Message-ID: As far as I am aware, locked files are impossible to do in a cross-platform way, and there can be no correct implementation of this "safe" move. So, I'm -1. It gives people the wrong idea of what it does. Devin On Sun, Aug 14, 2011 at 12:05 AM, Steven D'Aprano wrote: > Masklinn wrote: >> >> On 13 ao?t 2011, at 19:08, MRAB wrote: >>> >>> On 13/08/2011 15:59, David Townshend wrote: >>>> >>>> There was a suggestion on the issue tracker that it might be better to >>>> add an optional argument to shutil.move rather than create a new >>>> function. Does anyone have any comments or suggestions about this? >>>> >>> [For reference, it's issue 12741 ("Add function similar to shutil.move >>> that does not overwrite".)] >>> >>> My preference is for adding an optional argument because the difference >>> in behaviour seems too small to justify a different function. >> >> Likewise, and even more so because the current behaviour is >> platform-dependent and dome platforms already behave as proposed, so the >> proposal really unifies the behaviour across platforms. > > > I dispute this. 
I don't think the suggested function will work as described, > at least not as given in the bug tracker, and certainly not in a > platform-independent way. > > I don't think you can make it completely platform independent, because the > underlying file system operations are too different. E.g. file renames are > atomic on POSIX systems, but not on Windows. Glossing over these real > differences will merely give people a false sense of security. It is > important for people to understand the limitations of what is possible on > their system, and not be given wrong ideas. > > (That's not to say that the proposed function is useless. It may be the > closest thing that some users will get to the desired behaviour.) > > For documentation purposes alone, having two separate functions is valuable. > > The suggested implementation on the tracker > > http://bugs.python.org/issue12741 > > wraps the existing shutil.move function. The most natural implementation is > a separate function that calls the first function. To have one function > implement both behaviours requires an unnatural implementation, such as: > > def move(src, dest, flag=False): > ? ?if flag: > ? ? ? ?try: > ? ? ? ? ? ?fd = os.open(dst, os.O_EXCL | os.O_CREAT) > ? ? ? ?except OSError: > ? ? ? ? ? ?raise Error('Destination exists') > ? ? ? ?try: > ? ? ? ? ? ?move(src, dst, False) ?# recursive call > ? ? ? ?finally: > ? ? ? ? ? ?os.close(fd) > ? ?else: > ? ? ? ? # Insert current implementation here... > > Slightly less ugly might be to rename the current function move to _move, > then have this: > > def move(src, dest, flag=False): > ? ?if flag: > ? ? ? ?try: > ? ? ? ? ? ?fd = os.open(dst, os.O_EXCL | os.O_CREAT) > ? ? ? ?except OSError: > ? ? ? ? ? ?raise Error('Destination exists') > ? ? ? ?try: > ? ? ? ? ? ?_move(src, dst) > ? ? ? ?finally: > ? ? ? ? ? ?os.close(fd) > ? ?else: > ? ? ? ? _move(src, dst) > > > Either case is a prime example of why Guido's rule of thumb that functions > shouldn't take arguments which select between two (or more) alternative > behaviours is a good design principle. Better to keep two separate > functions. > > > -1 on a flag argument to shutil.move. > > +1 on a "safe" version of shutil.move. > > +0 on the given implementation. I'm not convinced that the proposed function > actually is any safer. > > On the tracker, the OP says: > > "If this fails, then it means that the file exists, but if it passes, then > dst is locked so no other process can create it." > > But that's not what actually happens. On my Linux box, I can do this in one > terminal: > >>>> dst = 'spam' >>>> fd = os.open(dst, os.O_EXCL | os.O_CREAT) >>>> os.path.exists(dst) > True >>>> open(dst).read() ?# Confirm file is empty. > '' > > > I then switch to another terminal, and do this: > > [steve at sylar ~]$ cat spam ?# File exists, and is empty. > [steve at sylar ~]$ echo "Writing to a locked file" > spam > [steve at sylar ~]$ > > > And back to the first one: > >>>> open(dst).read() > 'Writing to a locked file\n' > > So despite the allegedly exclusive lock, another process was able to > overwrite the file after the lock was taken. 
> > > > > -- > Steven > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From tjreedy at udel.edu Sun Aug 14 07:06:42 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 14 Aug 2011 01:06:42 -0400 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> Message-ID: On 8/14/2011 12:06 AM, David Townshend wrote: > Initially I thought it would be better to use an argument, but having > had a look at the code I'm not so sure any more, since it could end up > as two different implementations in the same function. Also, I'm not so > sure its a good idea to have an optional argument (never_overwrite) that > defaults to False for backward compatibility, but should usually be set > to True for safe behaviour. The new 4th parameter for difflib.SequenceMatcher is much like that. For me the issue is whether the code look like if param: do algorithm a else: do algorithm b versus code if param: something else: something else more code -- Terry Jan Reedy From ben+python at benfinney.id.au Sun Aug 14 07:27:04 2011 From: ben+python at benfinney.id.au (Ben Finney) Date: Sun, 14 Aug 2011 15:27:04 +1000 Subject: [Python-ideas] Cross-platform lockfile (was: Implementation of shutil.move) References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> <4E474987.4050701@pearwood.info> Message-ID: <87y5ywef6f.fsf_-_@benfinney.id.au> Devin Jeanpierre writes: > As far as I am aware, locked files are impossible to do in a > cross-platform way The ?lockfile? library is an attempt to implement cross-platform lockfile functionality. I would like that library to be part of the Python standard library, but I think the current maintainer (Skip Montanaro) no longer has the available time to get that done. Anyone care to work with me on getting ?lockfile? into the Python standard library? -- \ ?I'm a born-again atheist.? ?Gore Vidal | `\ | _o__) | Ben Finney From aquavitae69 at gmail.com Sun Aug 14 08:01:13 2011 From: aquavitae69 at gmail.com (David Townshend) Date: Sun, 14 Aug 2011 08:01:13 +0200 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> Message-ID: > The new 4th parameter for difflib.SequenceMatcher is much like that. For > me the issue is whether the code look like > > if param: > do algorithm a > else: > do algorithm b > > versus > > code > if param: something > else: something else > more code My point exactly. The latest idea (which I've now described on the issue tracker), is not to use os.open(), rather os.link() and os.unlink(), which should work the same across platforms. Please could someone correct me if I'm wrong about this. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aquavitae69 at gmail.com Sun Aug 14 09:26:42 2011 From: aquavitae69 at gmail.com (David Townshend) Date: Sun, 14 Aug 2011 09:26:42 +0200 Subject: [Python-ideas] Cross-platform lockfile (was: Implementation of shutil.move) In-Reply-To: <87y5ywef6f.fsf_-_@benfinney.id.au> References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> <4E474987.4050701@pearwood.info> <87y5ywef6f.fsf_-_@benfinney.id.au> Message-ID: As far as I can make out, this package is designed to lock files from access by different threads within a program by using a ".lock" file. I can't see how this could lock a file from external modification. On Sun, Aug 14, 2011 at 7:27 AM, Ben Finney wrote: > Devin Jeanpierre > writes: > > > As far as I am aware, locked files are impossible to do in a > > cross-platform way > > The ?lockfile? library is an > attempt to implement cross-platform lockfile functionality. > > I would like that library to be part of the Python standard library, but > I think the current maintainer (Skip Montanaro) no longer has the > available time to get that done. > > Anyone care to work with me on getting ?lockfile? into the Python > standard library? > > -- > \ ?I'm a born-again atheist.? ?Gore Vidal | > `\ | > _o__) | > Ben Finney > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aquavitae69 at gmail.com Sun Aug 14 09:36:05 2011 From: aquavitae69 at gmail.com (David Townshend) Date: Sun, 14 Aug 2011 09:36:05 +0200 Subject: [Python-ideas] Cross-platform lockfile (was: Implementation of shutil.move) In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> <4E474987.4050701@pearwood.info> <87y5ywef6f.fsf_-_@benfinney.id.au> Message-ID: However... It might be possible to use the "immutable" attribute in linux (and hopefully other unixes) to lock a file. This is still not completely secure since root can remove this attribute and change the file, but this is highly unlikely to happen by accident. On Sun, Aug 14, 2011 at 9:26 AM, David Townshend wrote: > As far as I can make out, this package is designed to lock files from > access by different threads within a program by using a ".lock" file. I > can't see how this could lock a file from external modification. > > > On Sun, Aug 14, 2011 at 7:27 AM, Ben Finney wrote: > >> Devin Jeanpierre >> writes: >> >> > As far as I am aware, locked files are impossible to do in a >> > cross-platform way >> >> The ?lockfile? library is an >> attempt to implement cross-platform lockfile functionality. >> >> I would like that library to be part of the Python standard library, but >> I think the current maintainer (Skip Montanaro) no longer has the >> available time to get that done. >> >> Anyone care to work with me on getting ?lockfile? into the Python >> standard library? >> >> -- >> \ ?I'm a born-again atheist.? ?Gore Vidal | >> `\ | >> _o__) | >> Ben Finney >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ben+python at benfinney.id.au Sun Aug 14 11:30:02 2011 From: ben+python at benfinney.id.au (Ben Finney) Date: Sun, 14 Aug 2011 19:30:02 +1000 Subject: [Python-ideas] Cross-platform lockfile References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> <4E474987.4050701@pearwood.info> <87y5ywef6f.fsf_-_@benfinney.id.au> Message-ID: <87liuwe3xh.fsf@benfinney.id.au> David Townshend writes: > As far as I can make out, this package [PyPI's ?lockfile?] is designed > to lock files from access by different threads within a program by > using a ".lock" file. I can't see how this could lock a file from > external modification. You're right, ?lockfile? is for the cooperative locking conventions common on most operating systems. It doesn't implement mandatory exclusive locking of resources. -- \ ?When I was a kid I used to pray every night for a new bicycle. | `\ Then I realised that the Lord doesn't work that way so I stole | _o__) one and asked Him to forgive me.? ?Emo Philips | Ben Finney From lists at cheimes.de Sun Aug 14 16:00:50 2011 From: lists at cheimes.de (Christian Heimes) Date: Sun, 14 Aug 2011 16:00:50 +0200 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> Message-ID: Am 14.08.2011 08:01, schrieb David Townshend: > The latest idea (which I've now described on the issue tracker), is not to > use os.open(), rather os.link() and os.unlink(), which should work the same > across platforms. Please could someone correct me if I'm wrong about this. My proposed link() / unlink() procedure works only on platforms and file systems, that support hard links. I totally forgot about the file system issue, sorry. :) Hard links won't work on a FAT32 mount point on Unix and probably not on Samba shares, too. It might work on NTFS if the VFS implements it but NTFS has a limited reference counter for hard links. It might be possible that a rename() op would work but a link() op wouldn't. Christian From aquavitae69 at gmail.com Sun Aug 14 16:09:32 2011 From: aquavitae69 at gmail.com (David Townshend) Date: Sun, 14 Aug 2011 16:09:32 +0200 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> Message-ID: It seems there's a second problem too - move on remote file systems use copy rather than rename, so changing the implementation means changing it for copy too, which is more difficult. Maybe the best option is to try to apply some sort of locking mechanism, but I can't see how right now. On Aug 14, 2011 4:01 PM, "Christian Heimes" wrote: > Am 14.08.2011 08:01, schrieb David Townshend: >> The latest idea (which I've now described on the issue tracker), is not to >> use os.open(), rather os.link() and os.unlink(), which should work the same >> across platforms. Please could someone correct me if I'm wrong about this. > > My proposed link() / unlink() procedure works only on platforms and file > systems, that support hard links. I totally forgot about the file system > issue, sorry. :) > > Hard links won't work on a FAT32 mount point on Unix and probably not on > Samba shares, too. It might work on NTFS if the VFS implements it but > NTFS has a limited reference counter for hard links. 
It might be > possible that a rename() op would work but a link() op wouldn't. > > Christian > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sun Aug 14 16:30:39 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 15 Aug 2011 00:30:39 +1000 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> Message-ID: <4E47DC0F.60709@pearwood.info> Christian Heimes wrote: > Am 14.08.2011 08:01, schrieb David Townshend: >> The latest idea (which I've now described on the issue tracker), is not to >> use os.open(), rather os.link() and os.unlink(), which should work the same >> across platforms. Please could someone correct me if I'm wrong about this. > > My proposed link() / unlink() procedure works only on platforms and file > systems, that support hard links. I totally forgot about the file system > issue, sorry. :) It seems to me that these various "safe-ish move" procedures should be added to shutil, as separate functions, and the limitations of each documented. The caller then can decide which, if any, they are going to use. Avoiding race conditions when renaming or copying files is a hard problem. Who says that there is one solution that will work everywhere? -- Steven From lists at cheimes.de Sun Aug 14 16:39:43 2011 From: lists at cheimes.de (Christian Heimes) Date: Sun, 14 Aug 2011 16:39:43 +0200 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> Message-ID: Am 14.08.2011 16:09, schrieb David Townshend: > It seems there's a second problem too - move on remote file systems use copy > rather than rename, so changing the implementation means changing it for > copy too, which is more difficult. Maybe the best option is to try to apply > some sort of locking mechanism, but I can't see how right now. Why do you think that move on remote file systems use copy? From past experience and recent tests I can confirm that shutil.move() uses rename on remote CIFS and NFS file systems. From jeanpierreda at gmail.com Sun Aug 14 17:11:29 2011 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Sun, 14 Aug 2011 11:11:29 -0400 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> Message-ID: > Why do you think that move on remote file systems use copy? From past > experience and recent tests I can confirm that shutil.move() uses rename > on remote CIFS and NFS file systems I believe what he meant to say was "if you move from one filesystem to another". shutil.move tries to do a copy and delete if rename fails with an OSError. Devin On Sun, Aug 14, 2011 at 10:39 AM, Christian Heimes wrote: > Am 14.08.2011 16:09, schrieb David Townshend: >> It seems there's a second problem too - move on remote file systems use copy >> rather than rename, so changing the implementation means changing it for >> copy too, which is more difficult. Maybe the best option is to try to apply >> some sort of locking mechanism, but I can't see how right now. 
> > Why do you think that move on remote file systems use copy? From past > experience and recent tests I can confirm that shutil.move() uses rename > on remote CIFS and NFS file systems. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From aquavitae69 at gmail.com Sun Aug 14 17:23:30 2011 From: aquavitae69 at gmail.com (David Townshend) Date: Sun, 14 Aug 2011 17:23:30 +0200 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> Message-ID: Sorry, yes. That is what I meant. On Aug 14, 2011 5:12 PM, "Devin Jeanpierre" wrote: >> Why do you think that move on remote file systems use copy? From past >> experience and recent tests I can confirm that shutil.move() uses rename >> on remote CIFS and NFS file systems > > I believe what he meant to say was "if you move from one filesystem to another". > > shutil.move tries to do a copy and delete if rename fails with an OSError. > > Devin > > On Sun, Aug 14, 2011 at 10:39 AM, Christian Heimes wrote: >> Am 14.08.2011 16:09, schrieb David Townshend: >>> It seems there's a second problem too - move on remote file systems use copy >>> rather than rename, so changing the implementation means changing it for >>> copy too, which is more difficult. Maybe the best option is to try to apply >>> some sort of locking mechanism, but I can't see how right now. >> >> Why do you think that move on remote file systems use copy? From past >> experience and recent tests I can confirm that shutil.move() uses rename >> on remote CIFS and NFS file systems. >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From aquavitae69 at gmail.com Mon Aug 15 12:07:36 2011 From: aquavitae69 at gmail.com (David Townshend) Date: Mon, 15 Aug 2011 12:07:36 +0200 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> Message-ID: I''ve done some more reading and this is where I've got to: shutil.move() uses either os.rename() or shutil.copyfile() (via shutil.copytree()) depending on the situation. shutil.copyfile() uses open(). So to implement a safe move function it would seem to be necessary to do the same to copyfile(), and possibly open(). open(), in my opinon, already behaves as it should. It would be possible to add a slightly safer implementation by writing to a temporary file first, but this would not always be desired or even possible. Perhaps an alternative function could be added if the idea is popular enough? I would expect copyfile(), like move(), to fail if the destination exists. This is not the current behaviour, so both functions could benefit from this. >From what I've read, there are several ways of ensuring that these functions fail if destination exists depending on the platform and filesystem, but there is no uniform way to do it. 
One approach would be to try all possible methods and hope that at least one works, with a simple "if os.exists(dst): fail" fallback. The documentation would state that "An exception occurs if the destination exists. This check is done is as safe a way possible to avoid race conditions where the system supports it." An additional measure of safety on copyfile() would be to write to a temporary file first, then use move. This would allow rollback in case of failure during the copy, but as with open(), its not always the most appropriate approach. Adding new copyfile() and move() functions would mean also mean adding new copy(), copy2() and copytree() functions, perhaps as copy3() and copytree2(). This seems to be getting rather messy - three slightly different copy functions, so it might still be better to add an optional argument to these. Alternatively, a new module could be added dedicated to safe file operations. On Sun, Aug 14, 2011 at 5:23 PM, David Townshend wrote: > Sorry, yes. That is what I meant. > On Aug 14, 2011 5:12 PM, "Devin Jeanpierre" > wrote: > >> Why do you think that move on remote file systems use copy? From past > >> experience and recent tests I can confirm that shutil.move() uses rename > >> on remote CIFS and NFS file systems > > > > I believe what he meant to say was "if you move from one filesystem to > another". > > > > shutil.move tries to do a copy and delete if rename fails with an > OSError. > > > > Devin > > > > On Sun, Aug 14, 2011 at 10:39 AM, Christian Heimes > wrote: > >> Am 14.08.2011 16:09, schrieb David Townshend: > >>> It seems there's a second problem too - move on remote file systems use > copy > >>> rather than rename, so changing the implementation means changing it > for > >>> copy too, which is more difficult. Maybe the best option is to try to > apply > >>> some sort of locking mechanism, but I can't see how right now. > >> > >> Why do you think that move on remote file systems use copy? From past > >> experience and recent tests I can confirm that shutil.move() uses rename > >> on remote CIFS and NFS file systems. > >> > >> > >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas at python.org > >> http://mail.python.org/mailman/listinfo/python-ideas > >> > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jeanpierreda at gmail.com Mon Aug 15 13:42:51 2011 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Mon, 15 Aug 2011 07:42:51 -0400 Subject: [Python-ideas] Implementation of shutil.move In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> Message-ID: > open(), in my opinon, already behaves as it should. open(..., 'w') doesn't, it overwrites the target. For copying an individual file, you'd want os.open(..., O_EXCL | O_CREAT), which is a cross-platform, race-condition-free way of creating and writing to a single file provided it didn't exist before. > From what I've read, there are several ways of ensuring that these functions > fail if destination exists depending on the platform and filesystem, but > there is no uniform way to do it. One approach would be to try all possible > methods and hope that at least one works, with a simple "if os.exists(dst): > fail" fallback. 
The documentation would state that "An exception occurs if > the destination exists. This check is done is as safe a way possible to > avoid race conditions where the system supports it." There are two attitudes to take when using the don't-overwrite argument: 1. User cares about safety and race conditions, is glad that this function tries to be safe. 2. User doesn't care about safety or race conditions, as User doesn't have shell scripts creating files at inconvenient times and tempting fate. In case 1, having it silently revert to the unsafe version is very bad, and potentially damaging. In case 2, the user doesn't care about the safe version. In fact, User 2 is probably using os.path.exists already. If it can't do things safely, it shouldn't do them at all. Devin On Mon, Aug 15, 2011 at 6:07 AM, David Townshend wrote: > I''ve done some more reading and this is where I've got to: > shutil.move() uses either os.rename() or shutil.copyfile() (via > shutil.copytree()) depending on the situation. shutil.copyfile() uses > open(). ?So to implement a safe move function it would seem to be necessary > to do the same to copyfile(), and possibly open(). > open(), in my opinon, already behaves as it should. ?It would be possible to > add a slightly safer implementation by writing to a temporary file first, > but this would not always be desired or even possible. Perhaps an > alternative function could be added if the idea is popular enough? > I would expect copyfile(), like move(), to fail if the destination exists. > ?This is not the current behaviour, so both functions could benefit from > this. > From what I've read, there are several ways of ensuring that these functions > fail if destination exists depending on the platform and filesystem, but > there is no uniform way to do it. One approach would be to try all possible > methods and hope that at least one works, with a simple "if os.exists(dst): > fail" fallback. ?The documentation would state that "An exception occurs if > the destination exists. ?This check is done is as safe a way possible to > avoid race conditions where the system supports it." ?An additional measure > of safety on copyfile() would be to write to a temporary file first, then > use move. This would allow rollback in case of failure during the copy, but > as with open(), its not always the most appropriate approach. > Adding new copyfile() and move() functions would mean also mean adding new > copy(), copy2() and copytree() functions, perhaps as copy3() and > copytree2(). This seems to be getting rather messy - three slightly > different copy functions, so it might still be better to add an optional > argument to these. ?Alternatively, a new module could be added dedicated to > safe file operations. > On Sun, Aug 14, 2011 at 5:23 PM, David Townshend > wrote: >> >> Sorry, yes. That is what I meant. >> >> On Aug 14, 2011 5:12 PM, "Devin Jeanpierre" >> wrote: >> >> Why do you think that move on remote file systems use copy? From past >> >> experience and recent tests I can confirm that shutil.move() uses >> >> rename >> >> on remote CIFS and NFS file systems >> > >> > I believe what he meant to say was "if you move from one filesystem to >> > another". >> > >> > shutil.move tries to do a copy and delete if rename fails with an >> > OSError. 
>> >
>> > Devin
>> >
>> > On Sun, Aug 14, 2011 at 10:39 AM, Christian Heimes >> wrote:
>> >> Am 14.08.2011 16:09, schrieb David Townshend:
>> >>> It seems there's a second problem too - move on remote file systems
>> >>> uses copy
>> >>> rather than rename, so changing the implementation means changing it
>> >>> for
>> >>> copy too, which is more difficult. Maybe the best option is to try to
>> >>> apply
>> >>> some sort of locking mechanism, but I can't see how right now.
>> >>
>> >> Why do you think that move on remote file systems uses copy? From past
>> >> experience and recent tests I can confirm that shutil.move() uses
>> >> rename
>> >> on remote CIFS and NFS file systems.
>> >>
>> >>
>> >> _______________________________________________
>> >> Python-ideas mailing list
>> >> Python-ideas at python.org
>> >> http://mail.python.org/mailman/listinfo/python-ideas
>> >>
>> > _______________________________________________
>> > Python-ideas mailing list
>> > Python-ideas at python.org
>> > http://mail.python.org/mailman/listinfo/python-ideas
>
>

From ncoghlan at gmail.com Mon Aug 15 14:17:11 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 15 Aug 2011 22:17:11 +1000
Subject: [Python-ideas] Implementation of shutil.move
In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com>
Message-ID:

On Mon, Aug 15, 2011 at 9:42 PM, Devin Jeanpierre wrote:
> If it can't do things safely, it shouldn't do them at all.

This pretty much sums up the reason why the standard lib is lacking in
this area - it's hard to do anything more than the unsafe basics that
isn't providing a *false* sense of security for at least some use
cases. Good exploration of the possibilities, though.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From aquavitae69 at gmail.com Mon Aug 15 14:20:23 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Mon, 15 Aug 2011 14:20:23 +0200
Subject: [Python-ideas] Implementation of shutil.move
In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com>
Message-ID:

On Mon, Aug 15, 2011 at 1:42 PM, Devin Jeanpierre wrote:
>
> > open(), in my opinion, already behaves as it should.
>
> open(..., 'w') doesn't, it overwrites the target. For copying an
> individual file, you'd want os.open(..., O_EXCL | O_CREAT), which is a
> cross-platform, race-condition-free way of creating and writing to a
> single file provided it didn't exist before.
>
Good point. Perhaps the best way of improving this would be to add a 'c'
mode to the builtin open() for creating new files, so that typical usage
would be open(file, 'cw'). Or maybe 'c' should imply 'w', since there seems
little point in creating a new file read-only.

> > From what I've read, there are several ways of ensuring that these
> functions
> > fail if destination exists depending on the platform and filesystem, but
> > there is no uniform way to do it. One approach would be to try all
> possible
> > methods and hope that at least one works, with a simple "if
> os.path.exists(dst):
> > fail" fallback. The documentation would state that "An exception occurs
> if
> > the destination exists. This check is done in as safe a way as possible to
> > avoid race conditions where the system supports it."
>
> There are two attitudes to take when using the don't-overwrite argument:
>
> 1.
User cares about safety and race conditions, is glad that this
> function tries to be safe.
>
> 2. User doesn't care about safety or race conditions, as User doesn't
> have shell scripts creating files at inconvenient times and tempting
> fate.
>
> In case 1, having it silently revert to the unsafe version is very
> bad, and potentially damaging. In case 2, the user doesn't care about
> the safe version. In fact, User 2 is probably using os.path.exists
> already.
>
> If it can't do things safely, it shouldn't do them at all.

So we provide an alternate safe implementation, which may not work on all
systems? And it will be up to the user to decide whether to fall back to the
unsafe functions?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From lists at cheimes.de Mon Aug 15 14:33:23 2011
From: lists at cheimes.de (Christian Heimes)
Date: Mon, 15 Aug 2011 14:33:23 +0200
Subject: [Python-ideas] Implementation of shutil.move
In-Reply-To: References: <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com>
Message-ID: <4E491213.1000305@cheimes.de>

Am 15.08.2011 14:17, schrieb Nick Coghlan:
> On Mon, Aug 15, 2011 at 9:42 PM, Devin Jeanpierre > wrote:
>> If it can't do things safely, it shouldn't do them at all.
>
> This pretty much sums up the reason why the standard lib is lacking in
> this area - it's hard to do anything more than the unsafe basics that
> isn't providing a *false* sense of security for at least some use
> cases. Good exploration of the possibilities, though.

I have to agree with Nick. The worst mistake we could make is to declare
something as secure although it is flawed. Most people don't have to worry
about race conditions when renaming a file. The majority of Python apps
aren't working in a hostile environment (e.g. hostile users should not be
able to modify directories) or don't need more security than a shell
script. Let's hope that the few system level programs are written by
professionals.

Another good thing is the fact that rename(2) already takes symlink
attacks into account. It doesn't follow a symlink at newpath but replaces
the link instead.

Christian

From lists at cheimes.de Mon Aug 15 14:40:13 2011
From: lists at cheimes.de (Christian Heimes)
Date: Mon, 15 Aug 2011 14:40:13 +0200
Subject: [Python-ideas] Implementation of shutil.move
In-Reply-To: References: <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com>
Message-ID: <4E4913AD.2080103@cheimes.de>

Am 15.08.2011 14:20, schrieb David Townshend:
> Good point. Perhaps the best way of improving this would be to add a 'c'
> mode to the builtin open() for creating new files, so that typical usage
> would be open(file, 'cw'). Or maybe 'c' should imply 'w', since there
> seems
> little point in creating a new file read-only.

For Python 2 it's not possible because Python's internal file API uses
fopen(3). fopen(3) has a limited set of flags and exclusive create isn't
one of them. You'd have to rewrite larger parts of the API.

Python 3's io library is built around file descriptors returned from
open(2). It shouldn't be hard to write a 'c' flag for io.
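What such a 'c' flag would add to open() can already be spelled today by
dropping down to the descriptor level. A minimal sketch; the helper name
open_create is illustrative, not an existing API:

    import os

    def open_create(path):
        # O_CREAT | O_EXCL makes the underlying open(2) call fail with
        # EEXIST if path already exists, atomically, so there is no
        # window between the existence check and the creation.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
        return os.fdopen(fd, 'w')

A caller distinguishes "already there" from other failures by catching
OSError and checking for errno.EEXIST.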
Christian

From aquavitae69 at gmail.com Mon Aug 15 14:59:14 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Mon, 15 Aug 2011 14:59:14 +0200
Subject: [Python-ideas] Implementation of shutil.move
In-Reply-To: <4E4913AD.2080103@cheimes.de>
References: <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com> <4E4913AD.2080103@cheimes.de>
Message-ID:

I'll have a look at open() later and see if I can write a patch for Python
3. I'm not sure it's worth that amount of work for Python 2 at this stage
in its lifecycle.

I'll also have a look at whether it's possible to provide a safe
alternative to move and copyfile, even if it cannot be used on all systems.
I'm not thinking of actively hostile environments so much as multi-user
situations.

On Mon, Aug 15, 2011 at 2:40 PM, Christian Heimes wrote:
> Am 15.08.2011 14:20, schrieb David Townshend:
> > Good point. Perhaps the best way of improving this would be to add a 'c'
> > mode to the builtin open() for creating new files, so that typical usage
> > would be open(file, 'cw'). Or maybe 'c' should imply 'w', since there
> seems
> > little point in creating a new file read-only.
>
> For Python 2 it's not possible because Python's internal file API uses
> fopen(3). fopen(3) has a limited set of flags and exclusive create isn't
> one of them. You'd have to rewrite larger parts of the API.
>
> Python 3's io library is built around file descriptors returned from
> open(2). It shouldn't be hard to write a 'c' flag for io.
>
> Christian
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From lists at cheimes.de Mon Aug 15 15:29:41 2011
From: lists at cheimes.de (Christian Heimes)
Date: Mon, 15 Aug 2011 15:29:41 +0200
Subject: [Python-ideas] Implementation of shutil.move
In-Reply-To: References: <4E46AFA9.8080900@mrabarnett.plus.com> <4E4913AD.2080103@cheimes.de>
Message-ID: <4E491F45.6020001@cheimes.de>

Am 15.08.2011 14:59, schrieb David Townshend:
> I'll have a look at open() later and see if I can write a patch for Python
> 3. I'm not sure it's worth that amount of work for Python 2 at this stage
> in its lifecycle.
>
> I'll also have a look at whether it's possible to provide a safe
> alternative to move and copyfile, even if it cannot be used on all
> systems. I'm not thinking of actively hostile environments so much as
> multi-user situations.

You have to modify at least the C functions
Modules/_io/_iomodule.c:io_open()
Modules/_io/fileio.c:fileio_init()
as well as the pure Python implementation
Lib/_pyio.py
to implement the 'c' mode. I like the idea and I've been missing the
feature for a long time.

This may sound harsh. If your proposed changes don't survive hostile
environments then there is no reason to implement them at all. It's the
false sense of security Nick was talking about earlier. At best your
solution is slightly less insecure but still insecure and a loophole for
exploits. IMHO you should update the docs and explain why and how some
operations are subject to race conditions.

Christian

From jeanpierreda at gmail.com Mon Aug 15 15:41:05 2011
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Mon, 15 Aug 2011 09:41:05 -0400
Subject: [Python-ideas] Implementation of shutil.move
In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com>
Message-ID:

> Good point.
Perhaps the best way of improving this would be to add a 'c' > mode to the builtin open() for creating new files, so that typical usage > would be open(file, 'cw'). Or maybe 'c' should imply 'w', since there seems > little point in creating a new file read-only. The normal open() always did seem a bit weak, and this is probably the biggest use-case that it doesn't cover. +1 > So we provide an alternate safe implementation, which may not work on all > systems? And it will be up to the user to decide whether to fall back to the > unsafe functions? Well, that's basically what I was getting at. I don't like the idea of silently falling back to the unsafe thing one bit. It wouldn't be so bad to have something that tries to do it without any race conditions etc., and raises an exception if this isn't possible. Devin On Mon, Aug 15, 2011 at 8:20 AM, David Townshend wrote: > > > On Mon, Aug 15, 2011 at 1:42 PM, Devin Jeanpierre > wrote: >> >> > open(), in my opinon, already behaves as it should. >> >> open(..., 'w') doesn't, it overwrites the target. For copying an >> individual file, you'd want os.open(..., O_EXCL | O_CREAT), which is a >> cross-platform, race-condition-free way of creating and writing to a >> single file provided it didn't exist before. >> > Good point. ?Perhaps the best way of improving this would be to add a 'c' > mode to the builtin open() for creating new files, so that typical usage > would be open(file, 'cw'). ?Or maybe 'c' should imply 'w', since there seems > little point in creating a new file read-only. > >> >> > From what I've read, there are several ways of ensuring that these >> > functions >> > fail if destination exists depending on the platform and filesystem, but >> > there is no uniform way to do it. One approach would be to try all >> > possible >> > methods and hope that at least one works, with a simple "if >> > os.exists(dst): >> > fail" fallback. ?The documentation would state that "An exception occurs >> > if >> > the destination exists. ?This check is done is as safe a way possible to >> > avoid race conditions where the system supports it." >> >> There are two attitudes to take when using the don't-overwrite argument: >> >> 1. User cares about safety and race conditions, is glad that this >> function tries to be safe. >> >> 2. User doesn't care about safety or race conditions, as User doesn't >> have shell scripts creating files at inconvenient times and tempting >> fate. >> >> In case 1, having it silently revert to the unsafe version is very >> bad, and potentially damaging. In case 2, the user doesn't care about >> the safe version. In fact, User 2 is probably using os.path.exists >> already. >> >> If it can't do things safely, it shouldn't do them at all. > > So we provide an alternate safe implementation, which may not work on all > systems? And it will be up to the user to decide whether to fall back to the > unsafe functions? 
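A sketch of the kind of operation Devin describes, which either creates the
destination exclusively or raises, with no silent fallback. The helper name
copy_noclobber is illustrative, not an existing shutil API, and error
handling is kept minimal:

    import os
    import shutil

    def copy_noclobber(src, dst):
        # O_CREAT | O_EXCL makes creating dst atomic: if dst already
        # exists, os.open() raises OSError (errno EEXIST) instead of
        # quietly truncating it, so nothing is overwritten.
        fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
        with os.fdopen(fd, 'wb') as fdst:
            with open(src, 'rb') as fsrc:
                shutil.copyfileobj(fsrc, fdst)

On filesystems where O_EXCL is not reliable (old NFS is the usual example),
no pure-Python wrapper can fix that, which is the caveat discussed in this
thread.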
From aquavitae69 at gmail.com Mon Aug 15 15:58:45 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Mon, 15 Aug 2011 15:58:45 +0200
Subject: [Python-ideas] Implementation of shutil.move
In-Reply-To: References: <2018D5D2-0D1D-4ABF-B32D-0F151A36C5DF@masklinn.net> <4E455967.8050305@cheimes.de> <4E46AFA9.8080900@mrabarnett.plus.com>
Message-ID:

> You have to modify at least the C functions
> Modules/_io/_iomodule.c:io_open()
> Modules/_io/fileio.c:fileio_init()
> as well as the pure Python implementation
> Lib/_pyio.py

Thanks for the info - it will save me looking for it :-)

> Well, that's basically what I was getting at. I don't like the idea of
> silently falling back to the unsafe thing one bit. It wouldn't be so
> bad to have something that tries to do it without any race conditions
> etc., and raises an exception if this isn't possible.

> This may sound harsh. If your proposed changes don't survive hostile
> environments then there is no reason to implement them at all. It's the
> false sense of security Nick was talking about earlier. At best your
> solution is slightly less insecure but still insecure and a loophole for
> exploits. IMHO you should update the docs and explain why and how some
> operations are subject to race conditions.

So a new function, say safe_copy(), tries to copy securely. If it can't,
then an exception is raised. The user can then do something like:

    try:
        safe_copy(src, dst)
    except Error:
        logging.warning('Unsafe copy in progress')
        copy2(src, dst)

My question now is whether there is really a need for this. The other
option is, as Christian says, to document the problem and perhaps present
a recipe for avoiding it.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From steve at pearwood.info Mon Aug 15 19:10:33 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 16 Aug 2011 03:10:33 +1000
Subject: [Python-ideas] Implementation of shutil.move
In-Reply-To: <4E491F45.6020001@cheimes.de>
References: <4E46AFA9.8080900@mrabarnett.plus.com> <4E4913AD.2080103@cheimes.de> <4E491F45.6020001@cheimes.de>
Message-ID: <4E495309.5070200@pearwood.info>

Christian Heimes wrote:
> This may sound harsh. If your proposed changes don't survive hostile
> environments then there is no reason to implement them at all. It's the
> false sense of security Nick was talking about earlier. At best your
> solution is slightly less insecure but still insecure and a loophole for
> exploits. IMHO you should update the docs and explain why and how some
> operations are subject to race conditions.

Security against hostile attacks is not the only value for a so-called
"safe move". There is also security against accidental collisions. I
currently have about 100 processes running as me (excluding system
processes), and some of them write to files. Sometimes I have a few
scripts running which write to a *lot* of files. I'd like a little more
protection from accidental collisions, even if it's not foolproof.

But please don't call the function "safe_move", since it isn't safe.
Better a bland name like "move2", and full disclosure of what it can and
can't protect you from, than a misleading name that gives a false
impression.
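One way a "move2" along these lines can refuse to clobber atomically, on
POSIX and within a single filesystem, is the link-then-unlink idiom;
link(2) fails with EEXIST rather than overwriting, which is exactly the
guarantee wanted here. A sketch only; it does not cover directories,
Windows, or cross-device moves, and the name move2 is borrowed from the
message above:

    import os

    def move2(src, dst):
        # os.link() never replaces an existing dst: the kernel performs
        # the existence check and the creation as one atomic step, so a
        # concurrent writer cannot slip a file in between.
        os.link(src, dst)
        os.unlink(src)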
--
Steven

From guido at python.org Mon Aug 15 19:24:47 2011
From: guido at python.org (Guido van Rossum)
Date: Mon, 15 Aug 2011 10:24:47 -0700
Subject: [Python-ideas] Implementation of shutil.move
In-Reply-To: <4E495309.5070200@pearwood.info>
References: <4E46AFA9.8080900@mrabarnett.plus.com> <4E4913AD.2080103@cheimes.de> <4E491F45.6020001@cheimes.de> <4E495309.5070200@pearwood.info>
Message-ID:

I know I'm late to the party, but I'm surprised that the existing
behavior of move() is considered problematic. It carefully mimics the
default behavior of the Unix "mv" command.

Doing a move or rename atomically with the provision that it fails
reliably when the target exists might be useful in some cases, but it
seems to be more related to filesystem-based locking, which is known
to be hard in a cross-platform way. (It also seems that some folks on
the thread have ignored Windows in their use of the term
"cross-platform".)

--
--Guido van Rossum (python.org/~guido)

From masklinn at masklinn.net Mon Aug 15 22:10:56 2011
From: masklinn at masklinn.net (Masklinn)
Date: Mon, 15 Aug 2011 22:10:56 +0200
Subject: [Python-ideas] Implementation of shutil.move
In-Reply-To: References: <4E46AFA9.8080900@mrabarnett.plus.com> <4E4913AD.2080103@cheimes.de> <4E491F45.6020001@cheimes.de> <4E495309.5070200@pearwood.info>
Message-ID:

On 2011-08-15, at 19:24 , Guido van Rossum wrote:
>
> Doing a move or rename atomically with the provision that it fails
> reliably when the target exists might be useful in some cases, but it
> seems to be more related to filesystem-based locking, which is known
> to be hard in a cross-platform way. (It also seems that some folks on
> the thread have ignored Windows in their use of the term
> "cross-platform".)

According to the documentation for `os.rename`, it already fails on
Windows if the target exists (though I do not know if it also fails on
e.g. NFS or across filesystems), which seems to be the behavior desired
for "move2". So there is no real reason to discuss it, unless there are
tricky edge cases in its behavior.

From aquavitae69 at gmail.com Mon Aug 15 22:15:05 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Mon, 15 Aug 2011 22:15:05 +0200
Subject: [Python-ideas] Implementation of shutil.move
In-Reply-To: References: <4E46AFA9.8080900@mrabarnett.plus.com> <4E4913AD.2080103@cheimes.de> <4E491F45.6020001@cheimes.de> <4E495309.5070200@pearwood.info>
Message-ID:

On Mon, Aug 15, 2011 at 7:24 PM, Guido van Rossum wrote:

> I know I'm late to the party, but I'm surprised that the existing
> behavior of move() is considered problematic. It carefully mimics the
> default behavior of the Unix "mv" command.
>

But "mv" does allow the "-i" (interactive) and "-n" (no-clobber)
arguments, which move() doesn't.

> Doing a move or rename atomically with the provision that it fails
> reliably when the target exists might be useful in some cases, but it
> seems to be more related to filesystem-based locking, which is known
> to be hard in a cross-platform way. (It also seems that some folks on
> the thread have ignored Windows in their use of the term
> "cross-platform".)

That's certainly what I've found in trying to do it! But it seems that it
should be possible on most systems, just each in a different way. On
Windows, it should be as easy as trying os.open(file, os.O_EXCL|os.O_CREAT)
first since Windows appears to lock a file to a process when it's opened.
Linux ext file systems (arguably the most common) support an immutable
attribute, also effectively locking the file.
I think that reiserFS and XFS both have similar features. Unfortunately I
have no experience with OS X or HFS+ so I'm not sure how it would work on
a Mac. This is quite a lot to implement though, so whether it's worth it
is another matter... especially since it would be far easier (although
slower) to just copy/remove if it really is a problem.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From aquavitae69 at gmail.com Tue Aug 16 17:45:50 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Tue, 16 Aug 2011 17:45:50 +0200
Subject: [Python-ideas] Add create mode to open()
Message-ID:

This idea was proposed in the discussion on shutil.move, but I thought it
would be worth posting separately to avoid confusing the discussion.

The idea is to add a create mode ('c') to the builtin open() function,
which will have the same effect as os.open(file, os.O_EXCL|os.O_CREAT). I
have added an issue (http://bugs.python.org/issue12760) for this,
including a patch.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From benjamin at python.org Tue Aug 16 23:26:23 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Tue, 16 Aug 2011 21:26:23 +0000 (UTC)
Subject: [Python-ideas] Add create mode to open()
References: Message-ID:

David Townshend writes:
>
> This idea was proposed in the discussion on shutil.move, but I thought it
would be worth posting separately to avoid confusing the discussion.
>
> The idea is to add a create mode ('c') to the builtin open() function, which
will have the same effect as os.open(file, os.O_EXCL|os.O_CREAT). I have added
an issue (http://bugs.python.org/issue12760) for this, including a patch.

I am -1 because

- Possibly not portable or at least subject to implementations of
varying quality.
- No precedent in other languages or fopen() for that matter.
- It's not hard to use os.fdopen().

From fuzzyman at gmail.com Tue Aug 16 23:33:26 2011
From: fuzzyman at gmail.com (Michael Foord)
Date: Tue, 16 Aug 2011 22:33:26 +0100
Subject: [Python-ideas] Syntax for dedented strings
Message-ID:

Hello all,

I'm sure this will be shot down [1], but it annoys me often enough that
I'm going to suggest it anyway. :-)

A recent tweet of Raymond's reminded me how useful textwrap.dedent() is
for dedenting triple quoted strings in code blocks:

def function():
    this_string = textwrap.dedent("""\
        Here is some indented text.
        that dedent will handle for us."""
    )

Unfortunately that doesn't work for docstrings as they must be string
literals. It is compounded by the fact that you can't even create the
docstring for a class and manually assign it later. (Why not? But that's
another issue...)

How about *another* string prefix for dedented strings:

class Thing(object):
    d"""
    This text will be,
    nicely dedented,
    thank you very much.
    """

All the best,

Michael Foord

[1] Because of the -100 rule as much as anything else, which applies
doubly to features requiring new syntax
https://blogs.msdn.com/b/ericgu/archive/2004/01/12/57985.aspx

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.

-- the sqlite blessing http://www.sqlite.org/different.html
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From bruce at leapyear.org Tue Aug 16 23:42:10 2011 From: bruce at leapyear.org (Bruce Leban) Date: Tue, 16 Aug 2011 14:42:10 -0700 Subject: [Python-ideas] Syntax for dedented strings In-Reply-To: References: Message-ID: This has been discussed a few times in the past. http://mail.python.org/pipermail/python-ideas/2011-May/010207.html triple-quoted strings and indendation http://mail.python.org/pipermail/python-ideas/2010-November/008589.html Multi-line strings that respect indentation --- Bruce Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com On Tue, Aug 16, 2011 at 2:33 PM, Michael Foord wrote: > Hello all, > > I'm sure this will be shot down [1], but it annoys me often enough that I'm > going to suggest it anyway. :-) > > A recent tweet of Raymond's reminded me how useful textwrap.dedent() is for > dedenting triple quoted strings in code blocks: > > def function(): > this_string = textwrap.dedent("""\ > Here is some indented text. > that dedent will handle for us.""" > ) > > Unfortunately that doesn't work for docstrings as they must be string > literals. It is compounded by the fact that you can't even create the > docstring for a class and manually assign it later. (Why not? But that's > another issue...) > > How about *another* string prefix for dedented strings: > > class Thing(object): > d""" > This text will be, > nicely dedented, > thank you very much. > """" > > All the best, > > Michael Foord > > [1] Because of the -100 rule as much as anything else, which applies doubly > to features requiring new syntax > https://blogs.msdn.com/b/ericgu/archive/2004/01/12/57985.aspx > > -- > > http://www.voidspace.org.uk/ > > May you do good and not evil > May you find forgiveness for yourself and forgive others > May you share freely, never taking more than you give. > > > -- the sqlite blessing http://www.sqlite.org/different.html > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben+python at benfinney.id.au Tue Aug 16 23:58:36 2011 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 17 Aug 2011 07:58:36 +1000 Subject: [Python-ideas] Syntax for dedented strings References: Message-ID: <87mxf9c92r.fsf@benfinney.id.au> Michael Foord writes: > How about *another* string prefix for dedented strings: > > class Thing(object): > d""" > This text will be, > nicely dedented, > thank you very much. > """" class Thing(object): """ This literal string contains leading and trailing whitespace. It also is indented. But none of that will show up when the docstring is processed and presented to the user. Because it is a docstring that conforms to PEP 257, the indentation will be handled properly by PEP-257 conformant docstring processors. _ gives the specification for how indentation shall be handled by code that processes Python docstrings. The programmer inspecting the value of ?__doc__? will still see the leading, trailing, and indenting whitespace; but the programmer doing so isn't the recipient of the docstring as a docstring. So I don't see what problem there is to be solved. """ pass -- \ ?Guaranteed to work throughout its useful life.? 
?packaging for | `\ clockwork toy, Hong Kong | _o__) | Ben Finney From fuzzyman at gmail.com Wed Aug 17 00:43:29 2011 From: fuzzyman at gmail.com (Michael Foord) Date: Tue, 16 Aug 2011 23:43:29 +0100 Subject: [Python-ideas] Syntax for dedented strings In-Reply-To: <87mxf9c92r.fsf@benfinney.id.au> References: <87mxf9c92r.fsf@benfinney.id.au> Message-ID: On 16 August 2011 22:58, Ben Finney wrote: > Michael Foord writes: > > > How about *another* string prefix for dedented strings: > > > > class Thing(object): > > d""" > > This text will be, > > nicely dedented, > > thank you very much. > > """" > > class Thing(object): > """ This literal string contains leading and trailing whitespace. > > It also is indented. But none of that will show up when the > docstring is processed and presented to the user. > > Because it is a docstring that conforms to PEP 257, the > indentation will be handled properly by PEP-257 conformant > docstring processors. > > < > http://www.python.org/dev/peps/pep-0257/#handling-docstring-indentation>_ > gives the specification for how indentation shall be handled by > code that processes Python docstrings. > The place I'm concerned about is the interactive interpreter, virtually the only place I look at docstrings that isn't directly in the source code (or pre-processed by a doc tool - but I don't care about that). Michael > > The programmer inspecting the value of ?__doc__? will still see > the leading, trailing, and indenting whitespace; but the > programmer doing so isn't the recipient of the docstring as a > docstring. > > So I don't see what problem there is to be solved. > > """ > pass > > -- > \ ?Guaranteed to work throughout its useful life.? ?packaging for | > `\ clockwork toy, Hong Kong | > _o__) | > Ben Finney > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Aug 17 00:43:57 2011 From: guido at python.org (Guido van Rossum) Date: Tue, 16 Aug 2011 15:43:57 -0700 Subject: [Python-ideas] Add create mode to open() In-Reply-To: References: Message-ID: On Tue, Aug 16, 2011 at 2:26 PM, Benjamin Peterson wrote: > David Townshend writes: > >> >> >> This idea was proposed in the discussion on shutil.move, but I thought it > would be worth posting separately to avoid confusing the discussion. >> >> The idea is to add a create mode ('c') to the builtin open() function, which > will have the same effect as os.open(file, os.O_EXCL|os.O_CREAT). ?I have added > an issue (http://bugs.python.org/issue12760) for this, including a patch. > > I am -1 because > > - Possibly not portable or at least subject to implementations of > varying quality. > - No precedence in other languages or fopen() for that matter. > - It's not hard to use os.fdopen(). Agreed. Also I think that in most cases the right thing to do is to quietly overwrite the file. If you're implementing a UI where you want look-before-you-leap, the app should code an explicit test so it can issue a proper error message (the exception will not be fun for the user :-). 
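The explicit test Guido has in mind might look like the sketch below; a
hypothetical save_as helper, not an existing API. Note that the check and
the open are separate steps, so, as later messages point out, this is
friendly rather than race-free:

    import os

    def save_as(path, data):
        # LBYL check so the UI can show a sensible message; another
        # process could still create path between these two calls.
        if os.path.exists(path):
            raise ValueError('%r already exists' % path)
        with open(path, 'w') as f:
            f.write(data)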
-- --Guido van Rossum (python.org/~guido) From bruce at leapyear.org Wed Aug 17 01:18:12 2011 From: bruce at leapyear.org (Bruce Leban) Date: Tue, 16 Aug 2011 16:18:12 -0700 Subject: [Python-ideas] Add create mode to open() In-Reply-To: References: Message-ID: On Tue, Aug 16, 2011 at 3:43 PM, Guido van Rossum wrote: > Agreed. Also I think that in most cases the right thing to do is to > quietly overwrite the file. If you're implementing a UI where you want > look-before-you-leap, the app should code an explicit test so it can > issue a proper error message (the exception will not be fun for the > user :-). > > This isn't look before you leap. It's catching a failure when the file can't be created, just as you would if you don't have permission to write a file in that directory or the file already exists and is locked so you don't have permission to overwrite it. It just adds one more reason the file can't be written. The app should already have code to translate those exceptions into human-readable error messages so I don't think that's a good objection. (If the implementation of this function in some OS needs LBYL then I think that's an unfortunate defect in those OS.) I've had enough working with programs that do things like silently eat exceptions and I consider silently overwriting a file in the same class. --- Bruce Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Aug 17 01:46:49 2011 From: guido at python.org (Guido van Rossum) Date: Tue, 16 Aug 2011 16:46:49 -0700 Subject: [Python-ideas] Add create mode to open() In-Reply-To: References: Message-ID: On Tue, Aug 16, 2011 at 4:18 PM, Bruce Leban wrote: > > On Tue, Aug 16, 2011 at 3:43 PM, Guido van Rossum wrote: >> >> Agreed. Also I think that in most cases the right thing to do is to >> quietly overwrite the file. If you're implementing a UI where you want >> look-before-you-leap, the app should code an explicit test so it can >> issue a proper error message (the exception will not be fun for the >> user :-). >> > > This isn't look before you leap. It's catching a failure when the file can't > be created, just as you would if you don't have permission to write a file > in that directory or the file already exists and is locked so you don't have > permission to overwrite it. It just adds one more reason the file can't be > written. So what's the use case? In general when using a command line environment overwriting the file is what you *want* to happen. Like with Unix "foo >bar". I don't even think there *is* a shell syntax for not overwriting an existing file, though you can use >> to append instead of overwrite -- this is open(filename, 'a'). > The app should already have code to translate those exceptions into > human-readable error messages so I don't think that's a good objection. > (If the implementation of this function in some OS needs LBYL then I think > that's an unfortunate defect in those OS.) Agreed. > I've had enough working with programs that do things like silently eat > exceptions and I consider silently overwriting a file in the same class. Always? How would you update an existing file if you can't overwrite files? 
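For reference, the shell redirections being compared map onto Python's
open() modes roughly as follows; a sketch of correspondences, not an
exhaustive table:

    # foo >bar    ~   open('bar', 'w')   # truncate, or create if missing
    # foo >>bar   ~   open('bar', 'a')   # append, or create if missing
    # With the shell's noclobber option set, "foo >bar" refuses to replace
    # bar; the nearest Python spelling today is
    #     os.open('bar', os.O_WRONLY | os.O_CREAT | os.O_EXCL)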
-- --Guido van Rossum (python.org/~guido) From ben+python at benfinney.id.au Wed Aug 17 01:55:45 2011 From: ben+python at benfinney.id.au (Ben Finney) Date: Wed, 17 Aug 2011 09:55:45 +1000 Subject: [Python-ideas] Syntax for dedented strings References: <87mxf9c92r.fsf@benfinney.id.au> Message-ID: <87ippwdi7y.fsf@benfinney.id.au> Michael Foord writes: > The place I'm concerned about is the interactive interpreter, > virtually the only place I look at docstrings that isn't directly in > the source code (or pre-processed by a doc tool - but I don't care > about that). Right, and the interactive interpreter's ?help? follows the PEP 257 specification for handling whitespace in docstrings. If it doesn't, I'd consider that a bug to be fixed. On the other hand, if you're doing things like ?print foo.__doc__?, I don't think we need to do anything special to support that. You have ?help? available in the interpreter to nicely format an object's docstrings, right? -- \ ?Let others praise ancient times; I am glad I was born in | `\ these.? ?Ovid (43 BCE?18 CE) | _o__) | Ben Finney From steve at pearwood.info Wed Aug 17 01:36:51 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 17 Aug 2011 09:36:51 +1000 Subject: [Python-ideas] Syntax for dedented strings In-Reply-To: References: Message-ID: <4E4AFF13.9070903@pearwood.info> Michael Foord wrote: > How about *another* string prefix for dedented strings: > > class Thing(object): > d""" > This text will be, > nicely dedented, > thank you very much. > """" I think you need to explain the problem being solved, and your requirements, in a bit more detail here. As I understand it, the above is equivalent to this: class Thing(object): """ This text will be, nicely dedented, thank you very much. """ except that it looks indented in the source file. Compare that to the usual practice: class Thing(object): """ This text will be, nicely dedented, thank you very much. """ and the only difference is a bunch of leading spaces. If all you do with the docstrings is read them with help() in the interactive interpreter, why do you care about saving a few spaces? help() calls pydoc, which does its own reformatting of the docstring. Unless you regularly inspect instance.__doc__ by hand (not via help), I'm not sure what you hope to accomplish here. -- Steven From bruce at leapyear.org Wed Aug 17 02:12:25 2011 From: bruce at leapyear.org (Bruce Leban) Date: Tue, 16 Aug 2011 17:12:25 -0700 Subject: [Python-ideas] Add create mode to open() In-Reply-To: References: Message-ID: On Tue, Aug 16, 2011 at 4:46 PM, Guido van Rossum wrote: > > So what's the use case? In general when using a command line > environment overwriting the file is what you *want* to happen. Like > with Unix "foo >bar". I don't even think there *is* a shell syntax for > not overwriting an existing file, though you can use >> to append > instead of overwrite -- this is open(filename, 'a'). > We weren't just discussing command line tools. That said, I'm sure I'm not the only person on this list who has inadvertently overwritten a file using foo > bar. While we may have grown accustomed to this behavior and some people (like you) may even consider it desirable, not everyone does. > > I've had enough working with programs that do things like silently eat > > exceptions and I consider silently overwriting a file in the same class. > > Always? How would you update an existing file if you can't overwrite files? > I didn't say never overwrite. 
What I don't like is programs overwriting
files without explicitly intending to do that. Yes, there's a long legacy
of overwriting files without warning or intent. I suppose I'm fighting an
uphill battle (and it's not my highest priority complaint about bad code
for that matter).

--- Bruce
Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From lyricconch at gmail.com Wed Aug 17 02:46:31 2011
From: lyricconch at gmail.com (海韵)
Date: Wed, 17 Aug 2011 08:46:31 +0800
Subject: [Python-ideas] multiple objects handling
In-Reply-To: References: <4E4591CA.10302@altlinux.ru> <4E459F7C.9050706@scottdial.com> <4E45A6BD.4020108@altlinux.ru>
Message-ID:

= =!, try this:

    @vars
    class _DispatchTable():
        def add(...): pass
        def sub(...): pass

    def process(..., Switch=_DispatchTable):
        ...
        val = Switch[action](...)
        ...

2011/8/13 Nick Coghlan :
> On Sat, Aug 13, 2011 at 8:18 AM, Peter V. Saveliev wrote:
>>> Have you actually profiled the performance of your program? I would
>>> guess that the time spent dispatching through a dictionary is dwarfed by
>>> the time spent constructing those event objects and the ultimate
>>> processing of them.
>>
>> You're right, but having really huge stream of packets I try to minimize
>> any overhead, using C modules, ctypes library and so on. But I hope to
>> save high-level logic in Python, and before to reject a chance to speed
>> up parsing, I asked for some alternatives and contras. I believe that
>> method calls here are unnecessary while the "if" statements tree makes
>> the code hard to support.
>
> This kind of logic micro-optimisation is best handled by a JIT
> compiler rather than messing with the language definition and asking
> people to do it by hand. I suggest running your application on PyPy
> and seeing what kind of speed up you get.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From ben+python at benfinney.id.au Wed Aug 17 02:58:36 2011
From: ben+python at benfinney.id.au (Ben Finney)
Date: Wed, 17 Aug 2011 10:58:36 +1000
Subject: [Python-ideas] Add create mode to open()
References: Message-ID: <8762lwdfb7.fsf@benfinney.id.au>

Guido van Rossum writes:

> So what's the use case?

I agree that I don't see the use case that isn't already adequately
covered. You generally get what you explicitly ask for, and that's how
it should be.

A point of correction, though:

> In general when using a command line environment overwriting the file
> is what you *want* to happen. Like with Unix "foo >bar". I don't even
> think there *is* a shell syntax for not overwriting an existing file,

The bash(1) man page describes the "noclobber" option:

    If the redirection operator is >, and the noclobber option to the
    set builtin has been enabled, the redirection will fail if the file
    whose name results from the expansion of word exists and is a
    regular file.

--
\ "If you go flying back through time and you see somebody else |
`\ flying forward into the future, it's probably best to avoid eye |
_o__) contact."
--Jack Handey |
Ben Finney

From guido at python.org Wed Aug 17 03:11:38 2011
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Aug 2011 18:11:38 -0700
Subject: [Python-ideas] Add create mode to open()
In-Reply-To: References: Message-ID:

On Tue, Aug 16, 2011 at 5:08 PM, Mike Meyer wrote:
> On Tue, Aug 16, 2011 at 4:46 PM, Guido van Rossum wrote:
>>
>> On Tue, Aug 16, 2011 at 4:18 PM, Bruce Leban wrote:
>> > On Tue, Aug 16, 2011 at 3:43 PM, Guido van Rossum >> > wrote:
>> >>
>> >> Agreed. Also I think that in most cases the right thing to do is to
>> >> quietly overwrite the file. If you're implementing a UI where you want
>> >> look-before-you-leap, the app should code an explicit test so it can
>> >> issue a proper error message (the exception will not be fun for the
>> >> user :-).
>> >>
>> > This isn't look before you leap. It's catching a failure when the file
>> > can't
>> > be created, just as you would if you don't have permission to write a
>> > file
>> > in that directory or the file already exists and is locked so you don't
>> > have
>> > permission to overwrite it. It just adds one more reason the file can't
>> > be
>> > written.
>>
>> So what's the use case? In general when using a command line
>> environment overwriting the file is what you *want* to happen. Like
>> with Unix "foo >bar". I don't even think there *is* a shell syntax for
>> not overwriting an existing file, though you can use >> to append
>> instead of overwrite -- this is open(filename, 'a').
>
> The use case is for dealing with precious files. Normally, you deal with
> this stuff with locks to make sure things are safe. The shell method of
> doing this is to set the noclobber option (bash, csh, probably others).
> Given that the use case is precious files, it should be 100% safe on every
> major platform, and should throw an exception on any platform that doesn't
> have a 100% safe implementation.

Hm. Isn't the usual solution creating a temporary file in the same
directory, and then atomically moving it with os.rename()? At least
that's the Unix way that I am used to. The only other platform is
Windows and I'm sure there's a similar solution using native syscalls,
though I don't know what it is (and don't bring up the "Posix"
extension :-).

> I'm about +.5. I would expect this to be more portable than the application
> doing locking, but you have to provide something like that for those
> platforms anyway.

It seems that open(fn, 'c') is only part of the solution -- you'd still
have to do a move dance.

--
--Guido van Rossum (python.org/~guido)

From guido at python.org Wed Aug 17 03:13:22 2011
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Aug 2011 18:13:22 -0700
Subject: [Python-ideas] Add create mode to open()
In-Reply-To: References: Message-ID:

On Tue, Aug 16, 2011 at 5:12 PM, Bruce Leban wrote:
>
> On Tue, Aug 16, 2011 at 4:46 PM, Guido van Rossum wrote:
>>
>> So what's the use case? In general when using a command line
>> environment overwriting the file is what you *want* to happen. Like
>> with Unix "foo >bar". I don't even think there *is* a shell syntax for
>> not overwriting an existing file, though you can use >> to append
>> instead of overwrite -- this is open(filename, 'a').
>
> We weren't just discussing command line tools.

Well if anything, Python is *lower* level than command line tools, not
higher.

> That said, I'm sure I'm not
> the only person on this list who has inadvertently overwritten a file using
> foo > bar.
While we may have grown accustomed to this behavior and some
> people (like you) may even consider it desirable, not everyone does.

But are they right?

>> > I've had enough working with programs that do things like silently eat
>> > exceptions and I consider silently overwriting a file in the same class.
>>
>> Always? How would you update an existing file if you can't overwrite
>> files?
>
> I didn't say never overwrite. What I don't like is programs overwriting
> files without explicitly intending to do that.

Ah, but when you write open(fn, 'w') you *do* explicitly intend to
overwrite. That's what it does.

> Yes, there's a long legacy of overwriting files without warning or intent. I
> suppose I'm fighting an uphill battle (and it's not my highest priority
> complaint about bad code for that matter).

Right.

--
--Guido van Rossum (python.org/~guido)

From guido at python.org Wed Aug 17 03:14:28 2011
From: guido at python.org (Guido van Rossum)
Date: Tue, 16 Aug 2011 18:14:28 -0700
Subject: [Python-ideas] Add create mode to open()
In-Reply-To: <8762lwdfb7.fsf@benfinney.id.au>
References: <8762lwdfb7.fsf@benfinney.id.au>
Message-ID:

On Tue, Aug 16, 2011 at 5:58 PM, Ben Finney wrote:
> Guido van Rossum writes:
>
>> So what's the use case?
>
> I agree that I don't see the use case that isn't already adequately
> covered. You generally get what you explicitly ask for, and that's how
> it should be.
>
> A point of correction, though:
>
>> In general when using a command line environment overwriting the file
>> is what you *want* to happen. Like with Unix "foo >bar". I don't even
>> think there *is* a shell syntax for not overwriting an existing file,
>
> The bash(1) man page describes the "noclobber" option:
>
>     If the redirection operator is >, and the noclobber option to the
>     set builtin has been enabled, the redirection will fail if the file
>     whose name results from the expansion of word exists and is a
>     regular file.

Thanks, I'd forgotten that. It seems wrong that it is a global flag
though.

--
--Guido van Rossum (python.org/~guido)

From steve at pearwood.info Wed Aug 17 03:41:32 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 17 Aug 2011 11:41:32 +1000
Subject: [Python-ideas] Add create mode to open()
In-Reply-To: References: Message-ID: <4E4B1C4C.6060808@pearwood.info>

Guido van Rossum wrote:
> On Tue, Aug 16, 2011 at 5:12 PM, Bruce Leban wrote:
>> I didn't say never overwrite. What I don't like is programs overwriting
>> files without explicitly intending to do that.
>
> Ah, but when you write open(fn, 'w') you *do* explicitly intend to
> overwrite. That's what it does.

No. I rarely intend to over-write an existing file. Usually I intend to
create a new file. So I nearly always do this:

    if not os.path.exists(fn):
        open(fn, 'w')

in the full knowledge that there's a race condition there, and that if I
have multiple processes writing to files at the same time, as I often do,
I could very well lose data. And don't think for one second I'm even close
to happy about that, but at least it is *some* protection, even if not
very much.

>> Yes, there's a long legacy of overwriting files without warning or intent. I
>> suppose I'm fighting an uphill battle (and it's not my highest priority
>> complaint about bad code for that matter).
>
> Right.

I'm not sure that a "create" mode is the right solution, but I am sure
that, just like Bruce, I want a good, battle-hardened, standard, platform
independent (as much as possible) solution to the above race condition
bug. A race-free version of that check-then-create is sketched below.
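A race-free version of the pattern above, written EAFP-style; assuming fn
and data are already defined, and a POSIX-style os.open:

    import errno
    import os

    try:
        # O_EXCL closes the window between "check" and "create": the
        # kernel refuses atomically if fn already exists.
        fd = os.open(fn, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise
        # fn already existed; nothing was clobbered.
    else:
        with os.fdopen(fd, 'w') as f:
            f.write(data)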
Perhaps it should be a module, like the tempfile module. (There's a thread about adding something like this to shutil.) -- Steven From ironfroggy at gmail.com Wed Aug 17 04:10:22 2011 From: ironfroggy at gmail.com (Calvin Spealman) Date: Tue, 16 Aug 2011 22:10:22 -0400 Subject: [Python-ideas] Add create mode to open() In-Reply-To: References: Message-ID: On Tue, Aug 16, 2011 at 11:45 AM, David Townshend wrote: > This idea was proposed in the discussion on shutil.move, but I thought it > would be worth posting separately to avoid confusing the discussion. > The idea is to add a create mode ('c') to the builtin open() function, which > will have the same effect as os.open(file, os.O_EXCL|os.O_CREAT). ?I have > added an issue (http://bugs.python.org/issue12760) for this, including a > patch. While there are plenty of responders debating how unix-y or sensible this is from the standpoint of other languages or precedents, I think this makes sense from the standpoint of new users and a fresh viewpoints. Creating a new file is a very sensible thing to want in a single operation. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -- Read my blog! I depend on your acceptance of my opinion! I am interesting! http://techblog.ironfroggy.com/ Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy From ncoghlan at gmail.com Wed Aug 17 05:33:38 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 17 Aug 2011 13:33:38 +1000 Subject: [Python-ideas] Syntax for dedented strings In-Reply-To: <4E4AFF13.9070903@pearwood.info> References: <4E4AFF13.9070903@pearwood.info> Message-ID: On Wed, Aug 17, 2011 at 9:36 AM, Steven D'Aprano wrote: > Unless you regularly inspect instance.__doc__ by hand (not via help), I'm > not sure what you hope to accomplish here. One trick I did discover thanks to this thread: 'pydoc.pager = pydoc.plainpager' Eliminates the potentially annoying behaviour where help() takes over the entire terminal window and doesn't appear in the scrollback history. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Wed Aug 17 05:42:59 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 17 Aug 2011 13:42:59 +1000 Subject: [Python-ideas] Add create mode to open() In-Reply-To: References: Message-ID: On Wed, Aug 17, 2011 at 12:10 PM, Calvin Spealman wrote: > While there are plenty of responders debating how unix-y or sensible > this is from the standpoint of other languages or precedents, I think > this makes sense from the standpoint of new users and a fresh > viewpoints. Creating a new file is a very sensible thing to want in a > single operation. Yes, this suggestion came out of the shutil.move discussion, precisely *because* it is so hard to write a platform-independent, multi-process safe operation that will reliably either create a new file or else throw an exception if the file already exists, or if the underlying filesystem doesn't provide the necessary primitives to provide the appropriate guarantees. I don't think it needs to be added to the open builtin, but a shutil function specifically for this operation certainly sounds like a reasonable request. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia

From jeanpierreda at gmail.com Wed Aug 17 05:43:56 2011
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Tue, 16 Aug 2011 23:43:56 -0400
Subject: [Python-ideas] Add create mode to open()
In-Reply-To: References: Message-ID:

> Agreed. Also I think that in most cases the right thing to do is to
> quietly overwrite the file. If you're implementing a UI where you want
> look-before-you-leap, the app should code an explicit test so it can
> issue a proper error message (the exception will not be fun for the
> user :-).

The explicit test being "if not os.path.exists(pth)"? That has a race
condition.

LBYL is impossible here without a race condition, in fact, because the
situation can change between looking and leaping. An exception, or else a
return code, is the only way. These can be checked for after the fact.

I'd also point out that for those that don't want race conditions,
Python is discouraging. The correct incantation involves two
undocumented constants, plus a unique and rarely used way of opening
files that involves Unix file descriptors.

Devin

On Tue, Aug 16, 2011 at 6:43 PM, Guido van Rossum wrote:
> On Tue, Aug 16, 2011 at 2:26 PM, Benjamin Peterson wrote:
>> David Townshend writes:
>>
>>>
>>> This idea was proposed in the discussion on shutil.move, but I thought it
>> would be worth posting separately to avoid confusing the discussion.
>>>
>>> The idea is to add a create mode ('c') to the builtin open() function, which
>> will have the same effect as os.open(file, os.O_EXCL|os.O_CREAT). I have added
>> an issue (http://bugs.python.org/issue12760) for this, including a patch.
>>
>> I am -1 because
>>
>> - Possibly not portable or at least subject to implementations of
>> varying quality.
>> - No precedent in other languages or fopen() for that matter.
>> - It's not hard to use os.fdopen().
>
> Agreed. Also I think that in most cases the right thing to do is to
> quietly overwrite the file. If you're implementing a UI where you want
> look-before-you-leap, the app should code an explicit test so it can
> issue a proper error message (the exception will not be fun for the
> user :-).
>
> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From benjamin at python.org Wed Aug 17 05:49:04 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Wed, 17 Aug 2011 03:49:04 +0000 (UTC)
Subject: [Python-ideas] Add create mode to open()
References: Message-ID:

Devin Jeanpierre writes:
>
> LBYL is impossible here without a race condition, in fact, because the
> situation can change between looking and leaping. An exception, or
> else a return code, is the only way. These can be checked for after the
> fact.

How often is this used? In every application I've written, writing a file
usually results from the user giving a path, in which case it's intended to
replace whatever is already there.

>
> I'd also point out that for those that don't want race conditions,
> Python is discouraging. The correct incantation involves two
> undocumented constants, plus a unique and rarely used way of opening
> files that involves Unix file descriptors.

If you truly want to avoid all filesystem race conditions, you're going to be
dealing with file descriptors and low-level syscalls galore. Moving one aspect
to a higher level is not too helpful on the whole.
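For completeness, the temp-file-and-rename "move dance" Guido described
earlier in the thread, sketched under POSIX assumptions (rename() is atomic
within one filesystem there; Windows needs different handling). The name
atomic_write is illustrative:

    import os
    import tempfile

    def atomic_write(path, data):
        # Write a sibling temp file, then rename it over path; readers
        # see either the old contents or the new, never a torn write.
        d = os.path.dirname(path) or '.'
        fd, tmp = tempfile.mkstemp(dir=d)
        try:
            with os.fdopen(fd, 'w') as f:
                f.write(data)
            os.rename(tmp, path)
        except Exception:
            os.unlink(tmp)
            raise

Note this deliberately replaces path; combining it with the no-clobber
variants above is exactly the part the thread finds hard to do portably.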
From ncoghlan at gmail.com Wed Aug 17 06:11:32 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 17 Aug 2011 14:11:32 +1000
Subject: [Python-ideas] Add create mode to open()
In-Reply-To: References: Message-ID:

On Wed, Aug 17, 2011 at 1:43 PM, Devin Jeanpierre wrote:
> I'd also point out that for those that don't want race conditions,
> Python is discouraging. The correct incantation involves two
> undocumented constants, plus a unique and rarely used way of opening
> files that involves Unix file descriptors.

FWIW, when you control the filename, you can include an additional
subdirectory precisely so that you get an exception when a second process
attempts to create the same subdirectory. You can even play games along
those lines for file access control on arbitrary filenames via a shadow
hierarchy of directories.

For example, Skip's lockfile package
(http://packages.python.org/lockfile/lockfile.html) uses directories to
provide cooperative file locks on Windows.

That only helps the cooperative locking case, though - it does nothing
against hostile file substitutions.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From jeanpierreda at gmail.com Wed Aug 17 06:13:17 2011
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Wed, 17 Aug 2011 00:13:17 -0400
Subject: [Python-ideas] Add create mode to open()
In-Reply-To: References: Message-ID:

On Tue, Aug 16, 2011 at 11:49 PM, Benjamin Peterson wrote:
> Devin Jeanpierre writes:
>>
>> LBYL is impossible here without a race condition, in fact, because the
>> situation can change between looking and leaping. An exception, or
>> else a return code, is the only way. These can be checked for after the
>> fact.
>
> How often is this used? In every application I've written, writing a file
> usually results from the user giving a path, in which case it's intended
> to replace whatever is already there.

Unless it isn't. Most GUI apps ask you to confirm whether you want to
open a file even where one exists. Suppose you do a LBYL approach: you
check to see if no file is there, then somebody writes a file there,
then you overwrite it because you used 'w' mode because no file was
there. It's not a disaster, since this is kind of hard to do by
accident, but it is incorrect behavior if you wanted to actually ask
if overwriting was kosher.

On the other hand, if you ask _after_ trying to open the file, then
one of two things can happen: the user says "abort", or the user says
"overwrite it". In the former case, we start over. In the latter case,
the only remaining race condition is one that doesn't matter: the file
might disappear before you overwrite it!

>> I'd also point out that for those that don't want race conditions,
>> Python is discouraging. The correct incantation involves two
>> undocumented constants, plus a unique and rarely used way of opening
>> files that involves Unix file descriptors.
>
> If you truly want to avoid all filesystem race conditions, you're going
> to be dealing with file descriptors and low-level syscalls galore. Moving
> one aspect to a higher level is not too helpful on the whole.

Well, eh, not really. As far as I know this particular primitive is
probably the most important one.
It's certainly the only one I've ever wanted to use.

Devin

From aquavitae69 at gmail.com  Wed Aug 17 07:46:05 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Wed, 17 Aug 2011 07:46:05 +0200
Subject: [Python-ideas] Add create mode to open()
In-Reply-To:
References:
Message-ID:

It seems that the only problem raised which hasn't already been
responded to is that it won't necessarily work on all platforms. Could
anyone elaborate on which platforms it won't work on? From what I've
read it will work on all unix and windows systems, which I think covers
the majority. I can't see that it would be a problem if it doesn't work
on a few specialised systems, as long as the documentation is clear on
this (and it might turn out that it can be handled anyway, just through
os-specific code). It at least allows the feature to be available for
the 90% of use cases in which it will work.

David

On Wed, Aug 17, 2011 at 6:13 AM, Devin Jeanpierre wrote:
> [earlier discussion of LBYL race conditions snipped]

From dreamingforward at gmail.com  Wed Aug 17 10:10:56 2011
From: dreamingforward at gmail.com (Mark Janssen)
Date: Wed, 17 Aug 2011 02:10:56 -0600
Subject: [Python-ideas] A "compound" object for Python like in ABC
Message-ID:

Just throwing out an idea here. I want to consider the programming
value of a Compound data type, such as found in the ABC programming
language.
I think it could considerably simplify Python in several different
ways, although it would require some significant changes.

Crudely, a Compound is simply the ability to associate a label or key
with an object -- flatly, i.e. in a "planar" dimension rather than a
depth dimension like one might do with a variable name.

We can denote a Compound with a (key:value) syntax, using a colon to
signify the relationship.

Compounds may collide when they are put into a Set (if their keys are
the same). The default behavior can be like dict: overwrite the value
-- but users can subclass the Compound data type and specify what to
do. In the case of a Bag, it could __add__ the values; in the case of
a database, it could throw an exception, etc.

A Compound also has a __default__ value in the case of colliding with
a non-compound type. For a countable Compound this would likely be 1.

A dictionary becomes simply a set of Compounds; for example, {'a':1,
'b':2} is a *set* containing two Compound data elements. Nicely, this
allows the empty set to have the normal syntax of "{}". Set would have
to add the [] set/getitem syntax to deal with the common case of
possible compound relationships among its members. It would return
None if the item is not a Compound type, otherwise it returns the
compound's value. Additionally, set should NOT quietly ignore adding a
duplicate element; the elements should collide, and python's
pre-defined collision behavior decides what happens.

Creating a bag (or Counter) container now becomes very simple: make a
Compound that runs the add function for collisions. As another
example, a Node (in a graph) is now simply "Node1":{"edge1", "edge2"}
(a Compound), and a Graph is simply a set of such Compounds.

Some other possibilities: the Compound is a special data type that
exists in the border between an atom and a collection. The Compound's
constructor can exploit this. If a collection is passed as the first
parameter to the compound's constructor, all sorts of things can be
done. Compound(myset, 1) could return a dictionary with all values set
to 1. Given a list instead, it could return an enumeration.

This is all rather sloppy, but I wanted to put it out here to see what
kind of interest there might be....

Mark

From aquavitae69 at gmail.com  Wed Aug 17 10:45:45 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Wed, 17 Aug 2011 10:45:45 +0200
Subject: [Python-ideas] A "compound" object for Python like in ABC
In-Reply-To:
References:
Message-ID:

Maybe I'm missing something, but I can't see the advantage of this over
the current dict. Why would you want to create an individual key/value
pair, and why not just use a tuple?

On Wed, Aug 17, 2011 at 10:10 AM, Mark Janssen wrote:
> [proposal quoted in full; snipped]

From solipsis at pitrou.net  Wed Aug 17 11:30:25 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 17 Aug 2011 11:30:25 +0200
Subject: [Python-ideas] Add create mode to open()
References:
Message-ID: <20110817113025.317f52f3@pitrou.net>

On Wed, 17 Aug 2011 14:11:32 +1000
Nick Coghlan wrote:
> FWIW, when you control the filename, you can include an additional
> subdirectory precisely so that you get an exception when a second
> process attempts to create the same subdirectory.

How do you create a directory and a file atomically?

Regards

Antoine.

From solipsis at pitrou.net  Wed Aug 17 11:43:36 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 17 Aug 2011 11:43:36 +0200
Subject: [Python-ideas] Add create mode to open()
References:
Message-ID: <20110817114336.2e71a807@pitrou.net>

On Wed, 17 Aug 2011 13:42:59 +1000
Nick Coghlan wrote:
> On Wed, Aug 17, 2011 at 12:10 PM, Calvin Spealman wrote:
> > While there are plenty of responders debating how unix-y or sensible
> > this is from the standpoint of other languages or precedents, I
> > think this makes sense from the standpoint of new users and fresh
> > viewpoints. Creating a new file is a very sensible thing to want in
> > a single operation.
>
> Yes, this suggestion came out of the shutil.move discussion, precisely
> *because* it is so hard to write a platform-independent, multi-process
> safe operation that will reliably either create a new file or else
> throw an exception if the file already exists,

Platform-independent is not hard. O_EXCL is specified by POSIX:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/open.html

Works under Linux:

>>> import os
>>> os.open("LICENSE", os.O_CREAT | os.O_EXCL)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileExistsError: [Errno 17] File exists: 'LICENSE'

Works under Windows:

>>> import os
>>> os.open("LICENSE", os.O_CREAT | os.O_EXCL)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
FileExistsError: [Errno 17] File exists

(yes, I'm taking the opportunity to showcase PEP 3151)

Regards

Antoine.

From mal at egenix.com  Wed Aug 17 11:52:03 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 17 Aug 2011 11:52:03 +0200
Subject: [Python-ideas] Add create mode to open()
In-Reply-To: <20110817113025.317f52f3@pitrou.net>
References: <20110817113025.317f52f3@pitrou.net>
Message-ID: <4E4B8F43.70006@egenix.com>

Antoine Pitrou wrote:
> On Wed, 17 Aug 2011 14:11:32 +1000
> Nick Coghlan wrote:
>> FWIW, when you control the filename, you can include an additional
>> subdirectory precisely so that you get an exception when a second
>> process attempts to create the same subdirectory.
>
> How do you create a directory and a file atomically?

On Windows, directories are created atomically. On Unix, too, but
symlinks are faster. You can use those to implement cooperative file
locks in a fairly cross-platform way.

See e.g. mx.Misc.FileLock in
http://www.egenix.com/products/python/mxBase/

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 17 2011)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2011-10-04: PyCon DE 2011, Leipzig, Germany                48 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From solipsis at pitrou.net  Wed Aug 17 11:58:23 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 17 Aug 2011 11:58:23 +0200
Subject: [Python-ideas] Add create mode to open()
In-Reply-To: <4E4B8F43.70006@egenix.com>
References: <20110817113025.317f52f3@pitrou.net> <4E4B8F43.70006@egenix.com>
Message-ID: <1313575103.2933.1.camel@localhost.localdomain>

Le mercredi 17 août 2011 à 11:52 +0200, M.-A. Lemburg a écrit :
>> How do you create a directory and a file atomically?
>
> On Windows, directories are created atomically. On Unix, too, but
> symlinks are faster. You can use those to implement cooperative file
> locks in a fairly cross-platform way.

I was thinking of creating both the directory and the file in a single
atomic operation. But if the directory is only ever used for that file,
I guess it's ok.

(there's still a problem when deleting the directory and the file,
which can't be atomic, and the file has to be deleted before the
directory, meaning that if the process crashes in between, there are
"legitimate" situations where the directory exists but not the file in
it...)

Regards

Antoine.

From mal at egenix.com  Wed Aug 17 12:17:04 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 17 Aug 2011 12:17:04 +0200
Subject: [Python-ideas] Add create mode to open()
In-Reply-To: <1313575103.2933.1.camel@localhost.localdomain>
References: <20110817113025.317f52f3@pitrou.net> <4E4B8F43.70006@egenix.com> <1313575103.2933.1.camel@localhost.localdomain>
Message-ID: <4E4B9520.5080501@egenix.com>

Antoine Pitrou wrote:
> I was thinking of creating both the directory and the file in a single
> atomic operation. But if the directory is only ever used for that
> file, I guess it's ok.
>
> (there's still a problem when deleting the directory and the file,
> which can't be atomic, and the file has to be deleted before the
> directory, meaning that if the process crashes in between, there are
> "legitimate" situations where the directory exists but not the file in
> it...)

The directory is only used as a locking mechanism. You normally don't
need to create any files within that lock directory unless you want to
store extra lock information.

The lock directory can be created alongside the file you want to lock,
or in a separate directory (on the same file system).

This mechanism can also be used to create directory/file pairs - simply
lock the directory (using a separate lock directory), create the file,
do something, remove the file, remove the directory, remove the lock.

If your process fails, you can use the information from the lock
directory to implement timeouts and cleanup actions (which would then
also remove the files in the directory you locked).
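A minimal sketch of such a mkdir-based lock (illustrative only;
mx.Misc.FileLock is a production implementation of the idea, and the
LockDir name is invented for the example):

    import os
    import errno

    class LockDir(object):
        # Cooperative lock built on os.mkdir(), which is atomic on
        # both Unix and Windows: exactly one process can succeed in
        # creating the directory.
        def __init__(self, path):
            self.lockdir = path + '.lock'

        def acquire(self):
            try:
                os.mkdir(self.lockdir)
            except OSError as e:
                if e.errno == errno.EEXIST:
                    return False  # another process holds the lock
                raise
            return True

        def release(self):
            os.rmdir(self.lockdir)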
If you additionally add a lock info file to the lock directory, you can
make the checks even more sophisticated and check whether the owning
process still exists, whether the host owning the lock is still
available, etc.

--
Marc-Andre Lemburg
eGenix.com
http://www.egenix.com/company/contact/

From solipsis at pitrou.net  Wed Aug 17 12:19:53 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 17 Aug 2011 12:19:53 +0200
Subject: [Python-ideas] Add create mode to open()
References: <20110817113025.317f52f3@pitrou.net> <4E4B8F43.70006@egenix.com> <1313575103.2933.1.camel@localhost.localdomain> <4E4B9520.5080501@egenix.com>
Message-ID: <20110817121953.499553f1@pitrou.net>

On Wed, 17 Aug 2011 12:17:04 +0200
"M.-A. Lemburg" wrote:
> The directory is only used as a locking mechanism. You normally don't
> need to create any files within that lock directory unless you want to
> store extra lock information.

Ah, I thought Nick proposed to create the file in that directory. My
bad.

Regards

Antoine.

From steve at pearwood.info  Wed Aug 17 12:40:51 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 17 Aug 2011 20:40:51 +1000
Subject: [Python-ideas] Add create mode to open()
In-Reply-To:
References:
Message-ID: <4E4B9AB3.5010005@pearwood.info>

Benjamin Peterson wrote:
> Devin Jeanpierre writes:
>> LBYL is impossible here without a race condition, in fact, because
>> the situation can change between looking and leaping.
>> An exception, or else a return code, is the only way. These can be
>> checked for after the fact.
>
> How often is this used? In every application I've written, writing a
> file usually results from the user giving a path, in which case it's
> intended to replace whatever is already there.

I frequently write scripts that, e.g., rename or move files with little
or no human intervention. wget is a prime example of an application
which needs to avoid overwriting files without human intervention,
although I don't know how well it deals with race conditions.

--
Steven

From ncoghlan at gmail.com  Wed Aug 17 14:56:19 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 17 Aug 2011 22:56:19 +1000
Subject: [Python-ideas] Add create mode to open()
In-Reply-To: <20110817121953.499553f1@pitrou.net>
References: <20110817113025.317f52f3@pitrou.net> <4E4B8F43.70006@egenix.com> <1313575103.2933.1.camel@localhost.localdomain> <4E4B9520.5080501@egenix.com> <20110817121953.499553f1@pitrou.net>
Message-ID:

On Wed, Aug 17, 2011 at 8:19 PM, Antoine Pitrou wrote:
> Ah, I thought Nick proposed to create the file in that directory. My
> bad.

I did (although I noted the file could be outside the directory, too).
For any cooperative locking system based on persistent artifacts,
crashing processes that don't release their locks properly are a
definite problem (often dealt with by punting the problem to a human
via an appropriate error message).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From fuzzyman at gmail.com  Wed Aug 17 14:58:39 2011
From: fuzzyman at gmail.com (Michael Foord)
Date: Wed, 17 Aug 2011 13:58:39 +0100
Subject: [Python-ideas] Syntax for dedented strings
In-Reply-To:
References: <4E4AFF13.9070903@pearwood.info>
Message-ID:

On 17 August 2011 04:33, Nick Coghlan wrote:
> On Wed, Aug 17, 2011 at 9:36 AM, Steven D'Aprano wrote:
>> Unless you regularly inspect instance.__doc__ by hand (not via help),
>> I'm not sure what you hope to accomplish here.
>
> One trick I did discover thanks to this thread: 'pydoc.pager =
> pydoc.plainpager'
>
> Eliminates the potentially annoying behaviour where help() takes over
> the entire terminal window and doesn't appear in the scrollback
> history.

Thanks Nick, that does make help substantially less annoying. I'll add
it to my python startup.

All the best,

Michael

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html

From fuzzyman at gmail.com  Wed Aug 17 15:31:06 2011
From: fuzzyman at gmail.com (Michael Foord)
Date: Wed, 17 Aug 2011 14:31:06 +0100
Subject: [Python-ideas] Syntax for dedented strings
In-Reply-To:
References: <4E4AFF13.9070903@pearwood.info>
Message-ID:

On 17 August 2011 13:58, Michael Foord wrote:
>> One trick I did discover thanks to this thread: 'pydoc.pager =
>> pydoc.plainpager'
>
> Thanks Nick, that does make help substantially less annoying. I'll add
> it to my python startup.

Although `help(klass)` is still (often) way too verbose if I *just*
want to see the class docstring (help shows all methods and their
docstrings too). And as you can't modify a class docstring, there's no
way to fix indented text there.

All the best,

Michael

From ericsnowcurrently at gmail.com  Wed Aug 17 18:35:19 2011
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 17 Aug 2011 10:35:19 -0600
Subject: [Python-ideas] Syntax for dedented strings
In-Reply-To:
References: <4E4AFF13.9070903@pearwood.info>
Message-ID:

On Wed, Aug 17, 2011 at 7:31 AM, Michael Foord wrote:
> Although `help(klass)` is still (often) way too verbose if I *just*
> want to see the class docstring (help shows all methods and their
> docstrings too). And as you can't modify a class docstring, there's no
> way to fix indented text there.

See http://bugs.python.org/issue12773.

-eric
From tjreedy at udel.edu  Wed Aug 17 19:19:04 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 17 Aug 2011 13:19:04 -0400
Subject: [Python-ideas] Add create mode to open()
In-Reply-To:
References:
Message-ID:

On 8/16/2011 11:49 PM, Benjamin Peterson wrote:
> If you truly want to avoid all filesystem race conditions, you're
> going to be dealing with file descriptors and low-level syscalls
> galore. Moving one aspect to a higher level is not too helpful on the
> whole.

Perhaps we need a HOW-TO on working with files that discusses
special-case needs and solutions that use os, tempfile, etc.
alternatives to builtin open.

--
Terry Jan Reedy

From aquavitae69 at gmail.com  Thu Aug 18 07:54:32 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Thu, 18 Aug 2011 07:54:32 +0200
Subject: [Python-ideas] Add create mode to open()
In-Reply-To:
References:
Message-ID:

> Perhaps we need a HOW-TO on working with files that discusses
> special-case needs and solutions that use os, tempfile, etc.
> alternatives to builtin open.

Not a bad idea, but would it not get too complicated? Starting with
ways to use os.open is fine, but to do it properly we would need to go
on to copying and moving, and this would involve basically rewriting
the implementation of shutil into the HOW-TO. I think that this is a
good idea, but not enough on its own.

My suggestion, to do this properly, would be to first implement the
create mode in open(). Then update (or write new) copy functions (i.e.
copyfile, copy2, copytree) to use the create mode, thereby effectively
changing the Linux shell commands represented from "cp" to "cp -n".
Finally, implement a new move function which, if possible, uses
link/unlink or the immutable attribute (as discussed in the thread on
shutil.move), and falls back to copy/unlink using the new copytree. The
resulting functions would correspond to the "no-clobber" versions of
the equivalent shell commands.

The only problem is that O_EXCL may not be supported on all platforms.
Can anyone tell me which platforms these are? I would like to see if I
can find a way to achieve the same effect on those platforms, but so
far I haven't been able to find out what they are.

A HOW-TO would be useful to discuss other methods, such as tempfile,
which will be a lot easier to use with the no-clobber versions.

David

From solipsis at pitrou.net  Thu Aug 18 11:35:09 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 18 Aug 2011 11:35:09 +0200
Subject: [Python-ideas] Add create mode to open()
References:
Message-ID: <20110818113509.282efbfe@pitrou.net>

On Thu, 18 Aug 2011 07:54:32 +0200
David Townshend wrote:
>
> The only problem is that O_EXCL may not be supported on all platforms.
> Can anyone tell me which platforms these are?

That sounds unlikely. O_EXCL is a POSIX standard. It is also supported
under Windows by the _open/_wopen compatibility functions (which we use
for file I/O).
Probably there are very old systems which don't support it, and perhaps
new systems that don't implement it *correctly* (meaning not
atomically); for the former I'd say we just don't care (who's gonna run
Python 3 on a 1995 system?) and for the latter, well, if the OS
designers think it's fine, let's just expose it as it is.

As for NFS, there's an interesting comment from 2007 here:
http://lwn.net/Articles/251971/

"My NFS tester shows that it at least appears to work with Linux,
Solaris and FreeBSD:
http://www.dovecot.org/list/dovecot/2007-July/024102.html. Looking at
Linux 2.6 sources it doesn't look like it tries to implement a racy
O_EXCL check in client side (fs/nfs/nfs3proc.c nfs3_proc_create()), so
the test's results should be correct. I don't know if other OSes do
that. I guess it would be nice to have a better O_EXCL tester which
tries to catch race conditions."

Regards

Antoine.

From aquavitae69 at gmail.com  Thu Aug 18 13:21:40 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Thu, 18 Aug 2011 13:21:40 +0200
Subject: [Python-ideas] Add create mode to open()
In-Reply-To: <20110818113509.282efbfe@pitrou.net>
References: <20110818113509.282efbfe@pitrou.net>
Message-ID:

So if it is cross-platform, then what's the problem?

On Thu, Aug 18, 2011 at 11:35 AM, Antoine Pitrou wrote:
> [snip]

From dag.odenhall at gmail.com  Fri Aug 19 14:50:33 2011
From: dag.odenhall at gmail.com (Dag Odenhall)
Date: Fri, 19 Aug 2011 14:50:33 +0200
Subject: [Python-ideas] Syntax for dedented strings
In-Reply-To:
References:
Message-ID: <1313758233.2533.0.camel@ganesh>

On Tue, 2011-08-16 at 22:33 +0100, Michael Foord wrote:
> How about *another* string prefix for dedented strings:
>
> class Thing(object):
>     d"""
>     This text will be,
>     nicely dedented,
>     thank you very much.
>     """

+1, I do the textwrap.dedent dance a lot in unit tests.
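The dance in question, for readers who haven't done it: the literal is
indented to match the surrounding source, and textwrap.dedent() strips
the common leading whitespace back out at runtime. A rough sketch
(make_report stands in for whatever is under test):

    import textwrap

    def test_report(self):
        # Without dedent(), the literal would have to break the
        # surrounding indentation to compare equal.
        expected = textwrap.dedent(u"""\
            header
              item 1
              item 2
            """)
        self.assertEqual(make_report(), expected)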
From dag.odenhall at gmail.com  Fri Aug 19 14:55:09 2011
From: dag.odenhall at gmail.com (Dag Odenhall)
Date: Fri, 19 Aug 2011 14:55:09 +0200
Subject: [Python-ideas] Syntax for dedented strings
In-Reply-To:
References: <87mxf9c92r.fsf@benfinney.id.au>
Message-ID: <1313758509.2533.4.camel@ganesh>

On Tue, 2011-08-16 at 23:43 +0100, Michael Foord wrote:
> The place I'm concerned about is the interactive interpreter,
> virtually the only place I look at docstrings that isn't directly in
> the source code (or pre-processed by a doc tool - but I don't care
> about that).

Why don't you use bpython? It shows the docstring automatically as
you're typing a class, without all the methods, and properly formatted.

Also, if you're working with docstrings yourself there's
inspect.getdoc, which handles whitespace/indentation.

(But I still consider your proposal useful.)

From aquavitae69 at gmail.com  Fri Aug 19 16:32:52 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Fri, 19 Aug 2011 16:32:52 +0200
Subject: [Python-ideas] Syntax for dedented strings
Message-ID:

+1 for strings in unit tests, but not so useful for docstrings. But the
proposal was for a new string type, not necessarily docstrings. Maybe a
string method would be better:

>>> """
...     This
...     is
...     indented.""".deindent()
'\nThis\nis\nindented.'

On Aug 19, 2011 2:55 PM, "Dag Odenhall" wrote:
> [snip]

From ghostwriter402 at gmail.com  Fri Aug 19 23:07:03 2011
From: ghostwriter402 at gmail.com (Spectral One)
Date: Fri, 19 Aug 2011 16:07:03 -0500
Subject: [Python-ideas] Enums
In-Reply-To:
References:
Message-ID: <4E4ED077.9040302@gmail.com>

A thought occurred after reading through this thread. Given that one of
the most common uses of enums is to define the range of valid options
to pass, and to demystify such magic numbers, it seems that the proper
place to have enums' variable names defined is the module in which they
are declared and used, but in cases where the meaning is clear, and
only there, pull them out. (That is, normally one needs to type
"EnumInstanceName.Code", e.g. "Colors.red", but when in a place where
the author specified a specific enum, the code alone is all that's
required, e.g. "red".)

After all, unless someone wants to allow MULTIPLE enums as an argument,
the normal collision rules should prevent most of the problems. This
should work for either the single-value or multiple-value case, since
it merely imports the namespace, and the parser otherwise works as
normal. (Such importation would need to override, and pull from the
enum, wherever it's called from.)

I suppose this could be a general soft-typing syntax, and the Enum
object type could simply be a valid instance, but I don't think I like
that notion.
I'd prefer the syntax to be clearly a set enumeration. Of course, I'd
also like some fast and easy way of declaring whether the enum in
question should allow aggregate/composite types, or singular values
from the list, with an exception raised on failure to comply.

Something like the following:

#exa.py:
enum(single) pc_make:
    # Declare pc_make as an enum type with the following contents.
    apple
    ibm=2.5
    atari="Just to demonstrate non-numeric"

# I picture defaulting to unique, non-overlapping integer values,
# unless otherwise defined. Valid types would best be hashable.
# I suppose "pc_make=enum(single)" is more consistent, if we think of
# enum as a class. Also, enum(1) might be better, for defining values
# accepted. I digress.

def Box:
    def SetMake( new_make:enum(pcmake)=ibm ):
        # sets the id to "new_make", the enum to pcmake, and the
        # default setting to "ibm"
        pass

#2nd_module.py:
from exa import Box
b=Box()
b.SetMake(apple)        # Works fine
b.SetMake(ibm + apple)  # Raises an enum_error, since the enum was
                        # declared singular.

(In hindsight, the argument definition is a better place to define the
singular/multiple-values option, though having the ability to define it
once for all would make use easier.)

Hopefully the idea is clear. The repr(enum_code)='enum_code' notion
would certainly be compatible.

-Nate

> That's assuming the data export format doesn't support enums.
>
> I would find enums whose str() is just an int pretty annoying. Note
> that str(True) == repr(True) == 'True' and that's how I'd like it to
> be for enums in general. If you want the int value, use int(e). If you
> want the class name, use e.__class__.__name__ (maybe e.__class__ is
> what you're after anyway). My observation is that most enums have
> sufficiently unique names that during a debugging session the class is
> not a big mystery; it's the mapping from value to name that's hard to
> memorize. As a compromise, I could live with str(e) == 'red' and
> repr(e) == 'Color.red'. But I don't think I could stomach str(e) == 1.
>
> --
> --Guido van Rossum (python.org/~guido)

From julian at grayvines.com  Sat Aug 20 01:23:45 2011
From: julian at grayvines.com (Julian Berman)
Date: Fri, 19 Aug 2011 19:23:45 -0400
Subject: [Python-ideas] Columns for pprint.PrettyPrinter
Message-ID: <987401CE-FF96-4E83-A5B3-19190AFCFE3B@GrayVines.com>

For the most part I find pprint to be lovely. Simple, and it does a
useful thing that I don't want to have to write every time, especially
if I'm using it just to debug.

However, I often find that I'm using it to print out a long singly
nested structure whose elements' reprs are not really that long, for
which its output is cumbersome. Try pprint.pprint(range(100)): you get
a long, really skinny column. Instead, it'd be nice if it could quickly
provide a way (combined with the width arg that it already takes) to
split up the elements of the structure within the width, so that
instead of seeing things like

>>> pprint.pprint(range(30))
[0,
 1,
 2,
 ...
]

it could be coerced into something like

>>> pprint.pprint(range(30), columns=5)
[0, 1, 2, 3, 4,
 5, 6, 7, 8, 9,
 ...
]

or for something nested, which I'm less thrilled with, and haven't
thought out how to implement unless you have a somewhat balanced
structure, but for posterity:

{"foo" :
    {"bar" : 1,    {"hello" : 2,    {"other" : 1,
     "baz" : 2,     "world" : 1},    "thing" : 2,
                                     "foo" : 3},
     "here" : 3},
 ...
}

Obviously it's meant to be simple - the comment at the top of the
module even says so - and doing something like the above is easy
enough, but for what it's good for (saving me from having to write code
that makes my objects easier to debug by displaying them nicely), just
making it do a bit more would make life easier.

From greg.ewing at canterbury.ac.nz  Sat Aug 20 02:31:44 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 20 Aug 2011 12:31:44 +1200
Subject: [Python-ideas] Enums
In-Reply-To: <4E4ED077.9040302@gmail.com>
References: <4E4ED077.9040302@gmail.com>
Message-ID: <4E4F0070.4000305@canterbury.ac.nz>

Spectral One wrote:

> def Box:
>     def SetMake( new_make:enum(pcmake)=ibm ):
>
> b.SetMake(apple)  # Works fine

How do you propose to implement this?

Currently, function arguments are evaluated without any knowledge of
the function to which they are going to be passed. To what extent would
you change that, and how?

--
Greg

From jxo6948 at rit.edu  Sat Aug 20 09:11:16 2011
From: jxo6948 at rit.edu (John O'Connor)
Date: Sat, 20 Aug 2011 00:11:16 -0700
Subject: [Python-ideas] Columns for pprint.PrettyPrinter
In-Reply-To: <987401CE-FF96-4E83-A5B3-19190AFCFE3B@GrayVines.com>
References: <987401CE-FF96-4E83-A5B3-19190AFCFE3B@GrayVines.com>
Message-ID:

This is slightly off topic, but I recently came across a need to
generate some quick reports which had console-based tabular output. I
wondered if there was a quick way to do table-like formatting with
text. string.format() is definitely awesome, but the sizing is fixed
width. If you use a column delimiter, say '|' for example, you have to
put it in the format string. Automatically sizing table columns based
on the cell value lengths would be helpful, and seems like an
interesting idea - but for something other than pprint.

> or for something nested, which I'm less thrilled with, and haven't
> thought out how to implement unless you have a somewhat balanced
> structure, but for posterity:
>
> [example snipped]

That is not a valid python dict. I think you may have been after
something more like...

'foo': {
    'bar': 1,
    'baz': 2,
    'biz': {
        'x': 1,
        'y': 2
    }
}

which reflects the nesting and is suitable for the interpreter.

From ghostwriter402 at gmail.com  Sat Aug 20 10:53:08 2011
From: ghostwriter402 at gmail.com (Spectral One)
Date: Sat, 20 Aug 2011 03:53:08 -0500
Subject: [Python-ideas] Enums
In-Reply-To:
References:
Message-ID: <4E4F75F4.2070303@gmail.com>

Greg Ewing wrote:
> Spectral One wrote:
>
>> def Box:
>>     def SetMake( new_make:enum(pcmake)=ibm ):
>
>> b.SetMake(apple)  # Works fine
>
> How do you propose to implement this?
>
> Currently, function arguments are evaluated without any knowledge of
> the function to which they are going to be passed. To what extent
> would you change that, and how?

Hmm, is setting a flag to switch twixt normal and another namespace
really all that tricky? It would probably help to know how Python
handles namespaces already, but I don't. In fact, I don't know Python
well enough to do much more than speculate on how it's implemented. I
was just figuring that the syntax and API are the harder parts to do
well, anyhow, so I suggested in that mode.
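For comparison, the closest one can get today without parser changes is
to validate arguments against an allowed set when the function is
called - a rough, purely illustrative sketch (the accepts decorator and
PC_MAKES set are invented for the example):

    def accepts(**allowed):
        # Decorator: check named arguments against allowed value sets.
        def decorate(func):
            names = func.__code__.co_varnames
            def wrapper(*args, **kwargs):
                # Map positional arguments onto parameter names, then
                # fold in the keyword arguments.
                bound = dict(zip(names, args))
                bound.update(kwargs)
                for name, values in allowed.items():
                    if name in bound and bound[name] not in values:
                        raise ValueError('%s must be one of %r'
                                         % (name, sorted(values)))
                return func(*args, **kwargs)
            return wrapper
        return decorate

    PC_MAKES = frozenset(['apple', 'ibm', 'atari'])

    class Box(object):
        @accepts(new_make=PC_MAKES)
        def SetMake(self, new_make='ibm'):
            self.make = new_make

    Box().SetMake('apple')   # fine
    Box().SetMake('amiga')   # raises ValueError

This checks values at call time; it does not - and cannot - change how
the argument expressions themselves are evaluated, which is the part of
the proposal that would need the parser's help.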
That NO evaluation is done before entry is a bit of a surprise.
Obviously, the number of arguments is stored and checked, since
TypeError is raised on argument-number mismatches.

There are obviously various ways to accomplish this, but the idea was,
I confess, unconstrained by implementation details I don't know. Some
possibilities that pop to mind:

Such things could be held in a list, much like object properties or
classes, which is available to be interrogated by the parser. When
checking the argument list, it could also check those lists for
associated enums.

If functions are complete objects, you could embed the information in
the object for each specified argument in some standard way -
'function.__enums__()', maybe.

It seems that the function, on call, would need to know whether to
check these things or not, but that's something that could be handled
behind the scenes with a test block that's simply conditionally skipped
or added as needed. That would still require that the values be
processed differently while being interpreted, unless some global rule
was instituted to get around it. (Late-parsing arguments might work, or
making sure that the names of variables are passed as strings to the
function, so the function can ignore incorrectly-parsed identifiers. Of
course THAT would require a ton more processing, or would preclude
multiple values, so I really don't like it.)

Actually, if our hearts aren't set on passing the enum names as
identifiers, we could pass them as strings that are parsed on the other
side. (Rules for combining them would be part of the standard enum
object definition.) This wouldn't look as neat to me, but it would move
the namespace issue completely inside the recipient function. A
function could fake it, or actually use the enum functions, based on
string parsing.

def Box:
    def SetMake( new_make=enum(pcmake)["ibm"] )
    ...

b.SetMake("apple")

Beyond that, it's mostly just a matter of determining proper behavior
for an enum object type and the events it should raise, etc. (An
extended dictionary class which stashes {key:(key,value)} could work as
a basis.)

-Nate

From k.bx at ya.ru  Thu Aug 25 11:28:14 2011
From: k.bx at ya.ru (k_bx)
Date: Thu, 25 Aug 2011 12:28:14 +0300
Subject: [Python-ideas] Create a StringBuilder class and use it everywhere
Message-ID: <379591314264494@web30.yandex.ru>

Hi!

There's a certain problem right now in python that when people need to
build a string from pieces, they really often do something like this::

    def main_pure():
        b = u"initial value"
        for i in xrange(30000):
            b += u"more data"
        return b

The bad thing about it is that a new string is created every time you
do +=, so it performs badly on CPython (and horribly on PyPy). If
people would use, for example, a list of strings, it would be much
better (performance)::

    def main_list_append():
        b = [u"initial value"]
        for i in xrange(3000000):
            b.append(u"more data")
        return u"".join(b)

The results are::

    kost at kost-laptop:~/tmp$ time python string_bucket_pure.py

    real 0m7.194s
    user 0m3.590s
    sys 0m3.580s

    kost at kost-laptop:~/tmp$ time python string_bucket_append.py

    real 0m0.417s
    user 0m0.330s
    sys 0m0.080s

Fantastic, isn't it?

Also, now let's forget about speed and think about semantics a little:
your task is "build a string from its pieces", or in other words "build
a string from a list of pieces", so from this point of view you can say
that using [] and u"".join is better semantically.
Java has had its StringBuilder class for a long time (I'm not really
into Java, I've just been told about that), and what I think is that
python should have its own StringBuilder::

    class StringBuilder(object):
        """Use it instead of doing += for building unicode strings
        from pieces"""
        def __init__(self, val=u""):
            self.val = val
            self.appended = []

        def __iadd__(self, other):
            self.appended.append(other)
            return self

        def __unicode__(self):
            self.val = u"".join((self.val, u"".join(self.appended)))
            self.appended = []
            return self.val

Why a StringBuilder class, and not just [] + u''.join? Well, I have two
reasons for that:

1. It has caching.
2. You can document it: when a programmer looks at the [] + u"" method
   he doesn't see _WHY_ it is done so, while when he sees the
   StringBuilder class he can go ahead and read its help().

Performance of StringBuilder is ok compared to [] + u"" (I've increased
the number of += operations from 30000 to 30000000)::

    def main_bucket():
        b = StringBuilder(u"initial value ")
        for i in xrange(30000000):
            b += u"more data"
        return unicode(b)

For CPython::

    kost at kost-laptop:~/tmp$ time python string_bucket_bucket.py

    real 0m12.944s
    user 0m11.670s
    sys 0m1.260s

    kost at kost-laptop:~/tmp$ time python string_bucket_append.py

    real 0m3.540s
    user 0m2.830s
    sys 0m0.690s

For PyPy 1.6::

    (pypy)kost at kost-laptop:~/tmp$ time python string_bucket_bucket.py

    real 0m18.593s
    user 0m12.930s
    sys 0m5.600s

    (pypy)kost at kost-laptop:~/tmp$ time python string_bucket_append.py

    real 0m16.214s
    user 0m11.750s
    sys 0m4.280s

Of course, a C implementation could be done to make things faster for
CPython, I guess, but really, in comparison to the += method it doesn't
matter now. It's done to be explicit.

p.s.: also, why not use cStringIO?
1. It's not semantically right to create a file-like string just to
   join multiple string pieces into one.
2. If you talk about using it in your code right away -- you can see
   that no one uses it yet, because people want += (while with
   StringBuilder you give them +=).
3. It's somehow slow on PyPy right now :-)

Thanks.

From mal at egenix.com  Thu Aug 25 11:45:55 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 25 Aug 2011 11:45:55 +0200
Subject: [Python-ideas] Create a StringBuilder class and use it everywhere
In-Reply-To: <379591314264494@web30.yandex.ru>
References: <379591314264494@web30.yandex.ru>
Message-ID: <4E5619D3.5050809@egenix.com>

k_bx wrote:
> Hi!
>
> There's a certain problem right now in python that when people need to
> build a string from pieces, they really often do something like
> this::
>
> [examples and timings snipped]
>
> Also, now let's forget about speed and think about semantics a little:
> your task is "build a string from its pieces", or in other words
> "build a string from a list of pieces", so from this point of view you
> can say that using [] and u"".join is better semantically.
> Java has had its StringBuilder class for a long time (I'm not really
> into Java, I've just been told about that), and what I think is that
> python should have its own StringBuilder::
>
> [class definition and benchmarks snipped]
>
> p.s.: also, why not use cStringIO?
> 1. It's not semantically right to create a file-like string just to
>    join multiple string pieces into one.
> 2. If you talk about using it in your code right away -- you can see
>    that no one uses it yet, because people want += (while with
>    StringBuilder you give them +=).
> 3. It's somehow slow on PyPy right now :-)

I think you should use cStringIO in your class implementation.

The list + join idiom is nice, but it has the disadvantage of creating
and keeping alive many small string objects (with all the memory
overhead and fragmentation that goes along with it).

AFAIR, the most efficient approach is using arrays:

>>> import array
>>> t = array.array('u')
>>> t.extend(u'äöü')
>>> t
array('u', u'\xe4\xf6\xfc')
>>> t.tounicode()
u'\xe4\xf6\xfc'

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 25 2011)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2011-10-04: PyCon DE 2011, Leipzig, Germany                40 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From cmjohnson.mailinglist at gmail.com  Thu Aug 25 11:53:49 2011
From: cmjohnson.mailinglist at gmail.com (Carl Matthew Johnson)
Date: Wed, 24 Aug 2011 23:53:49 -1000
Subject: [Python-ideas] Create a StringBuilder class and use it everywhere
In-Reply-To: <379591314264494@web30.yandex.ru>
References: <379591314264494@web30.yandex.ru>
Message-ID:

Interesting semantics... What version of Python were you using? The
current documentation has this to say:

"CPython implementation detail: If s and t are both strings, some
Python implementations such as CPython can usually perform an in-place
optimization for assignments of the form s = s + t or s += t. When
applicable, this optimization makes quadratic run-time much less
likely. This optimization is both version and implementation dependent.
For performance sensitive code, it is preferable to use the str.join()
method which assures consistent linear concatenation performance across
versions and implementations.

Changed in version 2.4: Formerly, string concatenation never occurred
in-place."

It's my understanding that the naive approach should now have
performance comparable to the "proper" list-append technique, as long
as you use CPython > 2.4.

-- Carl Johnson

From k.bx at ya.ru  Thu Aug 25 11:56:42 2011
From: k.bx at ya.ru (k_bx)
Date: Thu, 25 Aug 2011 12:56:42 +0300
Subject: [Python-ideas] Create a StringBuilder class and use it everywhere
In-Reply-To:
References: <379591314264494@web30.yandex.ru>
Message-ID: <262271314266202@web10.yandex.ru>

25.08.2011, 12:53, "Carl Matthew Johnson":
> It's my understanding that the naive approach should now have
> performance comparable to the "proper" list-append technique, as long
> as you use CPython > 2.4.

I use the cpython 2.7 that comes with Ubuntu Natty, with latest
updates.

From k.bx at ya.ru  Thu Aug 25 11:57:22 2011
From: k.bx at ya.ru (k_bx)
Date: Thu, 25 Aug 2011 12:57:22 +0300
Subject: [Python-ideas] Create a StringBuilder class and use it everywhere
In-Reply-To: <4E5619D3.5050809@egenix.com>
References: <379591314264494@web30.yandex.ru> <4E5619D3.5050809@egenix.com>
Message-ID: <393121314266242@web28.yandex.ru>

25.08.2011, 12:45, "M.-A. Lemburg":
> k_bx wrote:
>> Hi!
>> There's a certain problem right now in python that when people need to build a string from pieces they really often do something like this:: [...]
>>
>> p.s.: also, why not use cStringIO?
>> 1. it's not semantically right to create a file-like string just to join multiple string pieces into one.
>> 2. if you talk about using it in your code right away -- you can see that no one uses it yet because people want += (while with StringBuilder you give them +=).
>> 3. it's somehow slow on pypy right now :-)
>
> I think you should use cStringIO in your class implementation.
> The list + join idiom is nice, but it has the disadvantage of
> creating and keeping alive many small string objects (with all
> the memory overhead and fragmentation that goes along with it).
>
> AFAIR, the most efficient approach is using arrays:
>
>>>> import array
>>>> t = array.array('u')
>>>> t.extend(u'äöü')
>>>> t
> array('u', u'\xe4\xf6\xfc')
>>>> t.tounicode()
> u'\xe4\xf6\xfc'
>
> -- Marc-Andre Lemburg eGenix.com [...]

I'm perfectly ok with a different implementation of StringBuilder, but the main idea and proposal here is to get it into the standard library somehow and force (and promote) its use everywhere, maybe write some FAQ. So that when you see some new += code all you need to do is go and fix it without worrying about complaints :-D

From masklinn at masklinn.net Thu Aug 25 12:01:47 2011 From: masklinn at masklinn.net (Masklinn) Date: Thu, 25 Aug 2011 12:01:47 +0200 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: References: <379591314264494@web30.yandex.ru> Message-ID: <70829A2C-3914-44DC-BEF6-15845019F21E@masklinn.net>

On 2011-08-25, at 11:53 , Carl Matthew Johnson wrote:
> Interesting semantics?
> What version of Python were you using? The current documentation has this to say: [...]
> It's my understanding that the naïve approach should now have performance comparable to the "proper" list append technique as long as you use CPython >2.4.

Hah. Alex Gaynor complained about this very CPython trick recently: http://twitter.com/#!/alex_gaynor/status/104326041920749569
From mal at egenix.com Thu Aug 25 12:19:55 2011 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 25 Aug 2011 12:19:55 +0200 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <393121314266242@web28.yandex.ru> References: <379591314264494@web30.yandex.ru> <4E5619D3.5050809@egenix.com> <393121314266242@web28.yandex.ru> Message-ID: <4E5621CB.8050909@egenix.com>

k_bx wrote:
> I'm perfectly ok with a different implementation of StringBuilder, but the main idea and proposal here is to get it into the standard library somehow and force (and promote) its use everywhere, maybe write some FAQ. So that when you see some new += code all you need to do is go and fix it without worrying about complaints :-D

I guess adding something like this to string.py would be worthwhile exploring. It's a very common use case and the list-idiom doesn't read well in practice.

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 25 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ 2011-10-04: PyCon DE 2011, Leipzig, Germany 40 days to go ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/

From larry at hastings.org Thu Aug 25 12:34:14 2011 From: larry at hastings.org (Larry Hastings) Date: Thu, 25 Aug 2011 03:34:14 -0700 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <4E5621CB.8050909@egenix.com> References: <379591314264494@web30.yandex.ru> <4E5619D3.5050809@egenix.com> <393121314266242@web28.yandex.ru> <4E5621CB.8050909@egenix.com> Message-ID: <4E562526.1080203@hastings.org>

On 08/25/2011 03:19 AM, M.-A. Lemburg wrote:
> I guess adding something like this to string.py would be worthwhile exploring. It's a very common use case and the list-idiom doesn't read well in practice.

I think the right place to do this is inside Python itself. I proposed something to do that several years ago, been meaning to revive it. http://bugs.python.org/issue1569040

/larry/

From k.bx at ya.ru Thu Aug 25 12:38:44 2011 From: k.bx at ya.ru (k_bx) Date: Thu, 25 Aug 2011 13:38:44 +0300 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <379591314264494@web30.yandex.ru> References: <379591314264494@web30.yandex.ru> Message-ID: <279771314268724@web84.yandex.ru>

25.08.2011, 12:28, "k_bx":
> Hi!
> There's a certain problem right now in python that when people need to build a string from pieces they really often do something like this:: [...]
> Thanks.

Oh, and also, I really like how Python had its MutableString class since forever, but deprecated in Python 3.
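For context, the class being referred to is Python 2's UserString.MutableString (deprecated in 2.6 and removed in 3.0). A two-line sketch of the API it offered; note, as Georg points out just below, that its += is still plain string concatenation underneath::

    from UserString import MutableString  # Python 2 only; gone in 3.x

    s = MutableString(u"initial value ")
    s += u"more data"   # __iadd__ concatenates the underlying string each time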
From g.brandl at gmx.net Thu Aug 25 12:50:37 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 25 Aug 2011 12:50:37 +0200 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <279771314268724@web84.yandex.ru> References: <379591314264494@web30.yandex.ru> <279771314268724@web84.yandex.ru> Message-ID:

On 25.08.2011 12:38, k_bx wrote:
> Oh, and also, I really like how Python had its MutableString class since forever, but deprecated in Python 3.

You do realize that MutableString's __iadd__ just performs += on str operands?

Georg

From k.bx at ya.ru Thu Aug 25 12:55:17 2011 From: k.bx at ya.ru (k_bx) Date: Thu, 25 Aug 2011 13:55:17 +0300 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: References: <379591314264494@web30.yandex.ru> <279771314268724@web84.yandex.ru> Message-ID: <295751314269717@web105.yandex.ru>

25.08.2011, 13:50, "Georg Brandl":
> You do realize that MutableString's __iadd__ just performs += on str operands?

Oh, I'm sorry, I thought it used cStringIO internally. Let's forget about MutableString then.

From solipsis at pitrou.net Thu Aug 25 13:36:43 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 25 Aug 2011 13:36:43 +0200 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere References: <379591314264494@web30.yandex.ru> Message-ID: <20110825133643.38a9e1a3@pitrou.net>

On Thu, 25 Aug 2011 12:28:14 +0300 k_bx wrote:
> The results are::
>
>     kost at kost-laptop:~/tmp$ time python string_bucket_pure.py
>     real 0m7.194s
>     user 0m3.590s
>     sys 0m3.580s
>
>     kost at kost-laptop:~/tmp$ time python string_bucket_append.py
>     real 0m0.417s
>     user 0m0.330s
>     sys 0m0.080s
>
> Fantastic, isn't it?
>
> Also, now let's forget about speed and think about semantics a little: your task is "build a string from its pieces", or in other words "build a string from a list of pieces", so from this point of view you can say that using [] and u"".join is better semantically.
>
> Java has its StringBuilder class for a long time

And Python has io.StringIO. I don't think we need to reinvent the wheel under another name. http://docs.python.org/library/io.html#io.StringIO

By the way, when prototyping snippets for the purpose of demonstrating new features, you should really use Python 3, because Python 2 is in bugfix-only mode. (same applies to benchmark results, actually)

Regards Antoine.
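In that spirit, the accumulation loop from the original post rewritten against io.StringIO might look like this (a sketch, in Python 3 as Antoine recommends)::

    import io

    def main_stringio():
        buf = io.StringIO()
        buf.write("initial value")
        for i in range(30000):
            buf.write("more data")   # appends into a single internal buffer
        return buf.getvalue()        # the finished string, produced once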
From ncoghlan at gmail.com Thu Aug 25 13:47:05 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 25 Aug 2011 21:47:05 +1000 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <379591314264494@web30.yandex.ru> References: <379591314264494@web30.yandex.ru> Message-ID:

If the join idiom really bothers you...

    import io

    def build_str(iterable):
        # Essentially ''.join, just with str() coercion
        # and less memory fragmentation
        target = io.StringIO()
        for item in iterable:
            target.write(str(item))
        return target.getvalue()

    # Caution: decorator abuse ahead
    # I'd prefer this to a StringBuilder class, though :)
    def gen_str(g):
        return build_str(g())

    >>> @gen_str
    ... def example():
    ...     yield 0
    ...     for i in range(1, 10):
    ...         yield ','
    ...         yield i
    ...
    >>> print(example)
    0,1,2,3,4,5,6,7,8,9

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From mikegraham at gmail.com Thu Aug 25 14:31:58 2011 From: mikegraham at gmail.com (Mike Graham) Date: Thu, 25 Aug 2011 08:31:58 -0400 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <379591314264494@web30.yandex.ru> References: <379591314264494@web30.yandex.ru> Message-ID:

On Thu, Aug 25, 2011 at 5:28 AM, k_bx wrote:
>     def main_bucket():
>         b = StringBuilder(u"initial value ")
>         for i in xrange(30000000):
>             b += u"more data"
>         return unicode(b)

This doesn't seem nicer to read and write to me than the list form. I also do not see any reason to believe it will stop people from doing it the quadratic way if the ubiquitous make-a-list-then-join idiom does not.

Mike

From steve at pearwood.info Thu Aug 25 15:57:14 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 25 Aug 2011 23:57:14 +1000 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: References: <379591314264494@web30.yandex.ru> Message-ID: <4E5654BA.7040806@pearwood.info>

Carl Matthew Johnson wrote:
> Interesting semantics?
> What version of Python were you using? The current documentation has this to say: [...]
> It's my understanding that the naïve approach should now have performance comparable to the "proper" list append technique as long as you use CPython >2.4.

Relying on that is a bad idea. It is not portable from CPython to any other Python (none of IronPython, Jython or PyPy can include that optimization), it also depends on details of the memory manager used by your operating system (what is fast on one computer can be slow on another), and it doesn't even work under all circumstances (it relies on the string having exactly one reference as well as the exact form of the concatenation).

Here's a real-world example of how the idiom of repeated string concatenation goes bad: http://www.mail-archive.com/pypy-dev at python.org/msg00682.html

Here's another example, from a few years back, where part of the standard library using string concatenation was *extremely* slow under Windows.
Linux users saw no slowdown and it was very hard to diagnose the problem: http://www.mail-archive.com/python-dev at python.org/msg40692.html

-- Steven

From steve at pearwood.info Thu Aug 25 16:00:40 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 26 Aug 2011 00:00:40 +1000 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: References: <379591314264494@web30.yandex.ru> Message-ID: <4E565588.7060308@pearwood.info>

Mike Graham wrote:
> This doesn't seem nicer to read and write to me than the list form. I also do not see any reason to believe it will stop people from doing it the quadratic way if the ubiquitous make-a-list-then-join idiom does not.

Agreed. Just because the Java idiom is StringBuilder doesn't mean Python should ape it. Python already has a "build strings efficiently" idiom:

    ''.join(iterable_of_strings)

If people can't, or won't, learn this idiom, why would they learn to use StringBuilder instead?

-- Steven

From arnodel at gmail.com Thu Aug 25 16:02:41 2011 From: arnodel at gmail.com (Arnaud Delobelle) Date: Thu, 25 Aug 2011 15:02:41 +0100 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: References: <379591314264494@web30.yandex.ru> Message-ID:

On 25 August 2011 13:31, Mike Graham wrote:
> This doesn't seem nicer to read and write to me than the list form. I also do not see any reason to believe it will stop people from doing it the quadratic way if the ubiquitous make-a-list-then-join idiom does not.

+1

-- Arnaud

From stefan_ml at behnel.de Thu Aug 25 17:15:40 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 25 Aug 2011 17:15:40 +0200 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <4E565588.7060308@pearwood.info> References: <379591314264494@web30.yandex.ru> <4E565588.7060308@pearwood.info> Message-ID:

Steven D'Aprano, 25.08.2011 16:00:
> Python already has a "build strings efficiently" idiom: ''.join(iterable_of_strings)
> If people can't, or won't, learn this idiom, why would they learn to use StringBuilder instead?

Plus, StringBuilder is only a special case. Joining a string around other delimiters is straightforward once you've learned about ''.join(). Doing the same with StringBuilder is non-trivial (as the Java example nicely shows).

Stefan
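A quick sketch of that point: with str.join the delimiter is handled by the one obvious call, while a builder-style API pushes the separator bookkeeping into your loop::

    parts = ["alpha", "beta", "gamma"]

    print(", ".join(parts))      # delimiter handled by join itself

    # the equivalent with incremental appends: the separator logic now
    # lives in the loop, which is the bookkeeping Java's StringBuilder
    # forces on you
    pieces = []
    for i, part in enumerate(parts):
        if i:
            pieces.append(", ")
        pieces.append(part)
    print("".join(pieces))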
From tjreedy at udel.edu Thu Aug 25 17:24:40 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 25 Aug 2011 11:24:40 -0400 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <379591314264494@web30.yandex.ru> References: <379591314264494@web30.yandex.ru> Message-ID:

On 8/25/2011 5:28 AM, k_bx wrote:
>     class StringBuilder(object):
>         """Use it instead of doing += for building unicode strings from pieces"""
>         def __init__(self, val=u""):
>             self.val = val
>             self.appended = []
>
>         def __iadd__(self, other):
>             self.appended.append(other)
>             return self
>
>         def __unicode__(self):
>             self.val = u"".join((self.val, u"".join(self.appended)))
>             self.appended = []
>             return self.val

I do not see the need to keep the initial piece separate and do the double join. For Py3:

    class StringBuilder(object):
        """Use it instead of doing += for building unicode strings from pieces"""
        def __init__(self, val=""):
            self.pieces = [val]

        def __iadd__(self, item):
            self.pieces.append(item)
            return self

        def __str__(self):
            val = "".join(self.pieces)
            self.pieces = [val]
            return val

    s = StringBuilder('a')
    s += 'b'
    s += 'c'
    print(s)
    s += 'd'
    print(s)

which prints:

    abc
    abcd

I am personally happy enough with [].append, but I can see the attraction of += if doing many separate lines rather than .append within a loop.

-- Terry Jan Reedy

From k.bx at ya.ru Thu Aug 25 17:28:34 2011 From: k.bx at ya.ru (k_bx) Date: Thu, 25 Aug 2011 18:28:34 +0300 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere Message-ID: <549901314286114@web119.yandex.ru>

> This doesn't seem nicer to read and write to me than the list form. I also do not see any reason to believe it will stop people from doing it the quadratic way if the ubiquitous make-a-list-then-join idiom does not.

The whole point is that people don't write the u''.join idiom simply because they don't know that += is slow. And when they see StringBuilder, they can ask themselves "why is he using that".

I don't mind using u''.join, but it just doesn't make people think about speed at all. The most popular (as far as I can see) situation right now where people start seeing that += is slow is when they try it on PyPy (which doesn't have the CPython hack, so it is still slow there) and ask "why is my pypy code sooooo slow". With StringBuilder used widely that would not be the case.

From tjreedy at udel.edu Thu Aug 25 17:41:11 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 25 Aug 2011 11:41:11 -0400 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <279771314268724@web84.yandex.ru> References: <379591314264494@web30.yandex.ru> <279771314268724@web84.yandex.ru> Message-ID:

On 8/25/2011 6:38 AM, k_bx wrote:
> Oh, and also, I really like how Python had its MutableString class since forever, but deprecated in Python 3.

(removed, i presume you mean...) and added bytearray. I have no idea if += on such is any better than O(n*n)

-- Terry Jan Reedy

From solipsis at pitrou.net Thu Aug 25 18:35:42 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 25 Aug 2011 18:35:42 +0200 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere References: <379591314264494@web30.yandex.ru> <279771314268724@web84.yandex.ru> Message-ID: <20110825183542.2c222c0a@pitrou.net>

On Thu, 25 Aug 2011 11:41:11 -0400 Terry Reedy wrote:
> On 8/25/2011 6:38 AM, k_bx wrote:
> > Oh, and also, I really like how Python had its MutableString class since forever, but deprecated in Python 3.
> (removed, i presume you mean...) and added bytearray. I have no idea if
> += on such is any better than O(n*n)

On bytearray? Yes, it is. It's a similar algorithm as lists, and therefore O(total length) amortized.

Regards Antoine.

From solipsis at pitrou.net Thu Aug 25 18:40:44 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 25 Aug 2011 18:40:44 +0200 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere References: <549901314286114@web119.yandex.ru> Message-ID: <20110825184044.33214b4f@pitrou.net>

On Thu, 25 Aug 2011 18:28:34 +0300 k_bx wrote:
> I don't mind using u''.join, but it just doesn't make people think about speed at all.

Realistically, not many workloads have performance issues with string concatenation in the first place. So not caring is the right thing to do in most cases.

> The most popular (as far as I can see) situation right now where people start seeing that += is slow is when they try it on PyPy (which doesn't have the CPython hack, so it is still slow there) and ask "why is my pypy code sooooo slow".

Different implementations having different performance characteristics is not totally unexpected, is it? (and I'm sure the PyPy developers wouldn't mind adding another hack)

Regards Antoine.

From masklinn at masklinn.net Thu Aug 25 18:50:51 2011 From: masklinn at masklinn.net (Masklinn) Date: Thu, 25 Aug 2011 18:50:51 +0200 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <20110825184044.33214b4f@pitrou.net> References: <549901314286114@web119.yandex.ru> <20110825184044.33214b4f@pitrou.net> Message-ID: <14F1DD6B-A0F4-4C4B-9D8F-A2733D4CF384@masklinn.net>

On 2011-08-25, at 18:40 , Antoine Pitrou wrote:
>> The most popular (as far as I can see) situation right now where people start seeing that += is slow is when they try it on PyPy [...]
> Different implementations having different performance characteristics is not totally unexpected, is it?
> (and I'm sure the PyPy developers wouldn't mind adding another hack)

This one cannot be done, as it relies on knowing there's only one reference to the string (so it can be realloc'd in place), therefore on using a refcounting GC. Since PyPy does not use refcounting, it can't do that as a rule (it might be possible to handle it for a limited number of cases via escape analysis, proving there can be only one reference to the string, but I'd say there are more interesting things to use escape analysis for).

Also, http://twitter.com/#!/alex_gaynor/status/104326041920749569

> Wish CPython didn't contains hacks which make str += str faster, sometimes, depending on refcounting details :(

From stefan_ml at behnel.de Thu Aug 25 18:58:01 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 25 Aug 2011 18:58:01 +0200 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <549901314286114@web119.yandex.ru> References: <549901314286114@web119.yandex.ru> Message-ID:

k_bx, 25.08.2011 17:28:
> I don't mind using u''.join, but it just doesn't make people think about speed at all.

When I see something like a StringBuilder, I guess the first thing I'd wonder about is why the programmer didn't just use StringIO() or even just ''.join(). That makes the code appear much more magic than it eventually turns out to be when looking closer. Plus, it's doomed to be slower, simply because it goes through more indirections.
You may be right that using StringIO won't make people think about speed. Somebody who doesn't know it would likely go: "oh, that's nice - writing to a string as if it were a file - I get that". So it tells you what it does, instead of distracting you into thinking about performance implications. That's the beauty of it.

Optimisations are just that: optimisations. They are orthogonal to what the code does - or at least they should be. Even string concatenation can perform just fine in many cases.

> The most popular (as far as I can see) situation right now where people start seeing that += is slow is when they try it on PyPy (which doesn't have the CPython hack, so it is still slow there) and ask "why is my pypy code sooooo slow".

Sounds like yet another reason not to do it then. Seriously, there are hardly any language runtimes out there where continued string concatenation is efficient, let alone guaranteed to be so. You just shouldn't expect that it is. The optimisation in CPython was simply done because it *can* be done, so that simple cases (and stupid benchmarks) can continue to use simple concatenation and still be efficient. Well, in some cases at least.

Stefan

From solipsis at pitrou.net Thu Aug 25 18:58:24 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 25 Aug 2011 18:58:24 +0200 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <14F1DD6B-A0F4-4C4B-9D8F-A2733D4CF384@masklinn.net> References: <549901314286114@web119.yandex.ru> <20110825184044.33214b4f@pitrou.net> <14F1DD6B-A0F4-4C4B-9D8F-A2733D4CF384@masklinn.net> Message-ID: <1314291504.3547.5.camel@localhost.localdomain>

Le jeudi 25 août 2011 à 18:50 +0200, Masklinn a écrit :
> This one cannot be done, as it relies on knowing there's only one reference to the string (so it can be realloc'd in place), therefore on using a refcounting GC.

Ah, you're right. However, PyPy has another (and quite broader) set of optimizations available: http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#string-optimizations

Besides:

> Since PyPy does not use refcounting, it can't do that as a rule (it might be possible to handle it for a limited number of cases via escape analysis, proving there can be only one reference to the string, but I'd say there are more interesting things to use escape analysis for).

The CPython optimization itself works in a limited number of cases, because having a refcount of 1 essentially means it's a local variable, and because the result has to be stored back immediately in the same local variable (otherwise you can't recycle the original object's storage).

Regards Antoine.
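To make those conditions concrete, a sketch of the two sides of the coin (CPython 2 behaviour, and an implementation detail rather than a guarantee)::

    s = ""
    for i in xrange(100000):
        s += "x"           # s is the only reference and the result is bound
                           # straight back to s, so CPython may realloc in place

    t = ""
    keep = []
    for i in xrange(100000):
        keep.append(t)     # a second reference to t now exists...
        t += "x"           # ...so the in-place path is skipped and each step
                           # copies the whole string: quadratic behaviour again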
From tlesher at gmail.com Thu Aug 25 19:02:19 2011 From: tlesher at gmail.com (Tim Lesher) Date: Thu, 25 Aug 2011 13:02:19 -0400 Subject: [Python-ideas] Python memory allocation in embedded interpreters Message-ID:

Issue 3329 (http://bugs.python.org/issue3329, "API for setting the memory allocator used by Python") called for the ability to redirect Python's memory allocation at the lowest level from the C runtime malloc/realloc/free to a user-supplied allocator. The consensus seemed to turn to a set of macros, similar to the (now-deprecated) PyMem_XXXX family of macros, that can redirect to a user-supplied allocator based on a compile-time switch, without forcing an indirection on "normal" builds of Python.

Additionally, in the comments, Jukka noted that there's still an issue with static pointers that retain their values across multiple Py_Initialize()/Py_Finalize() invocations, which causes problems when a program that embeds a Python interpreter wants to segregate the memory usage of each interpreter.

That issue's been dormant for exactly two years today (happy anniversary, I guess), but I know that at least two projects (the Nokia S60 port and Vocollect's CE port) have had to implement the first issue, and fought with the second issue. The first one is pretty straightforward (mostly just replacing malloc/realloc/free calls with the appropriate macro), but the second is a little more complicated.

The typical scenario in CPython code looks like this:

----
static PyObject* someDict;

PyObject* getSomeDict()
{
    if (! someDict) {
        someDict = PyDict_New();
        /* initialize someDict */
    }
    return someDict;
}
----

This "leaks" someDict, and worse, a later Py_Initialize()/Py_Finalize() call will reuse the pointer without reallocating it, which dies if (in the meantime) the second Py_Initialize() uses a different allocator (in our case, a private segregated heap to avoid fragmentation).

One way to fix this would be to "register" static PyObject pointers so that Py_Finalize() could reset them to NULL. The usage is pretty straightforward:

----
static PyObject* someDict;

PyObject* getSomeDict()
{
    if (! someDict) {
        someDict = PyDict_New();
        PyMem_registerStatic(&someDict);
        /* initialize someDict */
    }
    return someDict;
}
----

It's still a manual step, but I don't see an obvious way around that in C (C++ would do registration-on-construction). Thoughts? If it seems reasonable, I'll turn our local implementation into a patch set to address this.

-- Tim Lesher

From stefan_ml at behnel.de Thu Aug 25 19:21:45 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Thu, 25 Aug 2011 19:21:45 +0200 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <1314291504.3547.5.camel@localhost.localdomain> References: <549901314286114@web119.yandex.ru> <20110825184044.33214b4f@pitrou.net> <14F1DD6B-A0F4-4C4B-9D8F-A2733D4CF384@masklinn.net> <1314291504.3547.5.camel@localhost.localdomain> Message-ID:

Antoine Pitrou, 25.08.2011 18:58:
> Le jeudi 25 août 2011 à 18:50 +0200, Masklinn a écrit :
>> (and I'm sure the PyPy developers wouldn't mind adding another hack)
>> This one cannot be done, as it relies on knowing there's only one reference to the string (so it can be realloc'd in place), therefore on using a refcounting GC.
>
> Ah, you're right. However, PyPy has another (and quite broader) set of optimizations available: http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#string-optimizations

And its JIT could potentially just enable its string-join optimisation automatically when it sees that a variable holds a string, and is never being assigned to inside of a loop or sequence of operations except for the += operator. Any other operation on the string would then just turn it back into a normal string by joining it first. But this is seriously getting off-topic now.

Stefan

From masklinn at masklinn.net Thu Aug 25 19:35:39 2011 From: masklinn at masklinn.net (Masklinn) Date: Thu, 25 Aug 2011 19:35:39 +0200 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <1314291504.3547.5.camel@localhost.localdomain> References: <549901314286114@web119.yandex.ru> <20110825184044.33214b4f@pitrou.net> <14F1DD6B-A0F4-4C4B-9D8F-A2733D4CF384@masklinn.net> <1314291504.3547.5.camel@localhost.localdomain> Message-ID: <1A908342-9FC2-4808-BEC9-7D2BCEB65C7A@masklinn.net>

On 2011-08-25, at 18:58 , Antoine Pitrou wrote:
> Ah, you're right. However, PyPy has another (and quite broader) set of optimizations available: http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#string-optimizations

Yeah, but none of them has reached enough utility to be the default PyPy state, so it's not like it's a good idea to rely on them (it's nice to know these options exist as they can be useful on a per-case basis though).

> The CPython optimization itself works in a limited number of cases, because having a refcount of 1 essentially means it's a local variable, and because the result has to be stored back immediately in the same local variable (otherwise you can't recycle the original object's storage).

True, I realized afterwards I should probably have written "a likely even more limited number of cases than CPython". Oh well.
From ron3200 at gmail.com Thu Aug 25 18:24:06 2011 From: ron3200 at gmail.com (ron3200) Date: Thu, 25 Aug 2011 11:24:06 -0500 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <549901314286114@web119.yandex.ru> References: <549901314286114@web119.yandex.ru> Message-ID: <1314289446.25060.15.camel@Gutsy>

On Thu, 2011-08-25 at 18:28 +0300, k_bx wrote:
> The most popular (as far as I can see) situation right now where people start seeing that += is slow is when they try it on PyPy [...]

I think a FAQ on "How can I make my python program faster?", with suggestions such as using a list plus .join for building large strings instead of using +=, would be better. There probably already is one some place... Yep... http://wiki.python.org/moin/PythonSpeed/PerformanceTips

This in my opinion is more about fitting the code to the problem than it is about speeding up general python code. I once wrote a text comparison engine that solved cryptograms by comparing to a text source. A large text source was read into a dictionary of words to be compared to. At first it was quite slow, but by presorting the data and putting it into smaller dictionaries, it sped up the program by several orders of magnitude.

Cheers, Ron

From raymond.hettinger at gmail.com Fri Aug 26 00:45:45 2011 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 25 Aug 2011 15:45:45 -0700 Subject: [Python-ideas] Summer of Code Project Idea: Python Apps in the Browser Message-ID: <98154FC8-AFB3-4CCF-AD06-09CCB4267924@gmail.com>

The http://gumbyapp.com/ project succeeded in getting Python to run in the browser but it can't import pure python modules (because there is no filesystem in the browser).

I think it would be wonderful to beef up this project by bundling in the rest of the standard library. Gumby was built by compiling CPython to LLVM and then generating Javascript. ISTM it would be possible to write a script to transform pure python standard library modules into C strings that could also be part of the final build. The import statement would have to be hooked to search for the C string instead of a physical file.

If that technique works, it may not be hard to extend it so that user-defined python modules could also be incorporated. If so, it would become possible to create standalone Python apps that run in the browser. The process is likely to be inefficient, but the gumbyapp site shows that it might be good enough for some purposes.

Raymond

From greg.ewing at canterbury.ac.nz Fri Aug 26 02:27:05 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 26 Aug 2011 12:27:05 +1200 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <549901314286114@web119.yandex.ru> References: <549901314286114@web119.yandex.ru> Message-ID: <4E56E859.3090504@canterbury.ac.nz>

k_bx wrote:
> The most popular (as far as I can see) situation right now where people start seeing that += is slow is when they try it on PyPy [...]

To me that suggests it may have been a mistake to try to optimise += at all in CPython, as it gives people misleading expectations.

-- Greg
From guido at python.org Fri Aug 26 02:45:16 2011 From: guido at python.org (Guido van Rossum) Date: Thu, 25 Aug 2011 17:45:16 -0700 Subject: [Python-ideas] Summer of Code Project Idea: Python Apps in the Browser In-Reply-To: <98154FC8-AFB3-4CCF-AD06-09CCB4267924@gmail.com> References: <98154FC8-AFB3-4CCF-AD06-09CCB4267924@gmail.com> Message-ID:

Maybe it would be possible to use HTML5's sqlite storage?

On Thu, Aug 25, 2011 at 3:45 PM, Raymond Hettinger wrote:
> The http://gumbyapp.com/ project succeeded in getting Python to run in the browser but it can't import pure python modules (because there is no filesystem in the browser). [...]

-- --Guido van Rossum (python.org/~guido)

From digitalxero at gmail.com Fri Aug 26 02:59:05 2011 From: digitalxero at gmail.com (Dj Gilcrease) Date: Thu, 25 Aug 2011 20:59:05 -0400 Subject: [Python-ideas] Summer of Code Project Idea: Python Apps in the Browser In-Reply-To: References: <98154FC8-AFB3-4CCF-AD06-09CCB4267924@gmail.com> Message-ID:

Or maybe a URL importer that is restricted to same domain http://code.google.com/p/importers/issues/detail?id=1
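A rough sketch of what such a hook could look like with the PEP 302 finder/loader protocol. The class name, URL layout, and same-origin check below are illustrative assumptions, not taken from the linked ticket, and real code would need caching, package support, and error handling::

    import imp
    import sys
    import urllib2
    import urlparse

    class SameDomainImporter(object):
        """Import modules over HTTP, refusing anything off the original domain."""

        def __init__(self, base_url):
            self.base_url = base_url
            self.netloc = urlparse.urlparse(base_url).netloc

        def _url_for(self, fullname):
            url = "%s/%s.py" % (self.base_url, fullname.replace(".", "/"))
            if urlparse.urlparse(url).netloc != self.netloc:
                return None        # escaped the allowed origin
            return url

        def find_module(self, fullname, path=None):
            return self if self._url_for(fullname) else None

        def load_module(self, fullname):
            url = self._url_for(fullname)
            source = urllib2.urlopen(url).read()
            mod = sys.modules.setdefault(fullname, imp.new_module(fullname))
            mod.__file__ = url
            mod.__loader__ = self
            exec source in mod.__dict__
            return mod

    sys.meta_path.append(SameDomainImporter("http://example.com/pylib"))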
From fuzzyman at gmail.com Fri Aug 26 03:11:13 2011 From: fuzzyman at gmail.com (Michael Foord) Date: Fri, 26 Aug 2011 02:11:13 +0100 Subject: [Python-ideas] Summer of Code Project Idea: Python Apps in the Browser In-Reply-To: <98154FC8-AFB3-4CCF-AD06-09CCB4267924@gmail.com> References: <98154FC8-AFB3-4CCF-AD06-09CCB4267924@gmail.com> Message-ID:

On 25 August 2011 23:45, Raymond Hettinger wrote:
> The http://gumbyapp.com/ project succeeded in getting Python to run in the browser but it can't import pure python modules (because there is no filesystem in the browser). [...]

There's also Skulpt, an implementation of Python in Javascript (no idea how complete it is but I saw some impressive demos a while ago): http://code.google.com/p/skulpt/

I've no idea how Skulpt handles imports (or even if it does). There is also Pyjamas, which translates Python code to Javascript: http://pyjs.org/ Pyjamas translates dependencies too, but I guess it can't do dynamic imports.

Plus IronPython runs in the Silverlight runtime. Probably of less interest to this crowd though. :-)

All the best, Michael Foord

-- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html

From ncoghlan at gmail.com Fri Aug 26 04:28:18 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 26 Aug 2011 12:28:18 +1000 Subject: [Python-ideas] Summer of Code Project Idea: Python Apps in the Browser In-Reply-To: References: <98154FC8-AFB3-4CCF-AD06-09CCB4267924@gmail.com> Message-ID:

On Fri, Aug 26, 2011 at 11:11 AM, Michael Foord wrote:
> Plus IronPython runs in the Silverlight runtime. Probably of less interest to this crowd though. :-)

In the talk at PyConAU that mentioned gumbyapp [1], trypython was the first version Tim showed. Gumbyapp was his follow-up for the cases where Silverlight wasn't an option (or ran too slowly). Although it turns out many browsers aren't happy about being sent 2.8 MB JSON objects, either :)

It's actually a really cool talk (and my personal favourite of the whole weekend at PyConAU) about how the National Computer Science School run by the University of Sydney uses OS level sandboxing to permit safe execution of arbitrary Python code on the NCSS servers (alas, UoS has not made the code backing the site open source at this point in time and Tim wasn't sure if or when that would happen).

To add another possible mechanism into the mix, freezing modules may be another way to get them into the LLVM bytecode. Dynamic import mechanisms are hard, since you run into bootstrapping issues (cf. Brett's hassles with making importlib the underlying implementation of the __import__ builtin).

[1] http://www.youtube.com/watch?v=y-WPPdhTKBU&feature=channel_video_title

Cheers, Nick.

-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From dglick at gmail.com Fri Aug 26 04:42:12 2011 From: dglick at gmail.com (David Glick) Date: Thu, 25 Aug 2011 19:42:12 -0700 Subject: [Python-ideas] Summer of Code Project Idea: Python Apps in the Browser In-Reply-To: <98154FC8-AFB3-4CCF-AD06-09CCB4267924@gmail.com> References: <98154FC8-AFB3-4CCF-AD06-09CCB4267924@gmail.com> Message-ID: <4E570804.3090207@gmail.com>

On 8/25/11 3:45 PM, Raymond Hettinger wrote:
> The http://gumbyapp.com/ project succeeded in getting Python to run in the browser but it can't import pure python modules (because there is no filesystem in the browser).
> I think it would be wonderful to beef up this project by bundling in the rest of the standard library. Gumby was built by compiling CPython to LLVM and then generating Javascript.
> ISTM it would be possible to write a script to transform pure python standard library modules into C strings that could also be part of the final build. The import statement would have to be hooked to search for the C string instead of a physical file.
> If that technique works, it may not be hard to extend it so that user-defined python modules could also be incorporated. [...]

Mostly as a joke for this past April Fool's day, Matthew Wilkes and I cobbled together an import-over-AJAX mechanism [1] for the Python interpreter produced by Emscripten [2] (which similarly translates LLVM into Javascript).

[1] http://davisagli.com/blog/the-making-of-zodb.ws
[2] https://github.com/kripken/emscripten/wiki

Our goal was getting ZODB running in the browser, with storage in HTML5 localstorage, as demonstrated at http://zodb.ws -- so we focused only on the pieces of the stdlib necessary to get that running; the emscripten interpreter was missing a lot and we didn't have time to learn the emscripten toolchain, so we resorted to various hacks (e.g. a simple incrementer in place of time.time()) and borrowing pure-Python implementations from pypy.

David

From k.bx at ya.ru Fri Aug 26 07:30:28 2011 From: k.bx at ya.ru (k_bx) Date: Fri, 26 Aug 2011 08:30:28 +0300 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere Message-ID: <797341314336628@web154.yandex.ru>

Ok, so while I think that cStringIO.StringIO is not what we should use, io.StringIO should be okay (since it wasn't explicitly created to be file-like), so there's no reason not to use that (and it performs well, on pypy also) ((except that I don't like its API, of course)).

Thanks everyone!

p.s.: it seems people read code much more often than they read FAQs

From masklinn at masklinn.net Fri Aug 26 08:19:33 2011 From: masklinn at masklinn.net (Masklinn) Date: Fri, 26 Aug 2011 08:19:33 +0200 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <797341314336628@web154.yandex.ru> References: <797341314336628@web154.yandex.ru> Message-ID:

On 2011-08-26, at 07:30 , k_bx wrote:
> Ok, so while I think that cStringIO.StringIO is not what we should use, io.StringIO should be okay (since it wasn't explicitly created to be file-like)

It was; hence it extends io.TextIOBase. io.StringIO is roughly the py3k version of StringIO.StringIO (when the latter was used with unicode strings).

From turnbull at sk.tsukuba.ac.jp Fri Aug 26 08:50:13 2011 From: turnbull at sk.tsukuba.ac.jp (Stephen J. Turnbull) Date: Fri, 26 Aug 2011 15:50:13 +0900 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <797341314336628@web154.yandex.ru> References: <797341314336628@web154.yandex.ru> Message-ID: <87r548ll96.fsf@uwakimon.sk.tsukuba.ac.jp>

k_bx writes:
> Ok, so while I think that cStringIO.StringIO is not what we should use, io.StringIO should be okay (since it wasn't explicitly created to be file-like), so there's no reason not to use that (and it performs well, on pypy also) ((except that I don't like its API, of course)).

Well, you are free to use StringBuilder in your own programs (though I don't recommend that).
> p.s.: it seems people read code much more often than they read FAQs

Sure, but that says more about the people than it does about the FAQ. We can't write all the code that they're going to read, and we can't choose what examples they'll follow. The best we can do is follow the Zen of Python, specifically, "There should be one -- and preferably only one -- obvious way to do it." (Don't say, "but the sep.join(lst) idiom is hardly obvious!" The Zen has an answer to that, too. Try "python -m this" if you don't know about the Zen.)

The other ways of doing it are more specialized, i.e., optimized for particular cases. The point is that the StringBuilder class doesn't do things any better than the existing idioms. Its only advantage is a somewhat more discoverable name. That's not a good enough reason to proliferate names for good ways to do this. And .join() is often natural, so it won't be deprecated in favor of StringBuilder.

From amauryfa at gmail.com Fri Aug 26 09:14:19 2011 From: amauryfa at gmail.com (Amaury Forgeot d'Arc) Date: Fri, 26 Aug 2011 09:14:19 +0200 Subject: [Python-ideas] Create a StringBuilder class and use it everywhere In-Reply-To: <4E56E859.3090504@canterbury.ac.nz> References: <549901314286114@web119.yandex.ru> <4E56E859.3090504@canterbury.ac.nz> Message-ID:

2011/8/26 Greg Ewing:
> To me that suggests it may have been a mistake to try to optimise += at all in CPython, as it gives people misleading expectations.

The author of this optimization is also the lead designer of PyPy...

-- Amaury Forgeot d'Arc

From fuzzyman at gmail.com Fri Aug 26 12:55:54 2011 From: fuzzyman at gmail.com (Michael Foord) Date: Fri, 26 Aug 2011 11:55:54 +0100 Subject: [Python-ideas] Summer of Code Project Idea: Python Apps in the Browser In-Reply-To: References: <98154FC8-AFB3-4CCF-AD06-09CCB4267924@gmail.com> Message-ID:

On 26 August 2011 03:28, Nick Coghlan wrote:
> In the talk at PyConAU that mentioned gumbyapp [1], trypython was the first version Tim showed. Gumbyapp was his follow-up for the cases where Silverlight wasn't an option (or ran too slowly). Although it turns out many browsers aren't happy about being sent 2.8 MB JSON objects, either :)

Interesting. Last year I wrote a commercial app with a Silverlight front-end (choice of the client) where we were sending 10MB or more of JSON over the wire (per view). We found IronPython in Silverlight *much faster* than Javascript, both for JSON handling (using the Silverlight JSON APIs) and for the user interface, which was a grid displaying the large amounts of data we were sending. (I was porting a Javascript app to Silverlight, and performance was one of the big reasons.)

My understanding is that even with recent Javascript engines the Silverlight runtime *typically* runs code faster than Javascript. As gumbyapp is translating LLVM bytecode to Javascript I'd be *surprised* if it was faster than Silverlight (not that there aren't other reasons to prefer a 'browser native' solution though). Just because I'd be surprised doesn't make it impossible of course. :-)

All the best, Michael Foord

From fuzzyman at gmail.com Fri Aug 26 13:12:20 2011 From: fuzzyman at gmail.com (Michael Foord) Date: Fri, 26 Aug 2011 12:12:20 +0100 Subject: [Python-ideas] Summer of Code Project Idea: Python Apps in the Browser In-Reply-To: References: <98154FC8-AFB3-4CCF-AD06-09CCB4267924@gmail.com> Message-ID:

On 26 August 2011 03:28, Nick Coghlan wrote:
> It's actually a really cool talk (and my personal favourite of the whole weekend at PyConAU) about how the National Computer Science School run by the University of Sydney uses OS level sandboxing to permit safe execution of arbitrary Python code on the NCSS servers [...]

Ooh, nice. Here's the "Try Python" variant they created for the students / teachers to run Python, with an interactive interpreter and editor: http://challenge.ncss.edu.au/trypython/

Thanks for pointing this out. (Try Python provides a "filesystem" based on browser local storage and patches "open" to work with this. Adding an import hook that can import from the browser filesystem wouldn't be very hard, and it looks like they have *something* like that in place.)

All the best, Michael Foord

From solipsis at pitrou.net Sat Aug 27 02:45:48 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 27 Aug 2011 02:45:48 +0200 Subject: [Python-ideas] Performance of the "".join() idiom References: <379591314264494@web30.yandex.ru> Message-ID: <20110827024548.63668e60@pitrou.net>

For the record, the "".join() idiom also has its downsides. If you build a list of many tiny strings, memory consumption can grow beyond the reasonable (in one case, building a 600MB JSON string outgrew the RAM of an 8GB machine). One solution is to regularly accumulate the primary list into a secondary accumulation list, as done in http://hg.python.org/cpython/rev/47176e8d7060

Regards Antoine.

On Thu, 25 Aug 2011 12:28:14 +0300 k_bx wrote:
> There's a certain problem right now in python that when people need to build a string from pieces they really often do something like this [...]
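In case the linked changeset is unavailable, a sketch of that two-level accumulation (the chunk size here is an arbitrary choice for illustration)::

    def join_in_chunks(iterable, chunk=1024):
        primary, secondary = [], []
        for piece in iterable:
            primary.append(piece)
            if len(primary) >= chunk:
                # fold many tiny strings into one bigger one, so the
                # number of live string objects stays bounded
                secondary.append("".join(primary))
                del primary[:]
        secondary.append("".join(primary))
        return "".join(secondary)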
Just because I'd be surprised doesn't make it impossible of course. :-)

All the best,

Michael Foord

> It's actually a really cool talk (and my personal favourite of the whole weekend at PyConAU) about how the National Computer Science School run by the University of Sydney uses OS level sandboxing to permit safe execution of arbitrary Python code on the NCSS servers (alas, UoS has not made the code backing the site open source at this point in time and Tim wasn't sure if or when that would happen).
>
> To add another possible mechanism into the mix, freezing modules may be another way to get them into the LLVM bytecode. Dynamic import mechanisms are hard, since you run into bootstrapping issues (cf. Brett's hassles with making importlib the underlying implementation of the __import__ builtin).
>
> [1] http://www.youtube.com/watch?v=y-WPPdhTKBU&feature=channel_video_title
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html

From fuzzyman at gmail.com  Fri Aug 26 13:12:20 2011
From: fuzzyman at gmail.com (Michael Foord)
Date: Fri, 26 Aug 2011 12:12:20 +0100
Subject: [Python-ideas] Summer of Code Project Idea: Python Apps in the Browser
In-Reply-To:
References: <98154FC8-AFB3-4CCF-AD06-09CCB4267924@gmail.com>
Message-ID:

On 26 August 2011 03:28, Nick Coghlan wrote:

> On Fri, Aug 26, 2011 at 11:11 AM, Michael Foord wrote:
> > Plus IronPython runs in the Silverlight runtime. Probably of less interest to this crowd though. :-)
>
> In the talk at PyConAU that mentioned gumbyapp [1], trypython was the first version Tim showed. Gumbyapp was his follow-up for the cases where Silverlight wasn't an option (or ran too slowly). Although it turns out many browsers aren't happy about being sent 2.8 MB JSON objects, either :)

Ooh, nice. Here's the "Try Python" variant they created for the students / teachers to run Python, with an interactive interpreter and editor:

http://challenge.ncss.edu.au/trypython/

Thanks for pointing this out.

(Try Python provides a "filesystem" based on browser local storage and patches "open" to work with this. Adding an import hook that can import from the browser filesystem wouldn't be very hard and it looks like they have *something* like that in place.)
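Something along those lines would only be a screenful -- a rough sketch against the PEP 302 finder/loader protocol, with the browser-side storage faked as a plain dict (this is my guess at the approach, not their actual code):

    import imp
    import sys

    BROWSER_FS = {'mymodule': 'X = 42\n'}   # stand-in for the local-storage "filesystem"

    class BrowserFSImporter(object):
        def find_module(self, fullname, path=None):
            return self if fullname in BROWSER_FS else None

        def load_module(self, fullname):
            if fullname in sys.modules:
                return sys.modules[fullname]
            mod = sys.modules.setdefault(fullname, imp.new_module(fullname))
            mod.__file__ = '<browserfs>/%s.py' % fullname
            mod.__loader__ = self
            exec BROWSER_FS[fullname] in mod.__dict__   # run the module source
            return mod

    sys.meta_path.append(BrowserFSImporter())

    import mymodule
    print mymodule.X   # -> 42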
All the best,

Michael Foord

> It's actually a really cool talk (and my personal favourite of the whole weekend at PyConAU) about how the National Computer Science School run by the University of Sydney uses OS level sandboxing to permit safe execution of arbitrary Python code on the NCSS servers (alas, UoS has not made the code backing the site open source at this point in time and Tim wasn't sure if or when that would happen).
>
> To add another possible mechanism into the mix, freezing modules may be another way to get them into the LLVM bytecode. Dynamic import mechanisms are hard, since you run into bootstrapping issues (cf. Brett's hassles with making importlib the underlying implementation of the __import__ builtin).
>
> [1] http://www.youtube.com/watch?v=y-WPPdhTKBU&feature=channel_video_title
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html

From solipsis at pitrou.net  Sat Aug 27 02:45:48 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 27 Aug 2011 02:45:48 +0200
Subject: [Python-ideas] Performance of the "".join() idiom
References: <379591314264494@web30.yandex.ru>
Message-ID: <20110827024548.63668e60@pitrou.net>

For the record, the "".join() idiom also has its downsides. If you build a list of many tiny strings, memory consumption can grow unreasonably large (in one case, building a 600MB JSON string outgrew the RAM of an 8GB machine).

One solution is to regularly accumulate the primary list into a secondary accumulation list, as done in http://hg.python.org/cpython/rev/47176e8d7060

Regards

Antoine.

On Thu, 25 Aug 2011 12:28:14 +0300
k_bx wrote:
> Hi!
>
> There's a certain problem right now in python that when people need to build a string from pieces they really often do something like this::
>
>     def main_pure():
>         b = u"initial value"
>         for i in xrange(30000):
>             b += u"more data"
>         return b

From digitalxero at gmail.com  Sat Aug 27 21:57:26 2011
From: digitalxero at gmail.com (Dj Gilcrease)
Date: Sat, 27 Aug 2011 15:57:26 -0400
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
Message-ID:

In the thread about replacing re with regex someone mentioned adding to __future__, which isn't a great idea as future APIs are already solidified; they just live there to give developers time to adapt their code. The idea of an __experimental__ area is good for any PEPs or stdlib additions that are somewhat controversial (the API isn't agreed on, code may take a while to integrate properly, the developer wants some time to hash out any edge case bugs or API clarifications that may come up in large scale testing, etc).

__experimental__ should emit a warning on import that says anything in here may change or be removed at any time and should not be used in stable code.

__experimental__ features should behave the same as __future__ in that they can add new keywords or semantics to the existing language.

__experimental__ features can move directly to the stdlib or builtins if they do not add new keywords and/or are backwards compatible with the feature they are replacing. Otherwise they move into __future__ for however many releases are deemed reasonable time for developers to adapt their code.

From mikegraham at gmail.com  Sun Aug 28 04:50:48 2011
From: mikegraham at gmail.com (Mike Graham)
Date: Sat, 27 Aug 2011 22:50:48 -0400
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To:
References:
Message-ID:

On Sat, Aug 27, 2011 at 3:57 PM, Dj Gilcrease wrote:
> In the thread about replacing re with regex someone mentioned adding to __future__, which isn't a great idea as future APIs are already solidified; they just live there to give developers time to adapt their code.
> The idea of an __experimental__ area is good for any PEPs or stdlib additions that are somewhat controversial (the API isn't agreed on, code may take a while to integrate properly, the developer wants some time to hash out any edge case bugs or API clarifications that may come up in large scale testing, etc).
>
> __experimental__ should emit a warning on import that says anything in here may change or be removed at any time and should not be used in stable code.
>
> __experimental__ features should behave the same as __future__ in that they can add new keywords or semantics to the existing language.
>
> __experimental__ features can move directly to the stdlib or builtins if they do not add new keywords and/or are backwards compatible with the feature they are replacing. Otherwise they move into __future__ for however many releases are deemed reasonable time for developers to adapt their code.

If something's still experimental, why ship it as stdlib? Why not just keep it third party until integration? No reason to tempt people to do anything that needs a warning. If they want some software, they can install it.

Mike

From guido at python.org  Sun Aug 28 05:37:48 2011
From: guido at python.org (Guido van Rossum)
Date: Sat, 27 Aug 2011 20:37:48 -0700
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To:
References:
Message-ID:

On Sat, Aug 27, 2011 at 7:50 PM, Mike Graham wrote:
> On Sat, Aug 27, 2011 at 3:57 PM, Dj Gilcrease wrote:
>> In the thread about replacing re with regex someone mentioned adding to __future__, which isn't a great idea as future APIs are already solidified; they just live there to give developers time to adapt their code. The idea of an __experimental__ area is good for any PEPs or stdlib additions that are somewhat controversial (the API isn't agreed on, code may take a while to integrate properly, the developer wants some time to hash out any edge case bugs or API clarifications that may come up in large scale testing, etc).
>>
>> __experimental__ should emit a warning on import that says anything in here may change or be removed at any time and should not be used in stable code.
>>
>> __experimental__ features should behave the same as __future__ in that they can add new keywords or semantics to the existing language.
>>
>> __experimental__ features can move directly to the stdlib or builtins if they do not add new keywords and/or are backwards compatible with the feature they are replacing. Otherwise they move into __future__ for however many releases are deemed reasonable time for developers to adapt their code.
>
> If something's still experimental, why ship it as stdlib? Why not just keep it third party until integration? No reason to tempt people to do anything that needs a warning. If they want some software, they can install it.

Putting it in the stdlib labeled as experimental would send a signal that it's slated for stdlib inclusion (in this case to replace the re module) and makes it available to everyone with the right Python version. Keeping it third-party means many people will be reluctant to add it as a dependency to any code they put out.

That said, I'm not all that keen on my __experimental__ idea. The point of __future__ was that it would be recognized by the parser as it was parsing the file, so it could then modify its parsing tables on the fly. That's not needed for an experimental module.
Telling people to use "import regex as re" is probably good enough, if we go that route. But personally, I'd much rather see either the existing re module upgraded, or the regex module replace the re module as of Python 3.3. Either of those sounds like a better solution for users of Python 3.3. But I realize it's more work for the core developers.

--
--Guido van Rossum (python.org/~guido)

From jeanpierreda at gmail.com  Sun Aug 28 06:35:32 2011
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Sun, 28 Aug 2011 00:35:32 -0400
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To:
References:
Message-ID:

> Keeping it third-party means many people will be reluctant to add it as a dependency to any code they put out.

I'd spin this as a good thing: the code is, after all, experimental.

Devin

On Sat, Aug 27, 2011 at 11:37 PM, Guido van Rossum wrote:
> On Sat, Aug 27, 2011 at 7:50 PM, Mike Graham wrote:
>> On Sat, Aug 27, 2011 at 3:57 PM, Dj Gilcrease wrote:
>>> In the thread about replacing re with regex someone mentioned adding to __future__, which isn't a great idea as future APIs are already solidified; they just live there to give developers time to adapt their code. The idea of an __experimental__ area is good for any PEPs or stdlib additions that are somewhat controversial (the API isn't agreed on, code may take a while to integrate properly, the developer wants some time to hash out any edge case bugs or API clarifications that may come up in large scale testing, etc).
>>>
>>> __experimental__ should emit a warning on import that says anything in here may change or be removed at any time and should not be used in stable code.
>>>
>>> __experimental__ features should behave the same as __future__ in that they can add new keywords or semantics to the existing language.
>>>
>>> __experimental__ features can move directly to the stdlib or builtins if they do not add new keywords and/or are backwards compatible with the feature they are replacing. Otherwise they move into __future__ for however many releases are deemed reasonable time for developers to adapt their code.
>>
>> If something's still experimental, why ship it as stdlib? Why not just keep it third party until integration? No reason to tempt people to do anything that needs a warning. If they want some software, they can install it.
>
> Putting it in the stdlib labeled as experimental would send a signal that it's slated for stdlib inclusion (in this case to replace the re module) and makes it available to everyone with the right Python version. Keeping it third-party means many people will be reluctant to add it as a dependency to any code they put out.
>
> That said, I'm not all that keen on my __experimental__ idea. The point of __future__ was that it would be recognized by the parser as it was parsing the file, so it could then modify its parsing tables on the fly. That's not needed for an experimental module. Telling people to use "import regex as re" is probably good enough, if we go that route. But personally, I'd much rather see either the existing re module upgraded, or the regex module replace the re module as of Python 3.3. Either of those sounds like a better solution for users of Python 3.3. But I realize it's more work for the core developers.
>
> --
> --Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com  Sun Aug 28 09:50:16 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 28 Aug 2011 17:50:16 +1000
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To:
References:
Message-ID:

On Sun, Aug 28, 2011 at 2:35 PM, Devin Jeanpierre wrote:
>> Keeping it third-party means many people will be reluctant to add it as a dependency to any code they put out.
>
> I'd spin this as a good thing: the code is, after all, experimental.

Indeed. And using a DVCS means it is easier these days for people to get hold of experimental code (e.g. anyone that wants to play with the yield from expression can grab it from bitbucket and build it locally: https://bitbucket.org/ncoghlan/cpython_sandbox#pep380)

Documenting that packages are up for standard lib inclusion is good (specifically regex right now, but likely a couple of others before 3.3), but I don't think that means actually *shipping* them in the stdlib is a good idea.

A meta-package on PyPI for "stdlib-experimental" might be a way to let people grab candidates all at the same time if it's considered worthwhile, but the stdlib itself should only get stuff we plan to keep around.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From jeremy at jeremysanders.net  Sun Aug 28 11:58:24 2011
From: jeremy at jeremysanders.net (Jeremy Sanders)
Date: Sun, 28 Aug 2011 10:58:24 +0100
Subject: [Python-ideas] ctypes and extensions
Message-ID:

I was following the discussion on python-dev about ctypes not being suitable for the standard library for linking to C libraries, and extension modules not being suitable for alternative implementations. Ctypes is fragile when it comes to linking against libraries whose API might change, giving no kind of error. Extension modules are too closely linked to CPython and reference counting.

Why not have some sort of intermediate C layer to produce a library which can be immediately loaded by any Python implementation? You would write a short C program which would be linked to the library to be wrapped. It would automatically expose what it is wrapping to Python, so no Python definitions are necessary. It would be compile-time linked to the library so that ABI changes are apparent at compile-time. The interface would be designed so that no Python implementation features leak out. You'd need some sort of loading code (perhaps based on ctypes initially) on each implementation.

Imagine something like this:

    #include
    #include
    #include

    /* wrap function int mylibstrlen(char*) and expose as strlen */
    WRAP_FUNCTION(mylibstrlen, "strlen", WRAP_INT, WRAP_CHARPTR);

    /* wrap constant M_PI and expose as PI */
    WRAP_CONST(M_PI, "PI", WRAP_DOUBLE);

If these macros exposed symbols with some sort of C++-like mangled name they could automatically be interpreted by Python on loading the module. Alternatively the macros could generate some sort of table which Python interprets.

Now you could imagine that it might be possible to write the above code definition in something like Python or XML and then convert it to C to be linked against the library in question, e.g.

    import wrap
    lib = wrap.Library("mlib", "mylib.h")
    lib.definefunction("mylibstrlen", "strlen", wrap.int, wrap.charptr)
    lib.writeC("out.c")
    # compile, etc...
I quite like the direct C approach as it is an obvious place to write any additional code to convert between the C library API and a nicer API usable by Python, though you could have both. It would also be able to link to C++ symbols without Python knowing about C++ name mangling, if the module was compiled with a C++ compiler.

Any opinions?

Jeremy

From paul at colomiets.name  Sun Aug 28 12:27:23 2011
From: paul at colomiets.name (Paul Colomiets)
Date: Sun, 28 Aug 2011 13:27:23 +0300
Subject: [Python-ideas] ctypes and extensions
In-Reply-To:
References:
Message-ID:

Hi Jeremy,

On Sun, Aug 28, 2011 at 12:58 PM, Jeremy Sanders wrote:
> I was following the discussion on python-dev about ctypes not being suitable for the standard library for linking to C libraries, and extension modules not being suitable for alternative implementations. Ctypes is fragile when it comes to linking against libraries whose API might change, giving no kind of error. Extension modules are too closely linked to CPython and reference counting.
>
> Why not have some sort of intermediate C layer to produce a library which can be immediately loaded by any Python implementation?
> [.. snip ..]
> Any opinions?

So what's the problem with Cython? It solves your problem well.

--
Paul

From jeremy at jeremysanders.net  Sun Aug 28 12:35:45 2011
From: jeremy at jeremysanders.net (Jeremy Sanders)
Date: Sun, 28 Aug 2011 11:35:45 +0100
Subject: [Python-ideas] ctypes and extensions
References:
Message-ID:

Paul Colomiets wrote:
> So what's the problem with Cython? It solves your problem well.

I think it works for the CPython case outside the standard library. However, my idea was a simple binary ABI which would be very easy for any Python implementation to use, and so would be suitable for wrapping libraries in the standard library.

Maybe there are Cython implementations coming soon for pypy, jython, ironpython, etc...? These would still have to compile to C code to get build-time API checking, which is one of the ctypes issues.

Jeremy

From guido at python.org  Sun Aug 28 19:10:07 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 28 Aug 2011 10:10:07 -0700
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To:
References:
Message-ID:

On Sun, Aug 28, 2011 at 12:50 AM, Nick Coghlan wrote:
> On Sun, Aug 28, 2011 at 2:35 PM, Devin Jeanpierre wrote:
>>> Keeping it third-party means many people will be reluctant to add it as a dependency to any code they put out.
>>
>> I'd spin this as a good thing: the code is, after all, experimental.
>
> Indeed. And using a DVCS means it is easier these days for people to get hold of experimental code (e.g. anyone that wants to play with the yield from expression can grab it from bitbucket and build it locally: https://bitbucket.org/ncoghlan/cpython_sandbox#pep380)
>
> Documenting that packages are up for standard lib inclusion is good (specifically regex right now, but likely a couple of others before 3.3), but I don't think that means actually *shipping* them in the stdlib is a good idea.
>
> A meta-package on PyPI for "stdlib-experimental" might be a way to let people grab candidates all at the same time if it's considered worthwhile, but the stdlib itself should only get stuff we plan to keep around.

I still see a huge difference between something in PyPI and something marked as experimental in the stdlib.
Something in PyPI may or may not be promoted to stdlib status, but realistically that rarely happens (I think the last time it happened it was a JSON thing?). Something marked as experimental in the stdlib has a much higher likelihood of being promoted to regular stdlib status -- in fact that's the most likely outcome. The main difference between experimental and regular status in the stdlib is that for experimental modules we reserve the right to make incompatible changes in subsequent versions (possibly even in bugfix releases, although that's worth a separate discussion).

An experimental feature in the stdlib also signals a certain commitment from the core developers. We shouldn't use experimental as a dumping ground for random stuff; that's what PyPI is for. We should use experimental within the context of the stdlib.

Another thing to keep in mind is that not everybody has the same attitude towards installing 3rd party dependencies from PyPI. E.g. at Google, that requires someone to import the 3rd party library into our own code repo first, and that in turn requires a (light-weight) review. The net effect is that a Googler who happens to be the first to want to use a certain 3rd party package has to do a lot of work; it's *not* just a matter of downloading and installing the package. I imagine it's the same within many other organizations -- some better, some worse.

(I was going to write something about many package developers being reluctant to require dependencies on other packages, but that's a pretty weak argument, since depending on an experimental stdlib package would limit them to a specific Python version, or to dynamic presence-checking, which isn't much better than having optional PyPI dependencies.)

The power of "batteries included" is still with us, and experimental batteries are still batteries!

--
--Guido van Rossum (python.org/~guido)

From stefan_ml at behnel.de  Sun Aug 28 20:34:11 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sun, 28 Aug 2011 20:34:11 +0200
Subject: [Python-ideas] ctypes and extensions
In-Reply-To:
References:
Message-ID:

Jeremy Sanders, 28.08.2011 12:35:
> Maybe there are Cython implementations coming soon for pypy, jython, ironpython, etc...?

There are ports to PyPy and IronPython being written, yes. Not sure if a Jython port would be all that competitive, because there is quite some overhead involved in the JNI bridge (AFAIR - this may have changed since the last time I considered it). But it would still be interesting for many projects even with a noticeable performance penalty, I guess.

> These would still have to compile to C code to get build-time API checking, which is one of the ctypes issues.

Well, the PyPy port actually uses ctypes as a backend for C interaction, so it's not really PyPy specific, but it's supposed to work particularly well on PyPy. The IronPython port uses a mixture of C++ and CLR code, but I haven't been involved with it in any way. It was written to port NumPy to .NET, and that reportedly worked out quite well.

Stefan

From aquavitae69 at gmail.com  Sun Aug 28 21:25:37 2011
From: aquavitae69 at gmail.com (David Townshend)
Date: Sun, 28 Aug 2011 21:25:37 +0200
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
Message-ID:

I don't know if the intention is to use __experimental__ only for additions to the stdlib or whether it would include language changes, but if so, maybe it's worth pointing out that language extensions wouldn't be so easy (if even possible) to write as 3rd party on PyPI.

On Aug 28, 2011 7:10 PM, "Guido van Rossum" wrote:

From stephen at xemacs.org  Mon Aug 29 03:29:11 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 29 Aug 2011 10:29:11 +0900
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To:
References:
Message-ID: <87obz9kntk.fsf@uwakimon.sk.tsukuba.ac.jp>

Guido van Rossum writes:

> Another thing to keep in mind is that not everybody has the same attitude towards installing 3rd party dependencies from PyPI. E.g. at Google, that requires someone to import the 3rd party library into our own code repo first, and that in turn requires a (light-weight) review.

I would think that most of the discussants already have this in mind. It comes up every time a module is proposed for inclusion in the stdlib, and presumably is the primary reason for supporting __experimental__.

> The net effect is that a Googler who happens to be the first to want to use a certain 3rd party package has to do a lot of work; it's *not* just a matter of downloading and installing the package. I imagine it's the same within many other organizations -- some better, some worse.

But the friction imposed by this requirement is presumably considered a net good thing by Google, no? Some benefit is received, or some cost avoided, right? So does __experimental__ just reduce friction for carefully selected proposed modules, and so increase net benefit, or does reduced friction have adverse impacts (i.e., side effects of too-easily imported modules on the rest of the organization) that might outweigh the reduction of friction, too?

From ncoghlan at gmail.com  Mon Aug 29 04:53:04 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 29 Aug 2011 12:53:04 +1000
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To: <87obz9kntk.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <87obz9kntk.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

On Mon, Aug 29, 2011 at 11:29 AM, Stephen J. Turnbull wrote:
> > The net effect is that a Googler who happens to be the first to want to use a certain 3rd party package has to do a lot of work; it's *not* just a matter of downloading and installing the package. I imagine it's the same within many other organizations -- some better, some worse.
>
> But the friction imposed by this requirement is presumably considered a net good thing by Google, no? Some benefit is received, or some cost avoided, right? So does __experimental__ just reduce friction for carefully selected proposed modules, and so increase net benefit, or does reduced friction have adverse impacts (i.e., side effects of too-easily imported modules on the rest of the organization) that might outweigh the reduction of friction, too?

I'd say if we went down this path, a key motivator would be specifically to make it easier for downstream organisations to point to it and say "this is covered by our existing license and packaging review for the Python standard library".
ActiveState/Enthought/etc would ship it with their installers, Debian/Fedora/etc would package it, and so on and so forth. That's a harder case to make for arbitrary PyPI packages, since there's far more variation in the licensing, and the code isn't tested on our buildbot suite or covered by our source control and review processes.

"We vetted the licensing for valid PSF redistribution rights, combined its test suite with ours, and regularly run those tests on our buildbots" really *is* a significantly stronger statement on our part than "we're considering including this module, you can get it from PyPI".

I'm actually coming around to the idea, especially for cases like the ones Guido mentioned where we think the functionality is definitely desirable but aren't sure we have the API right yet. Currently, in those cases we have to choose between publishing on PyPI and locking in (potentially for a very long time) an API design that we're not entirely sure is right.

(Possible case in point: we ignored some of Glyph's advice in the design of the concurrent.futures API by having the executors directly manipulate the future objects instead of introducing an additional level of indirection, so that the object the executors were playing with wasn't the one clients had a direct reference to. Even if this turns out to have been a mistake, the API is unlikely to change within the 3.x series.)

An explicit "__experimental__" namespace would be a way to promise "this functionality will exist in *some* form in future versions of Python, but the API may change in backwards incompatible ways in the process of getting there". It reminds me in some ways of the journey of the set type from its initial addition as "sets.Set" to the final version as the "set" builtin.

ImportEngine (PEP published to import-sig, I'll push it to python.org soonish), for example, would be a prime candidate for experimental status in 3.3 - it's too closely coupled to the interpreter details to make sense to distribute separately, so it would be good to have a mechanism to distribute it in a form that would allow feedback to be gathered over a full release cycle before we commit to a final API in 3.4 (assuming the PEP is accepted at all, of course, although I'm fairly confident I can persuade people it's a good idea).

Here's another potentially useful litmus test question for Guido: if it was spelled "from __experimental__ import ipaddr", would you be more inclined to approve PEP 3144 (IP address library) for 3.3?

Regards,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From dirkjan at ochtman.nl  Mon Aug 29 10:35:27 2011
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Mon, 29 Aug 2011 10:35:27 +0200
Subject: [Python-ideas] Create a StringBuilder class and use it everywhere
In-Reply-To: <4E5619D3.5050809@egenix.com>
References: <379591314264494@web30.yandex.ru> <4E5619D3.5050809@egenix.com>
Message-ID:

On Thu, Aug 25, 2011 at 11:45, M.-A. Lemburg wrote:
> I think you should use cStringIO in your class implementation. The list + join idiom is nice, but it has the disadvantage of creating and keeping alive many small string objects (with all the memory overhead and fragmentation that goes along with it).

AFAIK using cStringIO just for string building is much slower than using list.append() + join(). IIRC we tested some micro-benchmarks on this for Mercurial output (where it was a significant part of the profile for some commands).
That was on Python 2, of course; it may be better in io.StringIO and/or Python 3.

Cheers,

Dirkjan

From mal at egenix.com  Mon Aug 29 11:27:23 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 29 Aug 2011 11:27:23 +0200
Subject: [Python-ideas] Create a StringBuilder class and use it everywhere
In-Reply-To:
References: <379591314264494@web30.yandex.ru> <4E5619D3.5050809@egenix.com>
Message-ID: <4E5B5B7B.7000903@egenix.com>

Dirkjan Ochtman wrote:
> On Thu, Aug 25, 2011 at 11:45, M.-A. Lemburg wrote:
>> I think you should use cStringIO in your class implementation. The list + join idiom is nice, but it has the disadvantage of creating and keeping alive many small string objects (with all the memory overhead and fragmentation that goes along with it).
>
> AFAIK using cStringIO just for string building is much slower than using list.append() + join(). IIRC we tested some micro-benchmarks on this for Mercurial output (where it was a significant part of the profile for some commands). That was on Python 2, of course; it may be better in io.StringIO and/or Python 3.

Turns out you're right (list.append must have gotten a lot faster since I last tested this years ago, or I simply misremembered the results).

> python2.6 teststringbuilding.py array cstringio listappend
Running test array ...
 669.68 ms
Running test cstringio ...
 563.95 ms
Running test listappend ...
 389.22 ms

> python2.7 teststringbuilding.py array cstringio listappend
Running test array ...
 775.32 ms
Running test cstringio ...
 679.88 ms
Running test listappend ...
 375.19 ms

Here's the Python2 code:

"""
TIMEIT_N = 10
N = 1000000
SIZES = (2, 10, 23, 30, 33, 22, 15, 16, 27)
N_STRINGS = len(SIZES)
STRINGS = ['x' * SIZES[i] for i in range(N_STRINGS)]
REFERENCE = ''.join(STRINGS[i % N_STRINGS] for i in xrange(N))

def cstringio():
    import cStringIO
    s = cStringIO.StringIO()
    write = s.write
    for i in xrange(N):
        write(STRINGS[i % N_STRINGS])
    result = s.getvalue()
    assert result == REFERENCE

def array():
    import array
    s = array.array('c')
    write = s.fromstring
    for i in xrange(N):
        write(STRINGS[i % N_STRINGS])
    result = s.tostring()
    assert result == REFERENCE

def listappend():
    l = []
    append = l.append
    for i in xrange(N):
        append(STRINGS[i % N_STRINGS])
    result = ''.join(l)
    assert result == REFERENCE

if __name__ == '__main__':
    import sys, timeit
    for test in sys.argv[1:]:
        print 'Running test %s ...' % test
        t = timeit.timeit('%s()' % test, 'from __main__ import %s' % test,
                          number=TIMEIT_N)
        print ' %.2f ms' % (t / TIMEIT_N * 1e3)
"""

Aside: For some reason cStringIO and array got slower in Python 2.7.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 29 2011)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2011-10-04: PyCon DE 2011, Leipzig, Germany                36 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math.
Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From masklinn at masklinn.net  Mon Aug 29 11:44:13 2011
From: masklinn at masklinn.net (Masklinn)
Date: Mon, 29 Aug 2011 11:44:13 +0200
Subject: [Python-ideas] Create a StringBuilder class and use it everywhere
In-Reply-To: <4E5B5B7B.7000903@egenix.com>
References: <379591314264494@web30.yandex.ru> <4E5619D3.5050809@egenix.com> <4E5B5B7B.7000903@egenix.com>
Message-ID: <2D345575-A072-4ABE-BBC7-1C79FD538F39@masklinn.net>

On 2011-08-29, at 11:27 , M.-A. Lemburg wrote:
> Dirkjan Ochtman wrote:
>> On Thu, Aug 25, 2011 at 11:45, M.-A. Lemburg wrote:
>>> I think you should use cStringIO in your class implementation. The list + join idiom is nice, but it has the disadvantage of creating and keeping alive many small string objects (with all the memory overhead and fragmentation that goes along with it).
>>
>> AFAIK using cStringIO just for string building is much slower than using list.append() + join(). IIRC we tested some micro-benchmarks on this for Mercurial output (where it was a significant part of the profile for some commands). That was on Python 2, of course; it may be better in io.StringIO and/or Python 3.
>
> Turns out you're right (list.append must have gotten a lot faster since I last tested this years ago, or I simply misremembered the results).
> > python2.6 teststringbuilding.py array cstringio listappend
> Running test array ...
>  669.68 ms
> Running test cstringio ...
>  563.95 ms
> Running test listappend ...
>  389.22 ms
>
> > python2.7 teststringbuilding.py array cstringio listappend
> Running test array ...
>  775.32 ms
> Running test cstringio ...
>  679.88 ms
> Running test listappend ...
>  375.19 ms

Converting your code straight to bytes (so array still works) yields this on Python 3.2.1:

> python3.2 timetest.py io array listappend
Running test io ...
 334.03 ms
Running test array ...
 776.66 ms
Running test listappend ...
 314.90 ms

For string (excluding array):

> python3.2 timetest.py io listappend
Running test io ...
 451.45 ms
Running test listappend ...
 356.39 ms

From mal at egenix.com  Mon Aug 29 12:25:45 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 29 Aug 2011 12:25:45 +0200
Subject: [Python-ideas] Create a StringBuilder class and use it everywhere
In-Reply-To: <2D345575-A072-4ABE-BBC7-1C79FD538F39@masklinn.net>
References: <379591314264494@web30.yandex.ru> <4E5619D3.5050809@egenix.com> <4E5B5B7B.7000903@egenix.com> <2D345575-A072-4ABE-BBC7-1C79FD538F39@masklinn.net>
Message-ID: <4E5B6929.8020605@egenix.com>

Masklinn wrote:
> On 2011-08-29, at 11:27 , M.-A. Lemburg wrote:
>> Turns out you're right (list.append must have gotten a lot faster since I last tested this years ago, or I simply misremembered the results).
>
> Converting your code straight to bytes (so array still works) yields this on Python 3.2.1:
>
> > python3.2 timetest.py io array listappend
> Running test io ...
>  334.03 ms
> Running test array ...
>  776.66 ms
> Running test listappend ...
>  314.90 ms
>
> For string (excluding array):
>
> > python3.2 timetest.py io listappend
> Running test io ...
>  451.45 ms
> Running test listappend ...
>  356.39 ms

Unicode works with the array module as well. Just use 'u' as the array code and replace fromstring/tostring with fromunicode/tounicode.

In any case, the array module approach appears to be the slowest of all three tests.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 29 2011)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2011-10-04: PyCon DE 2011, Leipzig, Germany                36 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::

eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

From ron3200 at gmail.com  Mon Aug 29 00:05:03 2011
From: ron3200 at gmail.com (ron3200)
Date: Sun, 28 Aug 2011 17:05:03 -0500
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To:
References:
Message-ID: <1314569103.4640.19.camel@Gutsy>

On Sun, 2011-08-28 at 10:10 -0700, Guido van Rossum wrote:
> The main difference between experimental and regular status in the stdlib is that for experimental modules we reserve the right to make incompatible changes in subsequent versions (possibly even in bugfix releases, although that's worth a separate discussion).
>
> An experimental feature in the stdlib also signals a certain commitment from the core developers. We shouldn't use experimental as a dumping ground for random stuff; that's what PyPI is for. We should use experimental within the context of the stdlib.

It looks to me that importing from __future__ can do things that a regular import can't do, like add or change core syntax. So they tend to be Python core features rather than library features.

The term __experimental__ may have differing expectations for different people and may be a little too broad. I like __future__ for *early* core changes, and __lib_future__ or __stdlib_future__ for early stdlib features.

I describe them as "early" because they either aren't quite ready for the current release cycle or they are waiting for something else to be deprecated or done before they can be included.

I think it makes sense to split core and library future features this way because usually, future core features are built into the core, while a future standard library module can be imported from disk.

Cheers,
Ron
From ncoghlan at gmail.com  Mon Aug 29 13:14:08 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 29 Aug 2011 21:14:08 +1000
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To: <1314569103.4640.19.camel@Gutsy>
References: <1314569103.4640.19.camel@Gutsy>
Message-ID:

On Mon, Aug 29, 2011 at 8:05 AM, ron3200 wrote:
> It looks to me that importing from __future__ can do things that a regular import can't do, like add or change core syntax. So they tend to be Python core features rather than library features.

If we want to, we can make __experimental__ just as magical as __future__. That's orthogonal to its semantics in the library case, which is the point of interest at the moment.

If people fear relying on experimental features, that's OK - if this idea happens at all, allowing people that aren't prepared to cope with the risk of breakage to easily avoid it would be a key benefit.

The essential idea here is to be able to add a feature, but flag the API as potentially unstable for an entire release cycle before promising to maintain the API in perpetuity. Early adopters would get *most* of python-dev's usual guarantees (which most PyPI packages don't offer), while more conservative types can use more stable alternatives. The benefit on our side is that we'd get a full 18-24 months of feedback on a feature from a broad audience before finally locking the API down in the subsequent release.

The more I think about this idea, the more I think it could help alleviate some of the concerns that can currently hold up the evolution of the standard library. (The costs of making a mistake with standard library API design are rather high due to the backwards compatibility promise, which means iterative design mostly needs to happen *before* the module is included. PyPI and the PEP process are both useful for that, but they don't really compare to the feedback available from being part of a full standard lib release.)

We'd still need to be careful not to throw any old rubbish in there, but as a phased introduction for things like "from __experimental__ import ipaddr" (PEP 3144) and "from __experimental__ import re" (re replacement with regex, assuming someone steps up to write the PEP and an eventual drop-in replacement looks feasible), the idea definitely has potential.

> I describe them as "early" because they either aren't quite ready for the current release cycle or they are waiting for something else to be deprecated or done before they can be included.

No, that's *not* what __future__ means. __future__ changes are exactly what will become default behaviour in a future version - they aren't going to change, so code that uses them won't need to change. What may need changing in the case of __future__ is *old* code - the only reason we use a __future__ flag is when old code might break.

The semantics of the new marker package would be to indicate that stuff is a little undercooked, but we've decided that it won't get enough exposure through PyPI (either through being too small to overcome NIH syndrome or else too closely coupled to a specific version of the interpreter). That's a completely different meaning, so we shouldn't reuse the same word.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
From solipsis at pitrou.net  Mon Aug 29 14:40:01 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 29 Aug 2011 14:40:01 +0200
Subject: [Python-ideas] Create a StringBuilder class and use it everywhere
References: <379591314264494@web30.yandex.ru> <4E5619D3.5050809@egenix.com> <4E5B5B7B.7000903@egenix.com>
Message-ID: <20110829144001.4eb269a9@pitrou.net>

On Mon, 29 Aug 2011 11:27:23 +0200
"M.-A. Lemburg" wrote:
> Dirkjan Ochtman wrote:
>> On Thu, Aug 25, 2011 at 11:45, M.-A. Lemburg wrote:
>>> I think you should use cStringIO in your class implementation. The list + join idiom is nice, but it has the disadvantage of creating and keeping alive many small string objects (with all the memory overhead and fragmentation that goes along with it).
>>
>> AFAIK using cStringIO just for string building is much slower than using list.append() + join(). IIRC we tested some micro-benchmarks on this for Mercurial output (where it was a significant part of the profile for some commands). That was on Python 2, of course; it may be better in io.StringIO and/or Python 3.
>
> Turns out you're right (list.append must have gotten a lot faster since I last tested this years ago, or I simply misremembered the results).

The join() idiom only does one big copy at the end, while the StringIO/BytesIO idiom copies at every resize (unless the memory allocator is very smart). Both are O(N), but the join() version makes fewer copies and (re)allocations.

(there are also the list resizings, but that object is much smaller)

Regards

Antoine.

From k.bx at ya.ru  Mon Aug 29 18:04:21 2011
From: k.bx at ya.ru (k.bx at ya.ru)
Date: Mon, 29 Aug 2011 19:04:21 +0300
Subject: [Python-ideas] Create a StringBuilder class and use it everywhere
In-Reply-To: <20110829144001.4eb269a9@pitrou.net>
Message-ID: <191401314633861@web152.yandex.ru>

29.08.11, 15:43, "Antoine Pitrou":

> The join() idiom only does one big copy at the end, while the StringIO/BytesIO idiom copies at every resize (unless the memory allocator is very smart). Both are O(N), but the join() version makes fewer copies and (re)allocations.
>
> (there are also the list resizings, but that object is much smaller)
>
> Regards
>
> Antoine.

Ok, so I think the best approach would be to implement it via join + [], but flush every 1000 ops, since that can save memory.
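Something like this, say (just a sketch of the periodic-flush variant; the threshold of 1000 is arbitrary and the chunk producer is a stand-in):

    def produce_chunks():
        # stand-in for whatever generates the string fragments
        for i in xrange(100000):
            yield u"more data"

    parts = []
    for chunk in produce_chunks():
        parts.append(chunk)
        if len(parts) >= 1000:
            # flush: collapse the fragments so at most ~1000 small
            # string objects are alive at once, at the cost of some
            # extra copying
            parts = [u"".join(parts)]
    result = u"".join(parts)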
As for the whole idea -- I still think that creating something like this and adding it to the stdlib (with __iadd__ and .append() in the API, so that switching existing code over only means changing one line, like doing b = StringBuilder(u"Foo")) and documenting that would be super-cool.

So who has the last word on this?

From solipsis at pitrou.net  Mon Aug 29 18:10:11 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 29 Aug 2011 18:10:11 +0200
Subject: [Python-ideas] Create a StringBuilder class and use it everywhere
In-Reply-To: <191401314633861@web152.yandex.ru>
References: <191401314633861@web152.yandex.ru>
Message-ID: <1314634211.3551.8.camel@localhost.localdomain>

Le lundi 29 août 2011 à 19:04 +0300, k.bx at ya.ru a écrit :
> Ok, so I think the best approach would be to implement it via join + [], but flush every 1000 ops, since that can save memory.

That approach (or a similar one) could actually be integrated into StringIO and BytesIO. As long as you only write() at the end of the in-memory object, there's no need to actually concatenate.

And it would be much easier (and have less impact on C extension code) to implement that approach in the StringIO and BytesIO objects than in the bytes and str types as Larry did.

Regards

Antoine.

From ron3200 at gmail.com  Mon Aug 29 20:01:34 2011
From: ron3200 at gmail.com (ron3200)
Date: Mon, 29 Aug 2011 13:01:34 -0500
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To:
References: <1314569103.4640.19.camel@Gutsy>
Message-ID: <1314640894.9902.39.camel@Gutsy>

On Mon, 2011-08-29 at 21:14 +1000, Nick Coghlan wrote:
> The essential idea here is to be able to add a feature, but flag the API as potentially unstable for an entire release cycle before promising to maintain the API in perpetuity. Early adopters would get *most* of python-dev's usual guarantees (which most PyPI packages don't offer), while more conservative types can use more stable alternatives.

> We'd still need to be careful not to throw any old rubbish in there, but as a phased introduction for things like "from __experimental__ import ipaddr" (PEP 3144) and "from __experimental__ import re" (re replacement with regex, assuming someone steps up to write the PEP and an eventual drop-in replacement looks feasible), the idea definitely has potential.

> The semantics of the new marker package would be to indicate that stuff is a little undercooked, but we've decided that it won't get enough exposure through PyPI (either through being too small to overcome NIH syndrome or else too closely coupled to a specific version of the interpreter). That's a completely different meaning, so we shouldn't reuse the same word.

From the descriptions in this thread, it sounds like __experimental__ items will be pretty far along the way to inclusion. And maybe the only thing experimental about them is a few fine details that might be changed before release (but probably won't be).

How often do you think things in __experimental__ will be aborted or canceled?

Another way to think about "experimental" items is to use the word in the way that science does, i.e. an experiment designed to discover information, or test a very specific idea or theory. In that context, depending on the results of the experiment, a particular feature may be changed to take the results of the experiment into account.

That's probably a bit too formal and restrictive, and it will be difficult to do in a very controlled way.
The Python community, and Python itself, tends to thrive in a more relaxed and informal atmosphere. But it may not hurt to have some guidelines on how to do a Python "__experimental__" experiment so that it doesn't end up being a series of trial-and-error attempts at something half-baked.

Which do you think fits better with what you have in mind?

__experimental__
__pre-view__

Cheers,
Ron

From ethan at stoneleaf.us  Mon Aug 29 23:16:13 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 29 Aug 2011 14:16:13 -0700
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To: <1314640894.9902.39.camel@Gutsy>
References: <1314569103.4640.19.camel@Gutsy> <1314640894.9902.39.camel@Gutsy>
Message-ID: <4E5C019D.8080509@stoneleaf.us>

ron3200 wrote:
> From the descriptions in this thread, it sounds like __experimental__ items will be pretty far along the way to inclusion. And maybe the only thing experimental about them is a few fine details that might be changed before release (but probably won't be).
>
> How often do you think things in __experimental__ will be aborted or canceled?

Rarely, I would think.

> Another way to think about "experimental" items is to use the word in the way that science does, i.e. an experiment designed to discover information, or test a very specific idea or theory. In that context, depending on the results of the experiment, a particular feature may be changed to take the results of the experiment into account.

I think that's exactly how it is being thought of -- the experimental bit being primarily the API.

~Ethan~

From guido at python.org  Tue Aug 30 02:00:15 2011
From: guido at python.org (Guido van Rossum)
Date: Mon, 29 Aug 2011 17:00:15 -0700
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To:
References: <87obz9kntk.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

On Sun, Aug 28, 2011 at 7:53 PM, Nick Coghlan wrote:

[Lots of stuff I agree with, and then]

> Here's another potentially useful litmus test question for Guido: if it was spelled "from __experimental__ import ipaddr", would you be more inclined to approve PEP 3144 (IP address library) for 3.3?

IIRC the issue with that PEP is that there is a stand-off between two authors of competing implementations, and I don't have enough understanding of the issues to be able to tell who's right. (Or maybe they're both right and they're just aiming at different audiences -- but I can't even tell that.)

I think we need to have someone who cares more (but is not either of those authors) to review the PEP and its criticism and decide on a way forward.

(Or am I being too soft? If nobody cares I'd be happy to toss a coin.)

--
--Guido van Rossum (python.org/~guido)

From solipsis at pitrou.net  Tue Aug 30 02:58:59 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 30 Aug 2011 02:58:59 +0200
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
References: <87obz9kntk.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20110830025859.03ab7f1d@pitrou.net>

On Mon, 29 Aug 2011 17:00:15 -0700
Guido van Rossum wrote:
> On Sun, Aug 28, 2011 at 7:53 PM, Nick Coghlan wrote:
>
> [Lots of stuff I agree with, and then]
>
>> Here's another potentially useful litmus test question for Guido: if it was spelled "from __experimental__ import ipaddr", would you be more inclined to approve PEP 3144 (IP address library) for 3.3?
>
> IIRC the issue with that PEP is that there is a stand-off between two authors of competing implementations, and I don't have enough understanding of the issues to be able to tell who's right. (Or maybe they're both right and they're just aiming at different audiences -- but I can't even tell that.)

That's my understanding as well. Also, I seem to remember at least one of the two implementations was criticized on some of its design decisions (was it the fact that networks and addresses used a common class? sorry if I'm making things up here), but the author didn't want to change the API.

Regards

Antoine.

From bruce at leapyear.org  Tue Aug 30 03:22:11 2011
From: bruce at leapyear.org (Bruce Leban)
Date: Mon, 29 Aug 2011 18:22:11 -0700
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To:
References: <1314569103.4640.19.camel@Gutsy>
Message-ID:

On Mon, Aug 29, 2011 at 4:14 AM, Nick Coghlan wrote:
>
> The essential idea here is to be able to add a feature, but flag the API as potentially unstable for an entire release cycle before promising to maintain the API in perpetuity.
>
> No, that's *not* what __future__ means. __future__ changes are exactly what will become default behaviour in a future version - they aren't going to change, so code that uses them won't need to change. What may need changing in the case of __future__ is *old* code - the only reason we use a __future__ flag is when old code might break.
>
> The semantics of the new marker package would be to indicate that stuff is a little undercooked, but we've decided that it won't get enough exposure through PyPI (either through being too small to overcome NIH syndrome or else too closely coupled to a specific version of the interpreter). That's a completely different meaning, so we shouldn't reuse the same word.

Unlike __future__, the API is unstable, and therefore code that works with today's __experimental__ may fail tomorrow. If the API changes in an incompatible way, we would probably prefer that the import fail rather than succeed with random results. Therefore, I would propose a mechanism to support this, perhaps something like

    from __experimental__ import regex {2,3,4} as re

which means import the regex module version 2, 3 or 4. A later version will be imported only if it is an expansion, i.e., compatible with all previous APIs and differing only in containing new features. If the version set is omitted from the import it's treated as {1}, so that dealing with this is only necessary for those modules which do change in incompatible ways.

In general, checking for specific capabilities rather than version numbers is more robust, but in this case, perhaps multiple versions of experimental APIs will be discouraged, making that less necessary. But we could allow strings specifying required features in the version set.

Note that the reason for allowing import with more than one version number is to handle the case where a new version is available but the differences are in parts of the API that don't impact the application.
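(If new syntax turned out to be a non-starter, much the same effect could be had with a small runtime helper -- a rough sketch only, where both the __experimental__ package and the __api_version__ module attribute are hypothetical spellings, not anything that exists today:)

    import importlib

    def import_experimental(name, versions):
        mod = importlib.import_module('__experimental__.' + name)
        found = getattr(mod, '__api_version__', 1)   # hypothetical attribute
        if found not in versions:
            raise ImportError('%s provides API version %r, need one of %r'
                              % (name, found, sorted(versions)))
        return mod

    re = import_experimental('regex', {2, 3, 4})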
Note that the reason for allowing import with more than one version number is to handle the case when a new version is available but the differences are in parts of the API that don't impact the application. --- Bruce Follow me: http://www.twitter.com/Vroo http://www.vroospeak.com From ncoghlan at gmail.com Tue Aug 30 04:09:50 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 30 Aug 2011 12:09:50 +1000 Subject: [Python-ideas] IP addressing library, aka reviving PEP 3144 (Re: Add from __experimental__ import bla) Message-ID: On Tue, Aug 30, 2011 at 10:00 AM, Guido van Rossum wrote: > On Sun, Aug 28, 2011 at 7:53 PM, Nick Coghlan wrote: > > [Lots of stuff I agree with, and then] > >> Here's another potentially useful litmus test question for Guido: if >> it was spelled "from __experimental__ import ipaddr", would you be >> more inclined to approve PEP 3144 (IP address library) for 3.3? > > IIRC the issue with that PEP is that there is a stand-off between two > authors of competing implementations, and I don't have enough > understanding of the issues to be able to tell who's right. (Or maybe > they're both right and they're just aiming at different audiences -- > but I can't even tell that.) In reviewing the situation, that was actually the objection to the 3.1 incarnation of the ipaddr module (which was in SVN for a while before being reverted). Several changes were made to ipaddr before it was proposed for inclusion in 3.2 that largely resolved those objections (primarily, the 2.x series of ipaddr-py makes a much stronger distinction between networks and hosts: http://mail.python.org/pipermail/python-dev/2009-September/092384.html) The netaddr folks have a nice page (albeit a couple of years old now) listing some of the many incarnations of this particular wheel, so it's clearly base functionality worth providing as an included battery: https://code.google.com/p/netaddr/wiki/YetAnotherPythonIPModule > I think we need to have someone who cares more (but is not either of > those authors) to review the PEP and its criticism and decide on a way > forward. > > (Or am I being too soft? If nobody cares I'd be happy to toss a coin.) This request inspired me to go back and look at those old discussions. This comment from Scott Dial seems to summarise the core of the concern with the PEP 3144 version of the ipaddr API: http://mail.python.org/pipermail/python-dev/2009-September/091713.html RDM elaborated further here: http://mail.python.org/pipermail/python-dev/2009-September/092262.html It also turns out ipaddr is completely lacking in explicit prose documentation, so it would at least need that before it could be included. Such documentation would also be a useful adjunct to the PEP, since it would focus on what the module is like to *use* rather than how it is built. As near as I can tell, the core objection (the fact that IP network objects aren't normalised on creation, making their behaviour thoroughly surprising to anyone that actually understands the differences between IP addresses, IP networks, IP interfaces and IP hosts) remains valid, so I withdraw my suggestion that the current API should be added even as experimental code. I had forgotten about the identified problems with ipaddr until rereading the old thread reminded me of them. I believe the PEP would be significantly more palatable with the following changes/additions:
1. Draft ReStructuredText documentation for inclusion in the stdlib docs
2.
Removal of the "ip" attribute of IP network objects (since it makes the nominal "networks" behave like IP interface definitions) 3. "network" property renamed to "netaddr" (since it returns an address object rather than a network object) 4. "strict" parameter removed from class signatures, replaced with class method for non-strict behaviour 5. Factory functions renamed so they don't look like class names (ip_network, ip_address, ip) 6. "strict" parameter on factory functions modified to default to True rather than False 7. Addition of an explicit "IPInterface" class to cover the association of an address with a specific network that is currently handled by storing arbitrary addresses on IP network objects IIRC, this is basically the point we reached last time, but Peter either wasn't interested in contributing/maintaining the module on those terms or else got sidetracked by other things. If he's no longer interested in contributing the module or not willing to implement any changes to address the concerns, it would be good to have an explicit statement to that effect, then we can mark the PEP as rejected and leave the field open to other proposals. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Tue Aug 30 05:03:10 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 30 Aug 2011 13:03:10 +1000 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] In-Reply-To: References: <1314569103.4640.19.camel@Gutsy> Message-ID: On Tue, Aug 30, 2011 at 11:22 AM, Bruce Leban wrote: > Unlike __future__ the API is unstable and therefore code that works with > today's __experimental__ may fail tomorrow. If the API changes in an > incompatible way, we would probably prefer that the import fail rather than > succeed with random results.?Therefore, I would propose a mechanism to > support this, perhaps something like > > from __experimental__ import regex {2,3,4} as re I'd advise against trying to overengineer this. If we decide to go down the path of breaking imports, then we'd probably just do it via naming conventions (e.g. give all experimental modules a suffix based on the Python version where they were first included). However, if you're concerned about subtle breakage due to backwards incompatibilities, then that's a good sign that depending on explicitly experimental modules is a *bad idea*. Adding a package specifically for candidate standard library modules with APIs that aren't yet locked in would be easy and low impact. Messing with the compiler or interpreter to provide additional features would be much higher impact, and consequently a much harder sell. However, you've highlighted the major problem with the idea: just because we *say* that the package is experimental and APIs may change without (programmatic) warning, doesn't mean people will pay attention. There are also a whole host of serialisation problems that arise when dealing with relocated modules, which could cause issues with the eventual migration to the "real" module location. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From anacrolix at gmail.com Tue Aug 30 06:20:13 2011 From: anacrolix at gmail.com (Matt Joiner) Date: Tue, 30 Aug 2011 14:20:13 +1000 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] 
In-Reply-To: References: <1314569103.4640.19.camel@Gutsy> Message-ID: This feature reminds me of "staging" in the Linux kernel. I often hear complaints that Python is bringing too many features on board; having to explain __experimental__ plus {major, minor, patch} version sets does not seem desirable. Then there's the consideration that any and all stuff floating around in __experimental__ will be overly encouraged, because honestly how many modules won't end up making the cut? Why can't a special tag/section be added in PyPI to indicate that a module is being considered for inclusion in future versions of Python? After all, we're all friends here. This might also give the opportunity to showcase some of the newer packaging infrastructure, and how easy it makes it to install third-party modules. (Disclaimer: I haven't looked at the new one and I detest the old one.) -1 for adding yet more clutter Also -1 for renaming re: some people call regexes "regexps"; you'll only be offending their sensibilities and adding to the confusion. Furthermore, a lot of Python's competitors have built-in regex support, such as =~ in Perl and bash. The terser the syntax for regex, the more at home and less cluttered it will appear to those users. I recommend that if a new regex module is created, that the name regex be used then, and the old be left alone. On Tue, Aug 30, 2011 at 1:03 PM, Nick Coghlan wrote: > On Tue, Aug 30, 2011 at 11:22 AM, Bruce Leban wrote: >> Unlike __future__, the API is unstable and therefore code that works with >> today's __experimental__ may fail tomorrow. If the API changes in an >> incompatible way, we would probably prefer that the import fail rather than >> succeed with random results. Therefore, I would propose a mechanism to >> support this, perhaps something like >> >> from __experimental__ import regex {2,3,4} as re > > I'd advise against trying to overengineer this. If we decide to go > down the path of breaking imports, then we'd probably just do it via > naming conventions (e.g. give all experimental modules a suffix based > on the Python version where they were first included). > > However, if you're concerned about subtle breakage due to backwards > incompatibilities, then that's a good sign that depending on > explicitly experimental modules is a *bad idea*. > > Adding a package specifically for candidate standard library modules > with APIs that aren't yet locked in would be easy and low impact. > Messing with the compiler or interpreter to provide additional > features would be much higher impact, and consequently a much harder > sell. > > However, you've highlighted the major problem with the idea: just > because we *say* that the package is experimental and APIs may change > without (programmatic) warning, doesn't mean people will pay > attention. There are also a whole host of serialisation problems that > arise when dealing with relocated modules, which could cause issues > with the eventual migration to the "real" module location. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From ericsnowcurrently at gmail.com Tue Aug 30 08:10:23 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 30 Aug 2011 00:10:23 -0600 Subject: [Python-ideas] Access to function objects In-Reply-To: References: <4E3E2445.2070403@pearwood.info> Message-ID: On Sun, Aug 7, 2011 at 6:10 AM, Guido van Rossum wrote: > But rather than a prolonged discussion of the > merits and use cases, I strongly recommend that somebody tries to come > up with a working implementation and we'll strengthen the PEP from > there. > > -- > --Guido van Rossum (python.org/~guido) > Here's a stab at a (relatively) simple approach to the function portion: http://bugs.python.org/issue12857 However, this does not expose __function__ in the locals. Instead it puts f_func in the frame object. Thus it provides the functionality, but doesn't invite abuse. And whether you like it or hate it, it's exactly what I've been looking for. -eric p.s. I tried doing it with a closure and, while it worked, I didn't like it. I will say that I came away with a _much_ better understanding of the CPython implementation (and gdb :). I feel much better about the f_func solution, particularly since we're really looking for the function associated with a frame. Trying to artificially force __function__ into the frame's locals just didn't feel right. From ncoghlan at gmail.com Tue Aug 30 10:25:51 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 30 Aug 2011 18:25:51 +1000 Subject: [Python-ideas] Access to function objects In-Reply-To: References: <4E3E2445.2070403@pearwood.info> Message-ID: On Tue, Aug 30, 2011 at 4:10 PM, Eric Snow wrote: > Here's a stab at a (relatively) simple approach to the function portion: > > http://bugs.python.org/issue12857 > > However, this does not expose __function__ in the locals. Instead it > puts f_func in the frame object. Thus it provides the functionality, > but doesn't invite abuse. And whether you like it or hate it, it's > exactly what I've been looking for. That's actually quite an interesting idea, although I'm wondering if it could make some of our extant reference cycle issues associated with stack traces even worse. The fact that the patch is so simple is certainly rather appealing (although you have a few backwards compatibility issues to address, as I noted in my review). Also, I wouldn't be quite so quick to discard the function information in the class evaluation case. While the function involved there is a temporary one, it's still a real function. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ron3200 at gmail.com Tue Aug 30 05:02:03 2011 From: ron3200 at gmail.com (ron3200) Date: Mon, 29 Aug 2011 22:02:03 -0500 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] In-Reply-To: References: <1314569103.4640.19.camel@Gutsy> Message-ID: <1314673323.4733.66.camel@Gutsy> On Mon, 2011-08-29 at 18:22 -0700, Bruce Leban wrote: > > Unlike __future__, the API is unstable and therefore code that works > with today's __experimental__ may fail tomorrow. If the API changes in > an incompatible way, we would probably prefer that the import fail > rather than succeed with random results.
> Therefore, I would propose a mechanism to support this, perhaps something like > > from __experimental__ import regex {2,3,4} as re > > which means: import the regex module at API version 2, 3, or 4. A later version > will be imported only if it is an expansion, i.e., compatible with all > previous APIs and differing only in containing new features. If the > version set is omitted from the import, it's treated as {1}, so that > dealing with this is only necessary for those modules which do change > in incompatible ways. > > In general, checking for specific capabilities rather than version > numbers is more robust, but in this case multiple versions of > experimental APIs will perhaps be discouraged, making that less necessary. But > we could allow strings specifying required features in the version > set. Note that the reason for allowing import with more than one > version number is to handle the case when a new version is available > but the differences are in parts of the API that don't impact the > application. Encouraging limited use of __experimental__ may be good. If it looks like there are going to be more than two or three different combinations, then it is likely too soon to even be in __experimental__. It probably needs more thought still. One possibility would be to have experimental features be of the form, which is better? (a or b). Then you could try things out and save timings and test results in a fairly straightforward way, something like:
results = {}
for x in 'ab':
    from __experimental__[x] import regex as re  # hypothetical syntax
    results[x] = time_tests(re)  # time_tests() stands in for whatever you measure
for x in 'ab':
    print(results[x])
I can think of a few issues with this. Like how long do things live in __experimental__? Can the letters be reused, or should new letters be used each time? Is "from __experimental__[x] import feature" better or worse than "from __experimental__ import feature[x]"? And do subscripts even work in import statements? Then again, maybe version #'s would work better? Having a way to ask __experimental__ what it has and use that to do the importing could make doing some things easier. Cheers, Ron From peio.borthelle at gmail.com Tue Aug 30 15:20:49 2011 From: peio.borthelle at gmail.com (Peio Borthelle) Date: Tue, 30 Aug 2011 15:20:49 +0200 Subject: [Python-ideas] aliasing Message-ID: Hi, First, thank you to all the development community for this fabulous language (I'm French so my English is a bit...bad and basic). I'm quite new to programming and Python is my first language. As a beginner I had problems to deal with all the aliasing stuff (in-place changes or not...). So my suggestion is perhaps not to change this (although I find it a bit opposite to the Python sense...) but to have a real aliasing function: ----------------------------- >>> a = 2 >>> b = alias("a") >>> a = 'foo' >>> b 'foo' ----------------------------- b is always a; it doesn't point to the same data, it points to the pointer a itself! The arg is a string because otherwise the interpreter would give the value as arg, not the pointer. It could also be more complex: ----------------------------- >>> a = 3 >>> b = alias("(a*3)+2") >>> b 11 >>> a = 5 >>> b 17 ----------------------------- Here I alias an expression (that's also why the arg must be a string). But I don't know if this last example is something good, because a would have to stay a number (or calling b would generate an exception). If a is destroyed, then calling b would generate an exception...I don't have enough experience in Python programming to really know how the garbage collector works, so I don't know if a could be destroyed by it.
This is the end of my suggestion (I hope it wasn't already proposed, else...). Best regards, Peio From herman at swebpage.com Tue Aug 30 15:51:02 2011 From: herman at swebpage.com (Herman Sheremetyev) Date: Tue, 30 Aug 2011 22:51:02 +0900 Subject: [Python-ideas] aliasing In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 10:20 PM, Peio Borthelle wrote: > Hi, > First, thank you to all the development community for this fabulous language > (I'm French so my English is a bit...bad and basic). I'm quite new to > programming and Python is my first language. > As a beginner I had problems to deal with all the aliasing stuff (in-place changes > or not...). So my suggestion is perhaps not to change this (although I find > it a bit opposite to the Python sense...) but to have a real aliasing > function: > -----------------------------
> >>> a = 2
> >>> b = alias("a")
b = lambda: a
> >>> a = 'foo'
> >>> b
b()
> 'foo'
> -----------------------------
> b is always a; it doesn't point to the same data, it points to the pointer a > itself! The arg is a string because otherwise the interpreter would give > the value as arg, not the pointer. > It could also be more complex: > -----------------------------
> >>> a = 3
> >>> b = alias("(a*3)+2")
b = lambda: a*3 + 2
> >>> b
b()
> 11
> >>> a = 5
> >>> b
b()
How's that? ;) From ghostwriter402 at gmail.com Tue Aug 30 17:46:54 2011 From: ghostwriter402 at gmail.com (Spectral One) Date: Tue, 30 Aug 2011 10:46:54 -0500 Subject: [Python-ideas] Expanding statistical functions in Python's std. lib. In-Reply-To: <4E56E859.3090504@canterbury.ac.nz> References: <549901314286114@web119.yandex.ru> <4E56E859.3090504@canterbury.ac.nz> Message-ID: <4E5D05EE.2050502@gmail.com> Wandering about, looking up statistics info for a program I was writing, I found a recommendation to add various useful 'special functions' to C's math library: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1069.pdf The arguments in that paper make a lot of sense to me, and apply well to Python. They came up with a good list, IMnsHO. I'd recommend implementing this list in some form as library functions in Python. Blindly copying wouldn't end up particularly 'Pythonic'; tweaking the API is required. Some of the selection choices, such as returning real only, ought to be reevaluated, for example. Obviously, any of the decisions to keep things C-like rather than object-oriented ought to shift, as well. Function names are only important as far as they are clear. I suggest naming per e.g. distribution_t(), or dist_F(), and include modification for algebraic order, as well, so gamma() and log_gamma(). That said, anything clear is fine. Thoughts on the matter? I noticed that the math library in 2.7+ already added the gamma and log(gamma) functions, which was nice. Obviously, most, if not all, are already present in extension modules such as NumPy, but there is value in having these things built into the language. "Batteries included," and all that. By the by, if that is far too much for one suggestion, then please just treat this as a suggestion to add just the incomplete beta function. (P-values for binomial, F, and t are all nice, too, though with inc. beta, they aren't terrible to generate. I really think they should be included in the standard library.)
-Nate From guido at python.org Tue Aug 30 18:47:54 2011 From: guido at python.org (Guido van Rossum) Date: Tue, 30 Aug 2011 09:47:54 -0700 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] In-Reply-To: <1314673323.4733.66.camel@Gutsy> References: <1314569103.4640.19.camel@Gutsy> <1314673323.4733.66.camel@Gutsy> Message-ID: I'm sorry, the idea was never to add extra syntax to "from __experimental__". It was supposed to be just a plain import -- in fact plainer than "from __future__". There is hardly a need to use the double underscores; "from experimental" should work just as well. Why? It wasn't meant to be a language change. Just a convention to set aside an experimental namespace in the stdlib. If we're going to attempt to solve the problem of versioned dependencies, I recommend doing it outside the language syntax, at the package installer level, sort of the way setuptools / distribute do it. If you want to explore that idea, please read up on how it's done there and if you have ideas for improvement, start a new thread here. -- --Guido van Rossum (python.org/~guido) From ron3200 at gmail.com Tue Aug 30 23:55:46 2011 From: ron3200 at gmail.com (ron3200) Date: Tue, 30 Aug 2011 16:55:46 -0500 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] In-Reply-To: References: <1314569103.4640.19.camel@Gutsy> <1314673323.4733.66.camel@Gutsy> Message-ID: <1314741346.8261.11.camel@Gutsy> On Tue, 2011-08-30 at 09:47 -0700, Guido van Rossum wrote: > I'm sorry, the idea was never to add extra syntax to "from > __experimental__". It was supposed to be just a plain import -- in > fact plainer than "from __future__". There is hardly a need to use the > double underscores; "from experimental" should work just as well. > Why? It wasn't meant to be a language change. Just a convention to set > aside an experimental namespace in the stdlib. That sounds fine. I think if the release manager has final say on the content of experimental (along with you, of course), the rest will probably take care of itself. > If we're going to > attempt to solve the problem of versioned dependencies, I recommend > doing it outside the language syntax, at the package installer level, > sort of the way setuptools / distribute do it. If you want to explore > that idea, please read up on how it's done there and if you have ideas > for improvement, start a new thread here. Hopefully it won't ever get so large or complicated that that is needed. Cheers, Ron From wuwei23 at gmail.com Wed Aug 31 04:03:53 2011 From: wuwei23 at gmail.com (alex23) Date: Tue, 30 Aug 2011 19:03:53 -0700 (PDT) Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] In-Reply-To: References: <1314569103.4640.19.camel@Gutsy> Message-ID: <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> On Aug 30, 2:20 pm, Matt Joiner wrote: > Why can't a special tag/section be added in PyPI to indicate that a > module is being considered for inclusion in future versions of Python, > after all, we're all friends here. +1 There was talk of making the standard library standalone. I think having a similar metapackage for experimental modules would be a more elegant way of achieving this.
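Such a metapackage could be as thin as the following setup.py sketch (the project name and module list here are hypothetical; only setup() and install_requires are standard setuptools):

# A minimal sketch of the metapackage idea: it ships nothing itself
# and just pulls the candidate modules in from PyPI.
from setuptools import setup

setup(
    name='python-experimental',
    version='3.3.0a1',
    description='Candidate stdlib modules whose APIs are not yet frozen',
    install_requires=['regex', 'ipaddr'],  # whatever the current candidates are
)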
From ncoghlan at gmail.com Wed Aug 31 04:45:19 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 31 Aug 2011 12:45:19 +1000 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] In-Reply-To: <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> References: <1314569103.4640.19.camel@Gutsy> <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> Message-ID: On Wed, Aug 31, 2011 at 12:03 PM, alex23 wrote: > There was talk of making the standard library standalone. I think > having a similar metapackage for experimental modules would be a more > elegant way of achieving this. The key benefit lies in ensuring that anything in the mooted experimental namespace is clearly under python-dev's aegis from at least the following perspectives: - licensing (i.e. redistributed by the PSF under a Contributor Licensing Agreement) - testing (i.e. the module test suites are run on the python.org buildbot fleet and results published via http://www.python.org/dev/buildbot) - issue management (i.e. bugs and feature requests are handled on http://bugs.python.org) - source control (i.e. the master repository for the software is published on http://hg.python.org) Those are the things that will allow the experimental modules to be used under existing legal approvals that allow the use of Python itself (e.g. in a corporate or governmental environment). Handling the actual packaging for distribution is a somewhat orthogonal question. Whether we offer 3 installers (CPython core, standard library, experimental), 2 installers (CPython core + stdlib, experimental) or 1 monolithic installer, or a combination of those approaches, doesn't actually change the surrounding legal framework all that much. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stephen at xemacs.org Wed Aug 31 05:23:51 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 31 Aug 2011 12:23:51 +0900 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] In-Reply-To: <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> References: <1314569103.4640.19.camel@Gutsy> <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> Message-ID: <871uw2l0vs.fsf@uwakimon.sk.tsukuba.ac.jp> alex23 writes: > On Aug 30, 2:20 pm, Matt Joiner wrote: > > Why can't a special tag/section be added in PyPI to indicate that a > > module is being considered for inclusion in future versions of Python, > > after all, we're all friends here. > > +1 > > There was talk of making the standard library standalone. I think > having a similar metapackage for experimental modules would be a more > elegant way of achieving this. "Although practicality beats purity." As already mentioned several times in this thread, PyPI, Bitbucket, and hg.python.org sandboxes as-is provide a plenty good technical solution for distribution of experimental code which is almost certain to be included in the core distribution at some point, but currently the API bikeshed is being painted (well, it will be if we can only come to consensus on color!) The problem to be solved is on the other side of the network connection, internal to the *using organizations*. Some of them have much stricter rules for adding new "approved" packages than for upgrading existing ones.
In that case, developers "inside" get much freer access to "official experimental" modules, and can participate in development (including objecting to API changes they consider gratuitous :-) in a way that is hard for them to justify if *before* dealing with any API changes they have to run an internal obstacle course just to be allowed to use the code. As I understand Guido's position, "experimental" is a non-starter unless it can be expected to significantly increase beta tests of such modules by developers inside organizations that strictly control their internal code libraries. This requires that the "experimental" modules be distributed with the interpreter. Of course, if the stdlib was separated out, and the current stdlib simply bundled with the interpreter at distribution time, the experimental package could be given the same treatment. But the stdlib separation hasn't been done yet, and I don't think something labelled "experimental" would have the same credibility with IT Security/QA as the rest of core Python. This last might kill the whole idea, as QA might take the position that "yes, you're just upgrading Python 2.7 from 2.7.4 to 2.7.5, but we have no idea what might be in experimental, so you're going to have to make a separate request for that." (I have never worked in such an organization so I don't know if that's a realistic worry or not.) From guido at python.org Wed Aug 31 05:53:18 2011 From: guido at python.org (Guido van Rossum) Date: Tue, 30 Aug 2011 20:53:18 -0700 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] In-Reply-To: <871uw2l0vs.fsf@uwakimon.sk.tsukuba.ac.jp> References: <1314569103.4640.19.camel@Gutsy> <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> <871uw2l0vs.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Tue, Aug 30, 2011 at 8:23 PM, Stephen J. Turnbull wrote: > alex23 writes: > > On Aug 30, 2:20 pm, Matt Joiner wrote: > > > Why can't a special tag/section be added in PyPI to indicate that a > > > module is being considered for inclusion in future versions of Python, > > > after all, we're all friends here. > > > > +1 > > > > There was talk of making the standard library standalone. I think > > having a similar metapackage for experimental modules would be a more > > elegant way of achieving this. > > "Although practicality beats purity." > > As already mentioned several times in this thread, PyPI, Bitbucket, > and hg.python.org sandboxes as-is provide a plenty good technical > solution for distribution of experimental code which is almost certain > to be included in the core distribution at some point, but currently > the API bikeshed is being painted (well, it will be if we can only > come to consensus on color!) > > The problem to be solved is on the other side of the network > connection, internal to the *using organizations*. Some of them have > much stricter rules for adding new "approved" packages than for > upgrading existing ones. In that case, developers "inside" get much > freer access to "official experimental" modules, and can participate > in development (including objecting to API changes they consider > gratuitous :-) in a way that is hard for them to justify if *before* > dealing with any API changes they have to run an internal obstacle > course just to be allowed to use the code.
> As I understand Guido's position, "experimental" is a non-starter > unless it can be expected to significantly increase beta tests of such > modules by developers inside organizations that strictly control their > internal code libraries. This requires that the "experimental" > modules be distributed with the interpreter. Of course, if the stdlib > was separated out, and the current stdlib simply bundled with the > interpreter at distribution time, the experimental package could be > given the same treatment. But the stdlib separation hasn't been done yet, and I > don't think something labelled "experimental" would have the same > credibility with IT Security/QA as the rest of core Python. > > This last might kill the whole idea, as QA might take the position > that "yes, you're just upgrading Python 2.7 from 2.7.4 to 2.7.5, but > we have no idea what might be in experimental, so you're going to have > to make a separate request for that." (I have never worked in such an > organization so I don't know if that's a realistic worry or not.) I wasn't actually proposing to add to (or even change the API of modules in) the experimental package during such micro-version updates. TBH the best use case I can think of would actually be the ipaddr package, which is somewhat controversial but not overly so, and seems to lack a push to ever get it accepted. Putting it in experimental for 3.3 would let us distribute it with Python without committing 100% to the solution it offers over its nearest competitor. However, the downside is that that's a very long wait, still provides a pretty strong bias (unless we were to include both competing packages), and still doesn't look like it might get enough beta testing, unless the uptake of 3.3 is huge. So maybe we should just count PyPI downloads and decide that way. "50,000,000 Elvis fans can't be wrong." (An interesting meme by itself. :-) --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Wed Aug 31 06:00:16 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 31 Aug 2011 14:00:16 +1000 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] In-Reply-To: <871uw2l0vs.fsf@uwakimon.sk.tsukuba.ac.jp> References: <1314569103.4640.19.camel@Gutsy> <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> <871uw2l0vs.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Wed, Aug 31, 2011 at 1:23 PM, Stephen J. Turnbull wrote: > As I understand Guido's position, "experimental" is a non-starter > unless it can be expected to significantly increase beta tests of such > modules by developers inside organizations that strictly control their > internal code libraries. This requires that the "experimental" > modules be distributed with the interpreter. Yep, that's a nice explanation of the motivation here. People that can just grab a module from PyPI and run it already have plenty of opportunities to provide feedback on in-development APIs. The idea of adding a new namespace within the standard library would be to have an 18-24 month window to gather similar feedback from folks in environments where they *can't* just grab packages from PyPI to try out, but *can* try out the latest version of CPython itself. (I suspect we'd also have an easier time getting feedback from folks that *could* retrieve modules from PyPI if they wanted to, but in practice *don't*).
> This last might kill the whole idea, as QA might take the position > that "yes, you're just upgrading Python 2.7 from 2.7.4 to 2.7.5, but > we have no idea what might be in experimental, so you're going to have > to make a separate request for that." (I have never worked in such an > organization so I don't know if that's a realistic worry or not.) It depends on the organization, but the main hurdle I experienced at my previous employer was the initial licensing and "development pedigree" approval process (to make sure that we were on solid legal ground in redistributing the component to our own customers and that we were prepared to shoulder the risk of latent defects - both bars were significantly lower if we added "for internal use only" to the review request). Due to the way the PSF and python-dev operate, CPython and the standard library check a lot of boxes in that kind of review that smaller projects often miss. Once a component was on the approved list, as long as the licensing didn't change, decisions about upgrades were then in the engineers' hands without needing to get the lawyers involved again. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ericsnowcurrently at gmail.com Wed Aug 31 06:44:49 2011 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Tue, 30 Aug 2011 22:44:49 -0600 Subject: [Python-ideas] Access to function objects In-Reply-To: References: <4E3E2445.2070403@pearwood.info> Message-ID: On Tue, Aug 30, 2011 at 2:25 AM, Nick Coghlan wrote: > That's actually quite an interesting idea, although I'm wondering if > it could make some of our extant reference cycle issues associated > with stack traces even worse. The fact that the patch is so simple is > certainly rather appealing (although you have a few backwards > compatibility issues to address, as I noted in my review). Thanks for taking a look, Nick. I'm addressing the backward incompatible stuff in a new patch a little later, and have updated the issue regarding the reference cycles. > > Also, I wouldn't be quite so quick to discard the function information > in the class evaluation case. While the function involved there is a > temporary one, it's still a real function. Yeah, I had considered that. However, isn't that just a CPython implementation detail? Does that matter? I don't have a problem with setting f_func for classes (and whatever else). Just wanted to clarify first. Thanks. -eric > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > From peio.borthelle at gmail.com Wed Aug 31 13:43:36 2011 From: peio.borthelle at gmail.com (Peio Borthelle) Date: Wed, 31 Aug 2011 13:43:36 +0200 Subject: [Python-ideas] aliasing In-Reply-To: References: Message-ID: On 30 August 2011, at 15:51, Herman Sheremetyev wrote: > b = lambda: a > > b = lambda: a*3 + 2 > > How's that? ;) Great, I hadn't thought of the lambda function, and it is a solution... but it isn't quite what I proposed: here b is a function; you have to call it to get the number. If you want to use it in your code, e.g. two int variables must stay at the same value, you can't give a function as argument, and what I want to avoid is precisely changing both values one by one. I misspoke in my last message (and it wasn't quite clear in my mind either); I should have written: -------------------------------- >>> a = 3 >>> b = Alias("a") # Alias is a class (or data type?). -------------------------------- Where b is an Alias instance.
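A purely illustrative sketch of the behaviour I imagine, approximated with eval() (the Alias class below is hypothetical, not a real implementation proposal, and only works for module-level names):

# eval() re-runs the stored expression in the given namespace on every
# access, so the result tracks the current value of a.
class Alias:
    def __init__(self, expr, namespace=None):
        # Defaulting to globals() only works when Alias is defined and
        # used in the same module; a real version would need access to
        # the caller's namespace.
        self.namespace = globals() if namespace is None else namespace
        self.expr = expr
    def value(self):
        return eval(self.expr, self.namespace)

a = 3
b = Alias("(a*3)+2")
print(b.value())  # 11
a = 5
print(b.value())  # 17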
I don't know anything about the C language, so I don't know how data types are defined, but I imagine the Alias class as a normal variable whose value changes dynamically. Is it technically possible? Best regards, Peio From ncoghlan at gmail.com Wed Aug 31 14:04:15 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 31 Aug 2011 22:04:15 +1000 Subject: [Python-ideas] aliasing In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 11:20 PM, Peio Borthelle wrote: > Here I alias an expression (that's also why the arg must be a string). But I > don't know if this last example is something good, because a would have to > stay a number (or calling b would generate an exception). > If a is destroyed, then calling b would generate an exception...I don't have > enough experience in Python programming to really know how the garbage > collector works, so I don't know if a could be destroyed by it. > This is the end of my suggestion (I hope it wasn't already proposed, > else...). The fact that identifiers are just references to objects rather than objects in their own right is central to the way the language works (in a very real sense, everything of significance in a Python program is either an object or a reference to an object). The distinction between mutable objects (e.g. lists) and immutable objects (e.g. tuples) is an immediate outcome of that - since the latter can't be modified directly, the only thing you can do is change which one you're referring to, which doesn't affect any other references to the old value. If you want mutable state, you have to store it on a mutable object. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From steve at pearwood.info Wed Aug 31 14:41:51 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 31 Aug 2011 22:41:51 +1000 Subject: [Python-ideas] aliasing In-Reply-To: References: Message-ID: <4E5E2C0F.7080603@pearwood.info> Peio Borthelle wrote: >>>> a = 2 >>>> b = alias("a") >>>> a = 'foo' >>>> b > 'foo' I have often thought that would be a nice-to-have feature, but to be honest, I have never found a use for it that was absolutely necessary. I have always found another way to solve the problem. You can always use one level of indirection:
a = [2]
b = a
a[0] = 'foo'   # Not a = 'foo'
b[0]           # => 'foo'
To make it work using just ordinary assignment syntax, as you suggest, requires more than just an "alias" function. It would need changes to the Python internals. Possibly very large changes. Without a convincing use-case, I don't think that will happen. So even though I think this would be a neat feature to have, and possibly even useful, I don't think it is useful enough to justify the work needed to make it happen. -- Steven From barry at python.org Wed Aug 31 17:05:36 2011 From: barry at python.org (Barry Warsaw) Date: Wed, 31 Aug 2011 11:05:36 -0400 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] References: <1314569103.4640.19.camel@Gutsy> <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> Message-ID: <20110831110536.111822cf@resist.wooz.org> On Aug 31, 2011, at 12:45 PM, Nick Coghlan wrote: >The key benefit lies in ensuring that anything in the mooted >experimental namespace is clearly under python-dev's aegis from at >least the following perspectives: >- licensing (i.e. redistributed by the PSF under a Contributor >Licensing Agreement) >- testing (i.e.
the module test suites are run on the python.org >buildbot fleet and results published via >http://www.python.org/dev/buildbot) >- issue management (i.e. bugs and feature requests are handled on >http://bugs.python.org) >- source control (i.e. the master repository for the software is >published on http://hg.python.org) > >Those are the things that will allow the experimental modules to be >used under existing legal approvals that allow the use of Python >itself (e.g. in a corporate or governmental environment). In the face of PEP 402, how could you enforce that? Even if you can't or don't want to enforce it, how would a user be able to verify that it was the case for something in experimental? -Barry From barry at python.org Wed Aug 31 17:19:46 2011 From: barry at python.org (Barry Warsaw) Date: Wed, 31 Aug 2011 11:19:46 -0400 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] References: <1314569103.4640.19.camel@Gutsy> <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> <871uw2l0vs.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20110831111946.22bac764@resist.wooz.org> On Aug 30, 2011, at 08:53 PM, Guido van Rossum wrote: >TBH the best use case I can think of would actually be the ipaddr >package, which is somewhat controversial but not overly so, and seems >to lack a push to ever get it accepted. Putting it in experimental for >3.3 would let us distribute it with Python without committing 100% to >the solution it offers over its nearest competitor. However, the >downside is that that's a very long wait, still provides a pretty >strong bias (unless we were to include both competing packages), and >still doesn't look like it might get enough beta testing, unless the >uptake of 3.3 is huge. So maybe we should just count PyPI downloads >and decide that way. "50,000,000 Elvis fans can't be wrong." (An >interesting meme by itself. :-) I think an experimental namespace is actually attacking the wrong problem, or maybe the right problem in the wrong way. ;) I'd much prefer to see some brainstorming on multi-version support, which is something that Robert Collins is always bugging me about. Something like pkg_resources' require() function seems like a good place to start. If we had this built-in, then including ipaddr in the stdlib as it currently stands would be a no-brainer, even with a suboptimal API. It would get lots of testing, and a completely different API could be designed for Python 3.4 without breaking packages that relied on the old API. Python's deprecation policy could be adjusted to include diminishing support for older versions of a module in the stdlib, and we'd avoid ugliness like unittest2 and such. -Barry From guido at python.org Wed Aug 31 18:29:34 2011 From: guido at python.org (Guido van Rossum) Date: Wed, 31 Aug 2011 09:29:34 -0700 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To: <20110831110536.111822cf@resist.wooz.org> References: <1314569103.4640.19.camel@Gutsy> <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> <20110831110536.111822cf@resist.wooz.org> Message-ID: On Wed, Aug 31, 2011 at 8:05 AM, Barry Warsaw wrote: > On Aug 31, 2011, at 12:45 PM, Nick Coghlan wrote: > >>The key benefit lies in ensuring that anything in the mooted >>experimental namespace is clearly under python-dev's aegis from at >>least the following perspectives: >>- licensing (i.e. redistributed by the PSF under a Contributor >>Licensing Agreement) >>- testing (i.e. the module test suites are run on the python.org >>buildbot fleet and results published via >>http://www.python.org/dev/buildbot) >>- issue management (i.e. bugs and feature requests are handled on >>http://bugs.python.org) >>- source control (i.e. the master repository for the software is >>published on http://hg.python.org) >> >>Those are the things that will allow the experimental modules to be >>used under existing legal approvals that allow the use of Python >>itself (e.g. in a corporate or governmental environment). > > In the face of PEP 402, how could you enforce that? Even if you can't or > don't want to enforce it, how would a user be able to verify that it was the > case for something in experimental? I'm sorry, I don't follow. The experimental package would only contain code distributed as part of the stdlib, and the code put in the stdlib's experimental package would get the same care from the core developers as the rest of the stdlib. The only difference would be that we'd drop the guarantee that the APIs offered would still be present in the next release (i.e. from 3.3 -> 3.4; the guarantees would hold from 3.3.1 -> 3.3.2). I hope I'm not contradicting myself? -- --Guido van Rossum (python.org/~guido) From guido at python.org Wed Aug 31 18:31:46 2011 From: guido at python.org (Guido van Rossum) Date: Wed, 31 Aug 2011 09:31:46 -0700 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] In-Reply-To: <20110831111946.22bac764@resist.wooz.org> References: <1314569103.4640.19.camel@Gutsy> <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> <871uw2l0vs.fsf@uwakimon.sk.tsukuba.ac.jp> <20110831111946.22bac764@resist.wooz.org> Message-ID: On Wed, Aug 31, 2011 at 8:19 AM, Barry Warsaw wrote: > On Aug 30, 2011, at 08:53 PM, Guido van Rossum wrote: > >>TBH the best use case I can think of would actually be the ipaddr >>package, which is somewhat controversial but not overly so, and seems >>to lack a push to ever get it accepted. Putting it in experimental for >>3.3 would let us distribute it with Python without committing 100% to >>the solution it offers over its nearest competitor. However, the >>downside is that that's a very long wait, still provides a pretty >>strong bias (unless we were to include both competing packages), and >>still doesn't look like it might get enough beta testing, unless the >>uptake of 3.3 is huge. So maybe we should just count PyPI downloads >>and decide that way. "50,000,000 Elvis fans can't be wrong." (An >>interesting meme by itself. :-) > > I think an experimental namespace is actually attacking the wrong problem, or > maybe the right problem in the wrong way. ;) > > I'd much prefer to see some brainstorming on multi-version support, which is > something that Robert Collins is always bugging me about. Something like > pkg_resources' require() function seems like a good place to start.
It's a great idea to brainstorm about, just not in this thread. I see the two issues as completely orthogonal. > If we had this built-in, then including ipaddr in the stdlib as it currently > stands would be a no-brainer, even with a suboptimal API. It would get lots > of testing, and a completely different API could be designed for Python 3.4 > without breaking packages that relied on the old API. Python's deprecation > policy could be adjusted to include diminishing support for older versions of > a module in the stdlib, and we'd avoid ugliness like unittest2 and such. You sound like you have a solution for using multiple versions of an API in the same program. Do you? Then out with it! Otherwise, no, it wouldn't be a no-brainer. -- --Guido van Rossum (python.org/~guido) From solipsis at pitrou.net Wed Aug 31 18:56:35 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 31 Aug 2011 18:56:35 +0200 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] References: <1314569103.4640.19.camel@Gutsy> <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> <871uw2l0vs.fsf@uwakimon.sk.tsukuba.ac.jp> <20110831111946.22bac764@resist.wooz.org> Message-ID: <20110831185635.744c9624@pitrou.net> On Wed, 31 Aug 2011 09:31:46 -0700 Guido van Rossum wrote: > > > If we had this built-in, then including ipaddr in the stdlib as it currently > > stands would be a no-brainer, even with a suboptimal API. It would get lots > > of testing, and a completely different API could be designed for Python 3.4 > > without breaking packages that relied on the old API. Python's deprecation > > policy could be adjusted to include diminishing support for older versions of > > a module in the stdlib, and we'd avoid ugliness like unittest2 and such. > > You sound like you have a solution for using multiple versions of an > API in the same program. Do you? Then out with it! Otherwise, no, it > wouldn't be a no-brainer.
> But couldn't __experimental__ also mean we don't guarantee API > stability (until the module leaves the __experimental__ namespace)? Yes, to some extent. My specific proposal is that experimental (by whatever name) would not change APIs in bugfix releases, but would (potentially) change APIs in "major" releases (e.g. 3.3 -> 3.4(*)). (*) Can we pick a terminology so we all agree that "3.3.3" is a "minor release", "3.3" is a "major release", and "3" an "earthshattering release"? Or other terms -- but something that is both agreed upon and clear enough without explanation. I'm tired of having to clarify minor and major every time I use them out of fear they'll be mistaken for "3" and "3.3". -- --Guido van Rossum (python.org/~guido) From solipsis at pitrou.net Wed Aug 31 19:33:05 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 31 Aug 2011 19:33:05 +0200 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] In-Reply-To: References: <1314569103.4640.19.camel@Gutsy> <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> <871uw2l0vs.fsf@uwakimon.sk.tsukuba.ac.jp> <20110831111946.22bac764@resist.wooz.org> <20110831185635.744c9624@pitrou.net> Message-ID: <1314811985.3505.0.camel@localhost.localdomain> > (*) Can we pick a terminology so we all agree that "3.3.3" is a "minor > release", "3.3" is a "major release", and "3" an "earthshattering > release"? Or other terms -- but something that is both agreed upon and > clear enough without explanation. I'm tired of having to clarify minor > and major every time I use them out of fear they'll be mistaken for > "3" and "3.3". +1! "Minor" really sounds like a misnomer when applied to feature releases. Regards Antoine. From mwm at mired.org Wed Aug 31 19:36:55 2011 From: mwm at mired.org (Mike Meyer) Date: Wed, 31 Aug 2011 10:36:55 -0700 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] In-Reply-To: References: <1314569103.4640.19.camel@Gutsy> <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> <871uw2l0vs.fsf@uwakimon.sk.tsukuba.ac.jp> <20110831111946.22bac764@resist.wooz.org> <20110831185635.744c9624@pitrou.net> Message-ID: On Wed, Aug 31, 2011 at 10:23 AM, Guido van Rossum wrote: > (*) Can we pick a terminology so we all agree that "3.3.3" is a "minor > release", "3.3" is a "major release", and "3" an "earthshattering > release"? Or other terms -- but something that is both agreed upon and > clear enough without explanation. I'm tired of having to clarify minor > and major every time I use them out of fear they'll be mistaken for > "3" and "3.3". Given that there's no general industry agreement on what those things mean, "clear enough without explanation" is probably unrealistic. On the other hand, a PEP or some similar document that lays out the terminology used for Python releases (and encouraged for Python libraries) would let you assume that Python people had read that explanation, and make a reference to it easy (i.e. "see PEP XXXX"). From p.f.moore at gmail.com Wed Aug 31 19:45:47 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 31 Aug 2011 18:45:47 +0100 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To: <1314811985.3505.0.camel@localhost.localdomain> References: <1314569103.4640.19.camel@Gutsy> <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> <871uw2l0vs.fsf@uwakimon.sk.tsukuba.ac.jp> <20110831111946.22bac764@resist.wooz.org> <20110831185635.744c9624@pitrou.net> <1314811985.3505.0.camel@localhost.localdomain> Message-ID: On 31 August 2011 18:33, Antoine Pitrou wrote: > >> (*) Can we pick a terminology so we all agree that "3.3.3" is a "minor >> release", "3.3" is a "major release", and "3" an "earthshattering >> release"? Or other terms -- but something that is both agreed upon and >> clear enough without explanation. I'm tired of having to clarify minor >> and major every time I use them out of fear they'll be mistaken for >> "3" and "3.3". > > +1! "Minor" really sounds like a misnomer when applied to feature > releases. How about 3.3.3 -> 3.3.4 is a "minor" release, 3.3 -> 3.4 is a "feature" release and 3 -> 4 is not something we generally talk about (or "compatibility-breaking" or something like that). I suspect that "minor" for changing the last digit is pretty comprehensible, it's using "major" for 3.3 that confuses people, so let's avoid that term... Paul. From guido at python.org Wed Aug 31 19:51:07 2011 From: guido at python.org (Guido van Rossum) Date: Wed, 31 Aug 2011 10:51:07 -0700 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?] In-Reply-To: References: <1314569103.4640.19.camel@Gutsy> <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> <871uw2l0vs.fsf@uwakimon.sk.tsukuba.ac.jp> <20110831111946.22bac764@resist.wooz.org> <20110831185635.744c9624@pitrou.net> <1314811985.3505.0.camel@localhost.localdomain> Message-ID: On Wed, Aug 31, 2011 at 10:45 AM, Paul Moore wrote: > On 31 August 2011 18:33, Antoine Pitrou wrote: >> >>> (*) Can we pick a terminology so we all agree that "3.3.3" is a "minor >>> release", "3.3" is a "major release", and "3" an "earthshattering >>> release"? Or other terms -- but something that is both agreed upon and >>> clear enough without explanation. I'm tired of having to clarify minor >>> and major every time I use them out of fear they'll be mistaken for >>> "3" and "3.3". >> >> +1! "Minor" really sounds like a misnomer when applied to feature >> releases. > > How about 3.3.3 -> 3.3.4 is a "minor" release, 3.3 -> 3.4 is a > "feature" release and 3 -> 4 is not something we generally talk about > (or "compatibility-breaking" or something like that). > > I suspect that "minor" for changing the last digit is pretty > comprehensible, it's using "major" for 3.3 that confuses people, so > let's avoid that term... Let's avoid minor and major altogether.
2 -> 3: galactic release
3.2 -> 3.3: feature release (also 3 -> 3.1)
3.2.1 -> 3.2.2: bugfix release (also 3.2 -> 3.2.1)
-- --Guido van Rossum (python.org/~guido) From jeanpierreda at gmail.com Wed Aug 31 19:52:51 2011 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Wed, 31 Aug 2011 13:52:51 -0400 Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
From jeanpierreda at gmail.com  Wed Aug 31 19:52:51 2011
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Wed, 31 Aug 2011 13:52:51 -0400
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To: 
References: <1314569103.4640.19.camel@Gutsy> <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> <871uw2l0vs.fsf@uwakimon.sk.tsukuba.ac.jp> <20110831111946.22bac764@resist.wooz.org> <20110831185635.744c9624@pitrou.net> <1314811985.3505.0.camel@localhost.localdomain>
Message-ID: 

> How about: 3.3.3 -> 3.3.4 is a "minor" release, 3.3 -> 3.4 is a
> "feature" release, and 3 -> 4 is not something we generally talk about
> (or "compatibility-breaking" or something like that).

The issue with that is that the "y" in 'Python x.y.z' is usually called
the minor release (see: http://docs.python.org/devguide/devcycle.html ).
Switching which digit counts as the minor release would be confusing
when reading older documents. It might be better to instead keep the
old name, "bugfix", since that isn't confusing. Then we would be left
with "compatibility-breaking.feature.bugfix", which isn't really bad in
any sense I can see (aside perhaps from being new and unusual?)

Devin

On Wed, Aug 31, 2011 at 1:45 PM, Paul Moore wrote:
> On 31 August 2011 18:33, Antoine Pitrou wrote:
>>
>>> (*) Can we pick a terminology so we all agree that "3.3.3" is a "minor
>>> release", "3.3" is a "major release", and "3" an "earthshattering
>>> release"? Or other terms -- but something that is both agreed upon and
>>> clear enough without explanation. I'm tired of having to clarify minor
>>> and major every time I use them out of fear they'll be mistaken for
>>> "3" and "3.3".
>>
>> +1! "Minor" really sounds like a misnomer when applied to feature
>> releases.
>
> How about: 3.3.3 -> 3.3.4 is a "minor" release, 3.3 -> 3.4 is a
> "feature" release, and 3 -> 4 is not something we generally talk about
> (or "compatibility-breaking" or something like that).
>
> I suspect that "minor" for changing the last digit is pretty
> comprehensible; it's using "major" for 3.3 that confuses people, so
> let's avoid that term...
>
> Paul.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From mwm at mired.org  Wed Aug 31 20:21:55 2011
From: mwm at mired.org (Mike Meyer)
Date: Wed, 31 Aug 2011 11:21:55 -0700
Subject: [Python-ideas] aliasing
In-Reply-To: <4E5E2C0F.7080603@pearwood.info>
References: <4E5E2C0F.7080603@pearwood.info>
Message-ID: 

On Wed, Aug 31, 2011 at 5:41 AM, Steven D'Aprano wrote:
> To make it work using just ordinary assignment syntax, as you suggest,
> requires more than just an "alias" function. It would need changes to the
> Python internals. Possibly very large changes. Without a convincing
> use-case, I don't think that will happen.

Well, once you clear up some of the ambiguities in the request, it's
not quite so bad.

    class F(object):
        def m(self):
            alias(a, b)

Which namespace is a being aliased in? Function local? Instance?
Class? Module? So alias needs to take a namespace of some sort. The
appropriate object is easy to find, and should do the trick (though
function local might be entertaining). For simplicity, we'll assume
that aliases have to be in the same namespace (though doing otherwise
isn't really harder, just slower).

Passing in variable values doesn't work very well - especially for the
new alias, which may not *have* a value yet! For the name to be
aliased, you could in theory look up the value in the namespace and use
the corresponding name, but there may be more than one of those. So
pass that one in as a string as well. So make the call
"alias(o, name1, name2)".

o is a module, class, instance, or function. Name lookups are done via
the __dict__ dict for all of those. It's easy to write a dict subclass
that aliases its entries. So the internal changes would be making
__dict__ a writable attribute on any objects which it currently isn't,
and making sure that the various objects use __setitem__ to change
attribute values.

I suspect the performance hit - even if you don't use this feature -
would be noticeable. If you used it, it would be even worse.

-1.
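[A minimal sketch of the dict subclass Mike describes, assuming aliases
map one name to another within the same namespace. The class name and
helper method are illustrative only, not an actual proposal.]

    class AliasDict(dict):
        """A dict in which reads and writes through an alias hit the
        aliased (canonical) key instead."""

        def __init__(self, *args, **kwargs):
            super(AliasDict, self).__init__(*args, **kwargs)
            self._aliases = {}  # alias name -> canonical name

        def add_alias(self, canonical, alias):
            self._aliases[alias] = canonical

        def _canonical(self, key):
            return self._aliases.get(key, key)

        def __getitem__(self, key):
            return super(AliasDict, self).__getitem__(self._canonical(key))

        def __setitem__(self, key, value):
            super(AliasDict, self).__setitem__(self._canonical(key), value)

    d = AliasDict(a=1)
    d.add_alias('a', 'b')
    d['b'] = 2          # writes through the alias...
    assert d['a'] == 2  # ...and 'a' sees the change

    # Caveat supporting Mike's point about internals: in CPython, plain
    # dict methods like get() bypass these overrides, and instance
    # attribute lookup does not go through __dict__.__getitem__ either,
    # which is why the real change would be invasive.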
From guido at python.org  Wed Aug 31 21:05:05 2011
From: guido at python.org (Guido van Rossum)
Date: Wed, 31 Aug 2011 12:05:05 -0700
Subject: [Python-ideas] Expanding statistical functions in Python's std. lib.
In-Reply-To: <4E5D05EE.2050502@gmail.com>
References: <549901314286114@web119.yandex.ru> <4E56E859.3090504@canterbury.ac.nz> <4E5D05EE.2050502@gmail.com>
Message-ID: 

You didn't get any responses AFAICT. That doesn't mean nobody is
interested -- perhaps your proposal is simply too general? Do you feel
up to making some more specific recommendations about the exact list
of functions to add? It's easier to criticize a concrete proposal. Do
you feel up to producing a patch that just adds the incomplete beta
function?

--Guido

On Tue, Aug 30, 2011 at 8:46 AM, Spectral One wrote:
>
> Wandering about, looking up statistics info for a program I was writing, I
> found a recommendation to add various useful 'special functions' to C's
> math library:
> http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1069.pdf
>
> The arguments in that paper make a lot of sense to me, and apply well to
> Python. They came up with a good list, IMnsHO. I'd recommend implementing
> this list in some form as library functions in Python.
>
> Blindly copying wouldn't end up particularly 'Pythonic;' tweaking the API
> is required. Some of the selection choices, such as returning real only,
> ought to be reevaluated, for example. Obviously, any of the decisions to
> keep things C-like rather than object-oriented ought to shift, as well.
>
> Function names are only important as far as they are clear. I suggest
> naming per e.g. distribution_t(), or dist_F(), and include modification
> for algebraic order, as well, so gamma() and log_gamma(). That said,
> anything clear is fine.
>
> Thoughts on the matter? I noticed that the math library in 2.7+ added the
> gamma and log(gamma) functions already, which was nice. Obviously, most,
> if not all, are already present in extension modules such as NumPy, but
> there is value in having these things built into the language. "Batteries
> included," and all that.
>
> By the by, if that is far too much for one suggestion, then please just
> treat this as a suggestion to add just the incomplete beta function.
> (P-values for binomial, F, and t are all nice, too, though with inc. beta,
> they aren't terrible to generate. I really think they should be included
> in the standard library.)
>
> -Nate
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

-- 
--Guido van Rossum (python.org/~guido)
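[For anyone wanting the incomplete beta before -- or instead of -- a
stdlib version, it is already available in the extension modules Nate
mentions: SciPy's scipy.special.betainc is the regularized incomplete
beta, while math.gamma and math.lgamma (added in 2.7) cover the gamma
part of the list. A quick sketch, assuming SciPy is installed:]

    import math

    from scipy import special  # third-party, one of the extension modules Nate mentions

    # gamma and log-gamma have been in the math module since 2.7
    print(math.gamma(5))       # 24.0, i.e. 4!
    print(math.lgamma(100))    # log(99!), computed without overflowing

    # Regularized incomplete beta I_x(a, b): the building block for the
    # binomial, F, and t p-values the post asks for.
    print(special.betainc(2.0, 3.0, 0.5))   # 0.6875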
From brett at python.org  Wed Aug 31 21:44:54 2011
From: brett at python.org (Brett Cannon)
Date: Wed, 31 Aug 2011 12:44:54 -0700
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To: 
References: <1314569103.4640.19.camel@Gutsy> <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> <871uw2l0vs.fsf@uwakimon.sk.tsukuba.ac.jp> <20110831111946.22bac764@resist.wooz.org> <20110831185635.744c9624@pitrou.net> <1314811985.3505.0.camel@localhost.localdomain>
Message-ID: 

On Wed, Aug 31, 2011 at 10:51, Guido van Rossum wrote:
> On Wed, Aug 31, 2011 at 10:45 AM, Paul Moore wrote:
>> On 31 August 2011 18:33, Antoine Pitrou wrote:
>>>
>>>> (*) Can we pick a terminology so we all agree that "3.3.3" is a "minor
>>>> release", "3.3" is a "major release", and "3" an "earthshattering
>>>> release"? Or other terms -- but something that is both agreed upon and
>>>> clear enough without explanation. I'm tired of having to clarify minor
>>>> and major every time I use them out of fear they'll be mistaken for
>>>> "3" and "3.3".
>>>
>>> +1! "Minor" really sounds like a misnomer when applied to feature
>>> releases.
>>
>> How about: 3.3.3 -> 3.3.4 is a "minor" release, 3.3 -> 3.4 is a
>> "feature" release, and 3 -> 4 is not something we generally talk about
>> (or "compatibility-breaking" or something like that).
>>
>> I suspect that "minor" for changing the last digit is pretty
>> comprehensible; it's using "major" for 3.3 that confuses people, so
>> let's avoid that term...
>
> Let's avoid minor and major altogether.
>
> 2 -> 3: galactic release
> 3.2 -> 3.3: feature release (also 3 -> 3.1)
> 3.2.1 -> 3.2.2: bugfix release (also 3.2 -> 3.2.1)

sys.version_info has already made the declaration of what each of the
digits represents when it became a named tuple: major, minor, micro
(http://hg.python.org/cpython/file/4dcbae65df3f/Python/sysmodule.c#l1273).

And if you don't like the naming, then blame me;
http://bugs.python.org/issue4285 was the bug that led to the names and
they are what I have always used for all software. After that blame
Eric Smith for committing the patch. =)
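[Those field names are easy to check from the interpreter. A short
illustration -- the output shown is what a hypothetical 3.2.2 final
build would print:]

    >>> import sys
    >>> sys.version_info
    sys.version_info(major=3, minor=2, micro=2, releaselevel='final', serial=0)
    >>> sys.version_info.major, sys.version_info.minor, sys.version_info.micro
    (3, 2, 2)
    >>> sys.version_info[1] == sys.version_info.minor  # tuple indexing still works
    True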
From eric at trueblade.com  Wed Aug 31 23:36:21 2011
From: eric at trueblade.com (Eric V. Smith)
Date: Wed, 31 Aug 2011 17:36:21 -0400
Subject: [Python-ideas] Add from __experimental__ import bla [was: Should we move to replace re with regex?]
In-Reply-To: 
References: <58780826-87be-436f-b777-937e41691e90@m35g2000prl.googlegroups.com> <871uw2l0vs.fsf@uwakimon.sk.tsukuba.ac.jp> <20110831111946.22bac764@resist.wooz.org> <20110831185635.744c9624@pitrou.net> <1314811985.3505.0.camel@localhost.localdomain>
Message-ID: <4E5EA955.9080405@trueblade.com>

On 8/31/2011 3:44 PM, Brett Cannon wrote:
> On Wed, Aug 31, 2011 at 10:51, Guido van Rossum wrote:
>> On Wed, Aug 31, 2011 at 10:45 AM, Paul Moore wrote:
>>> On 31 August 2011 18:33, Antoine Pitrou wrote:
>>>>
>>>>> (*) Can we pick a terminology so we all agree that "3.3.3" is a "minor
>>>>> release", "3.3" is a "major release", and "3" an "earthshattering
>>>>> release"? Or other terms -- but something that is both agreed upon and
>>>>> clear enough without explanation. I'm tired of having to clarify minor
>>>>> and major every time I use them out of fear they'll be mistaken for
>>>>> "3" and "3.3".
>>>>
>>>> +1! "Minor" really sounds like a misnomer when applied to feature
>>>> releases.
>>>
>>> How about: 3.3.3 -> 3.3.4 is a "minor" release, 3.3 -> 3.4 is a
>>> "feature" release, and 3 -> 4 is not something we generally talk about
>>> (or "compatibility-breaking" or something like that).
>>>
>>> I suspect that "minor" for changing the last digit is pretty
>>> comprehensible; it's using "major" for 3.3 that confuses people, so
>>> let's avoid that term...
>>
>> Let's avoid minor and major altogether.
>>
>> 2 -> 3: galactic release
>> 3.2 -> 3.3: feature release (also 3 -> 3.1)
>> 3.2.1 -> 3.2.2: bugfix release (also 3.2 -> 3.2.1)
>
> sys.version_info has already made the declaration of what each of the
> digits represents when it became a named tuple: major, minor, micro
> (http://hg.python.org/cpython/file/4dcbae65df3f/Python/sysmodule.c#l1273).
>
> And if you don't like the naming, then blame me;
> http://bugs.python.org/issue4285 was the bug that led to the names and
> they are what I have always used for all software. After that blame
> Eric Smith for committing the patch. =)

Great. My fingerprints are on it.