From niki.spahiev at gmail.com Fri Nov 1 09:21:59 2013 From: niki.spahiev at gmail.com (Niki Spahiev) Date: Fri, 01 Nov 2013 10:21:59 +0200 Subject: [Python-ideas] Support os.path.join for Windows paths on Posix In-Reply-To: References: Message-ID: On 31.10.2013 00:58, Ryan Gonzalez wrote: > 1.Python 3 doesn't come with Ubuntu ?!? Ubuntu 13 is based on python 3. HTH Niki From steve at pearwood.info Fri Nov 1 10:44:05 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 1 Nov 2013 20:44:05 +1100 Subject: [Python-ideas] Allow attribute references for decimalinteger In-Reply-To: References: <5271B322.1000108@mrabarnett.plus.com> <5271D455.2090507@canterbury.ac.nz> Message-ID: <20131101094405.GD18730@ando> On Thu, Oct 31, 2013 at 04:10:46PM +0100, Ronald Oussoren wrote: > > On 31 Oct, 2013, at 15:35, Philipp A. wrote: > > > 2013/10/31 Greg Ewing > > In hindsight, things might have been better if the decimal point > > were required to have at least one digit after it, but it's > > probably too late to change that now. > > > > i disagree. i like writing ?1.? for ?float(1)? and ?.1? for ?1/10?. what?s the point of redundant zeros? > > Increased readability. Then you might prefer writing 1.000000000000000000000000000000000 for even more readability ;-) But seriously... accepting "1." for float 1.0 is, as far as I can tell, a common but not universal practice in programming languages. Being tolerant of missing zeroes before or after the decimal point matches how most people write, in my experience: dropping the leading zero is very common, trailing zero less so, unless they drop the decimal point as well. In a language like Python with distinct int and float types, but no declarations, dropping the decimal point gives you an int. Not much can be done about that. As far as other languages go, ISO Pascal requires a digit after the dot, but in my experience most Pascal compilers accept a bare dot: gpc compiles "x := 23.;" but issues a warning. C appears to allow it: gcc compiles "double foo = 23.;" without a warning. Ruby no longer accepts floats with a leading or trailing dot. Personally, I would never use "1." in source code, but I might use it in the interactive interpreter. I think it's an issue for a linter, not the compiler, so I'm happy with the status quo. -- Steven From elazarg at gmail.com Fri Nov 1 10:59:22 2013 From: elazarg at gmail.com (=?UTF-8?B?15DXnNei15bXqA==?=) Date: Fri, 1 Nov 2013 11:59:22 +0200 Subject: [Python-ideas] Allow attribute references for decimalinteger In-Reply-To: <20131101094405.GD18730@ando> References: <5271B322.1000108@mrabarnett.plus.com> <5271D455.2090507@canterbury.ac.nz> <20131101094405.GD18730@ando> Message-ID: > Personally, I would never use "1." in source code, but I might use it in > the interactive interpreter. Would you use 1.e10 too? I think this syntax, at least, is an awful idea. From steve at pearwood.info Fri Nov 1 11:17:52 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 1 Nov 2013 21:17:52 +1100 Subject: [Python-ideas] Where did we go wrong with negative stride? In-Reply-To: References: <52719D47.1010507@mrabarnett.plus.com> <5271B415.8080607@mrabarnett.plus.com> <5271B5B2.7050209@stoneleaf.us> Message-ID: <20131101101752.GE18730@ando> On Thu, Oct 31, 2013 at 03:37:36PM +0200, ????? wrote: > Perhaps we can get this End object by adding two tokens: ":-" and "[-". So > > a[-3:-5] == a[slice(End-3, End-5, None)] That's ambiguous. Consider: a[-2] Is that a dict lookup with key -2, or a list indexed with End-2? 
Or worse, a dict lookup with key len(a)-2. To break the ambiguity, we'd need a rule that End objects can only occur inside slices with at least one colon. But, I think that means that the parser would have to look ahead to see whether it was within a slice or not, and that might not be possible with Python's parser. Even if were, it's still a special case that -2 means something different inside a slice than outside a slice, and we know what the Zen of Python says about special cases. > although it will turn a[-3] into a[End-3]. I don't think it's a > problem if the latter will behave in the same way as the former (i.e > End-3 be a subtype of int). So what would (End-3)*10 return? How about End & 0xF ? > Note that with an End object (regardless of wheather it's called > "End", "len-x" or ":-x") we can get End/5. I think that's a nice thing > to have. I presume that you expect a[End/5] to be equivalent to a[len(a)//5] ? > One more thing: End-5 should be callable, so it can be passed around. > > (End-3)("hello") == len("hello")-3 > (End-0)("hello") == len("hello") > > This way End is a generalization of len, making len somewhat redundant. All this seems very clever, but as far as I'm concerned, it's too clever. I don't like objects which are context-sensitive. Given: x = End-2 then in this context, x behaves like 4: "abcdef"[x:] while in this context, x behaves like 0: "abcd"[x:] I really don't like that. That makes it hard to reason about code. End seems to me to be an extremely sophisticated object, far too sophisticated for slicing, which really ought to be conceptually and practically a simple operation. I would not have to like to explain this to beginners to Python. I especially would not like to explain how it works. (How would it work?) I think the only answer is, "It's magic". I think it is something that would be right at home in PHP or Perl, and I don't mean that as an insult, but only that it's not a good fit to Python. -- Steven From rosuav at gmail.com Fri Nov 1 12:52:24 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 1 Nov 2013 22:52:24 +1100 Subject: [Python-ideas] Allow attribute references for decimalinteger In-Reply-To: <20131101094405.GD18730@ando> References: <5271B322.1000108@mrabarnett.plus.com> <5271D455.2090507@canterbury.ac.nz> <20131101094405.GD18730@ando> Message-ID: On Fri, Nov 1, 2013 at 8:44 PM, Steven D'Aprano wrote: > But seriously... accepting "1." for float 1.0 is, as far as I can tell, > a common but not universal practice in programming languages. Being > tolerant of missing zeroes before or after the decimal point matches how > most people write, in my experience: dropping the leading zero is very > common, trailing zero less so, unless they drop the decimal point as > well. In a language like Python with distinct int and float types, but > no declarations, dropping the decimal point gives you an int. Not much > can be done about that. And that's exactly why the clarity has value - because of the extreme visual similarity between "1" (int) and "1." (float). However... > Personally, I would never use "1." in source code, but I might use it in > the interactive interpreter. I think it's an issue for a linter, not the > compiler, so I'm happy with the status quo. ... I agree. Having the flexibility is handy, same as we have the flexibility to write less-clear code in other ways. And as to the exponent, I don't recall ever writing "1.e10" intentionally, but I definitely write "1e10", and would be extremely surprised if a loose dot broke that. 
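For the record, all four spellings under discussion are accepted today:

>>> 1., .1, 1.e10, 1e10
(1.0, 0.1, 10000000000.0, 10000000000.0)
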
ChrisA From elazarg at gmail.com Fri Nov 1 13:15:23 2013 From: elazarg at gmail.com (=?UTF-8?B?15DXnNei15bXqA==?=) Date: Fri, 1 Nov 2013 14:15:23 +0200 Subject: [Python-ideas] Where did we go wrong with negative stride? In-Reply-To: <20131101101752.GE18730@ando> References: <52719D47.1010507@mrabarnett.plus.com> <5271B415.8080607@mrabarnett.plus.com> <5271B5B2.7050209@stoneleaf.us> <20131101101752.GE18730@ando> Message-ID: 2013/11/1 Steven D'Aprano : > On Thu, Oct 31, 2013 at 03:37:36PM +0200, ????? wrote: >> Note that with an End object (regardless of wheather it's called >> "End", "len-x" or ":-x") we can get End/5. I think that's a nice thing >> to have. > > I presume that you expect a[End/5] to be equivalent to a[len(a)//5] ? > Yes. Perhaps End//5 is better of course. > >> One more thing: End-5 should be callable, so it can be passed around. >> >> (End-3)("hello") == len("hello")-3 >> (End-0)("hello") == len("hello") >> >> This way End is a generalization of len, making len somewhat redundant. > > All this seems very clever, but as far as I'm concerned, it's too > clever. I don't like objects which are context-sensitive. Given: > > x = End-2 > > then in this context, x behaves like 4: > > "abcdef"[x:] > > while in this context, x behaves like 0: > > "abcd"[x:] > Sure you mean "x behaves like 2": > I really don't like that. That makes it hard to reason about code. Well, look at that: >>> x=-2 >>> "abcdef"[x:] # x behaves like 4: 'ef' >>> "abcd"[x:] # x behaves like 2: 'cd' Magic indeed. Not to mention what happens when it happens to be x=-0 Unlike ordinary int, where -x is an abbreviation for 0-x, so -0 == 0-0 == 0, in the context of slicing and element access (in a list, not in a dict) -x is an abbreviation for len(this list)-x, so -0 == len(this list)-0 != 0. Well, End is len(this list), or equivalently it's "0 (mod len this list)". I think it's natural. > End seems to me to be an extremely sophisticated object, far too > sophisticated for slicing, which really ought to be conceptually and > practically a simple operation. I would not have to like to explain this > to beginners to Python. I especially would not like to explain how it > works. (How would it work?) say lst[:end-0] become lst[slice(end-0), and inside list you take stop(self) and continue from there as before. (Turns out slice does not take keyword arguments. Why is that?) There are other ways, of course. Which makes it possible to pass any other callable, unless you want to explicitly forbid it. I agree that this: "abcdef"[1:(lambda lst: lst[0])] is a horrible idea. The reason it is hard to explain is that you don't want to explain functional programming to a beginner. But if the End object (which I don't think is "extremely" sophisticated) will be accessible only from within a slice expression, all you have to explain is that it is context-dependent, regardless of how it is actually implemented. It's just a readable "$", if you like. Slice notation is already a domain specific sublanguage of its own right; for example, nowhere else is x:y:z legal at all, and nowhere else does -x have a similar meaning. As for the negative stride, I would suggest a terminating minus for "reverse=True" or Nick's rslice. Possibly with a preceding comma: "abcdef"[1:-1 -] == "edcb" "abcdef"[-] == "fedcba" "abcdef"[1:end-1:2, -] == ''.join(reversed("abcdef"[1:-1:2])) == "db" (I don't understand why can't we have -[1,2,3] == list(reversed([1,2,3])) I'm sure that's a conscious decision but is it documented anywhere? 
Yes, it's not exactly a negation, but then 'ab'+'cd' is not exactly an addition - it is not commutative. nor does it have an inverse. It's just an intuitive notation, and I think "xyz" == -"zyx" is pretty intuitive; much more so than "zyx"[::-1], although not much more than "xyz"[-]. Taking it one step further, you can have things like "abcde"[-(1:3)] == "abcde"[-slice(1,3)] == "cb" Again, this look intuitive to me. I admit that my niece failed to guess this meaning so I might be wrong; She is completely unfamiliar with Python though) Elazar From barry at python.org Fri Nov 1 13:16:52 2013 From: barry at python.org (Barry Warsaw) Date: Fri, 1 Nov 2013 08:16:52 -0400 Subject: [Python-ideas] Support os.path.join for Windows paths on Posix References: Message-ID: <20131101081652.7800a56b@anarchist> On Oct 30, 2013, at 05:58 PM, Ryan Gonzalez wrote: >1.Python 3 doesn't come with Ubuntu $ apt-get install python3 Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From oscar.j.benjamin at gmail.com Fri Nov 1 14:00:28 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Fri, 1 Nov 2013 13:00:28 +0000 Subject: [Python-ideas] Where did we go wrong with negative stride? In-Reply-To: <20131101101752.GE18730@ando> References: <52719D47.1010507@mrabarnett.plus.com> <5271B415.8080607@mrabarnett.plus.com> <5271B5B2.7050209@stoneleaf.us> <20131101101752.GE18730@ando> Message-ID: On 1 November 2013 10:17, Steven D'Aprano wrote: > On Thu, Oct 31, 2013 at 03:37:36PM +0200, ????? wrote: > >> Perhaps we can get this End object by adding two tokens: ":-" and "[-". So >> >> a[-3:-5] == a[slice(End-3, End-5, None)] > > That's ambiguous. Consider: > > a[-2] > > Is that a dict lookup with key -2, or a list indexed with End-2? Or > worse, a dict lookup with key len(a)-2. I've been thinking about that. The End object would have to be unhashable just like slice objects: $ python3 Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> d = {} >>> d[1:2] Traceback (most recent call last): File "", line 1, in TypeError: unhashable type: 'slice' > To break the ambiguity, we'd need a rule that End objects can only occur > inside slices with at least one colon. This would defeat much of the point of having a new notation. The problem with negative wraparound applies just as much to ordinary indexing as to slicing. > But, I think that means that the > parser would have to look ahead to see whether it was within a slice or > not, and that might not be possible with Python's parser. If you've tried to write a parser for Python expressions you'll know that it takes a lot of special casing to handle slices anyway. (It would be simpler if slices were a valid expression in their own right). > Even if were, > it's still a special case that -2 means something different inside a > slice than outside a slice, and we know what the Zen of Python says > about special cases. > > >> although it will turn a[-3] into a[End-3]. I don't think it's a >> problem if the latter will behave in the same way as the former (i.e >> End-3 be a subtype of int). > > So what would (End-3)*10 return? How about End & 0xF ? I would expect End to just behave like an integer in the index/slice expression. [snip] > > All this seems very clever, but as far as I'm concerned, it's too > clever. 
I don't like objects which are context-sensitive. Given: > > x = End-2 > > then in this context, x behaves like 4: > > "abcdef"[x:] > > while in this context, x behaves like 0: > > "abcd"[x:] > > I really don't like that. That makes it hard to reason about code. I think that's a good argument for not having a magic object. Matlab doesn't allow you to use the keyword 'end' outside of an index/slice expression (well actually it does but it's used to signify the end of a block rather than as a magic object). Note that it's currently hard to reason about something like "abcde"[x:] because you need to know the sign of x to understand what it does. > End seems to me to be an extremely sophisticated object, far too > sophisticated for slicing, which really ought to be conceptually and > practically a simple operation. I would not have to like to explain this > to beginners to Python. I especially would not like to explain how it > works. (How would it work?) I think the only answer is, "It's magic". I > think it is something that would be right at home in PHP or Perl, and I > don't mean that as an insult, but only that it's not a good fit to > Python. I agree that the magic End object is a probably a bad idea on the basis that it should never be used outside of a slice/index expression. I'm still thinking about what would be a good, backward-compatible way of implementing something that achieves the desired semantics. Oscar From elazarg at gmail.com Fri Nov 1 14:26:32 2013 From: elazarg at gmail.com (=?UTF-8?B?15DXnNei15bXqA==?=) Date: Fri, 1 Nov 2013 15:26:32 +0200 Subject: [Python-ideas] Where did we go wrong with negative stride? In-Reply-To: References: <52719D47.1010507@mrabarnett.plus.com> <5271B415.8080607@mrabarnett.plus.com> <5271B5B2.7050209@stoneleaf.us> <20131101101752.GE18730@ando> Message-ID: 2013/11/1 Oscar Benjamin : > On 1 November 2013 10:17, Steven D'Aprano wrote: >> To break the ambiguity, we'd need a rule that End objects can only occur >> inside slices with at least one colon. > > This would defeat much of the point of having a new notation. The > problem with negative wraparound applies just as much to ordinary > indexing as to slicing. > Not exactly as much. taking x[:-0] # intention: x[:len(x)] is a reasonable, which happens to fail in Python. While x[-0] # intention: x[len(x)] is an error in the first place, which happens not to raise an Exception in Python, but rather gives you a wrong result. Elazar From oscar.j.benjamin at gmail.com Fri Nov 1 15:21:05 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Fri, 1 Nov 2013 14:21:05 +0000 Subject: [Python-ideas] Where did we go wrong with negative stride? In-Reply-To: References: <52719D47.1010507@mrabarnett.plus.com> <5271B415.8080607@mrabarnett.plus.com> <5271B5B2.7050209@stoneleaf.us> <20131101101752.GE18730@ando> Message-ID: On 1 November 2013 13:26, ????? wrote: > 2013/11/1 Oscar Benjamin : >> On 1 November 2013 10:17, Steven D'Aprano wrote: >>> To break the ambiguity, we'd need a rule that End objects can only occur >>> inside slices with at least one colon. >> >> This would defeat much of the point of having a new notation. The >> problem with negative wraparound applies just as much to ordinary >> indexing as to slicing. >> > Not exactly as much. taking > > x[:-0] # intention: x[:len(x)] > > is a reasonable, which happens to fail in Python. While > > x[-0] # intention: x[len(x)] > > is an error in the first place, which happens not to raise an > Exception in Python, but rather gives you a wrong result. 
I'm not really sure what you mean by this so I'll clarify what I mean: If I write x[-n] then my intention is that n should always be positive. If n happens to be zero or negative then I really want an IndexError but I won't get one because it coincidentally has an alternate meaning. I would rather be able to spell that as x[end-n] and have any negative index be an error. While I can write x[len(x)-n] right now it still does the wrong thing when n>len(x). Oscar From rymg19 at gmail.com Fri Nov 1 15:39:14 2013 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Fri, 1 Nov 2013 09:39:14 -0500 Subject: [Python-ideas] Support os.path.join for Windows paths on Posix In-Reply-To: References: Message-ID: I've got Ubuntu 12. Plus, it takes too long to type all that. And I've filled up my disk with stuff as it is(RPython/PyPy, C--, LLVM, Clang, BFF, CodeBlocks, IDLE, etc.) On Fri, Nov 1, 2013 at 3:21 AM, Niki Spahiev wrote: > On 31.10.2013 00:58, Ryan Gonzalez wrote: > >> 1.Python 3 doesn't come with Ubuntu >> > > ?!? Ubuntu 13 is based on python 3. > > HTH > Niki > > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/**mailman/listinfo/python-ideas > -- Ryan -------------- next part -------------- An HTML attachment was scrubbed... URL: From skip at pobox.com Fri Nov 1 19:15:46 2013 From: skip at pobox.com (Skip Montanaro) Date: Fri, 1 Nov 2013 13:15:46 -0500 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent Message-ID: http://bugs.python.org/issue19475 I was told to come here. Is there some reason that datetime objects' __str__ and isoformat methods shouldn't emit microseconds in all cases for consistency? As things stand, datetime.strptime() can't reliably parse what those methods emit. Details on the above (close) ticket. Here's what I'm talkin' 'bout: >>> import datetime >>> dt1 = datetime.datetime(2013, 10, 30, 14, 26, 50) >>> dt2 = datetime.datetime(2013, 10, 30, 14, 26, 50, 1234) >>> datetime.datetime.strptime(str(dt1), "%Y-%m-%d %H:%M:%S") datetime.datetime(2013, 10, 30, 14, 26, 50) >>> datetime.datetime.strptime(str(dt2), "%Y-%m-%d %H:%M:%S") Traceback (most recent call last): File "", line 1, in File "/usr/bin/python27/lib/python2.7/_strptime.py", line 328, in _strptime data_string[found.end():]) ValueError: unconverted data remains: .001234 >>> datetime.datetime.strptime(str(dt2), "%Y-%m-%d %H:%M:%S.%f") datetime.datetime(2013, 10, 30, 14, 26, 50, 1234) >>> datetime.datetime.strptime(str(dt1), "%Y-%m-%d %H:%M:%S.%f") Traceback (most recent call last): File "", line 1, in File "/usr/bin/python27/lib/python2.7/_strptime.py", line 325, in _strptime (data_string, format)) ValueError: time data '2013-10-30 14:26:50' does not match format '%Y-%m-%d %H:%M:%S.%f' Do others agree with me that consistency in this situation is better than the current behavior? Thx, Skip Montanaro From ethan at stoneleaf.us Fri Nov 1 20:49:35 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 01 Nov 2013 12:49:35 -0700 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: References: Message-ID: <527405CF.3070309@stoneleaf.us> On 11/01/2013 11:15 AM, Skip Montanaro wrote: > http://bugs.python.org/issue19475 > > Do others agree with me that consistency in this situation is better than > the current behavior? 
I haven't seen the arguments in favor of this awkward behavior, so I may change my mind, but at the moment I would certainly argue for consistency: either emit microseconds in __str__ or ignore remaining microseconds or ignore a trailing %f (or all three ;) . -- ~Ethan~ From skip at pobox.com Fri Nov 1 21:29:10 2013 From: skip at pobox.com (Skip Montanaro) Date: Fri, 1 Nov 2013 15:29:10 -0500 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: <527405CF.3070309@stoneleaf.us> References: <527405CF.3070309@stoneleaf.us> Message-ID: > I haven't seen the arguments in favor of this awkward behavior, so I may > change my mind, but at the moment I would certainly argue for consistency: > either emit microseconds in __str__ or ignore remaining microseconds or > ignore a trailing %f (or all three ;) . Thanks for the response. I relented after seeing comments from Guido and Tim. It is highly unlikely that I'd be able to sway the major devs. Modifying __str__ is a complete non-starter, and Guido pushed back a bit on the notion of making isoformat() include microseconds. (He suggested maybe adding an "include microseconds" flag, which I think would be as bad as the current behavior. I will preformat my datetime objects in code that writes them out so I avoid the automatic stringification done by the csv module. Skip From python at mrabarnett.plus.com Fri Nov 1 21:37:23 2013 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 01 Nov 2013 20:37:23 +0000 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: <527405CF.3070309@stoneleaf.us> References: <527405CF.3070309@stoneleaf.us> Message-ID: <52741103.5050201@mrabarnett.plus.com> On 01/11/2013 19:49, Ethan Furman wrote: > On 11/01/2013 11:15 AM, Skip Montanaro wrote: >> http://bugs.python.org/issue19475 >> >> Do others agree with me that consistency in this situation is better than >> the current behavior? > > I haven't seen the arguments in favor of this awkward behavior, so I may change my mind, but at the moment I would > certainly argue for consistency: either emit microseconds in __str__ or ignore remaining microseconds or ignore a > trailing %f (or all three ;) . > Suppose there were, say, 900000 microseconds, i.e. 0.9 seconds? If the microseconds aren't shown by __str__, should it truncate or round? When parsing and there's no %f, should it ignore the microseconds, thus effectively truncating, or parse and round if necessary? From ethan at stoneleaf.us Fri Nov 1 21:22:18 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 01 Nov 2013 13:22:18 -0700 Subject: [Python-ideas] Support os.path.join for Windows paths on Posix In-Reply-To: References: Message-ID: <52740D7A.2030100@stoneleaf.us> On 11/01/2013 07:39 AM, Ryan Gonzalez wrote: > I've got Ubuntu 12. Plus, it takes too long to type all that. It takes too long to say "Python 3 doesn't come with Ubuntu 12"? Three extra characters to be accurate and correct, but that was too big a burden for you? -- ~Ethan~ From barry at python.org Fri Nov 1 22:26:44 2013 From: barry at python.org (Barry Warsaw) Date: Fri, 1 Nov 2013 17:26:44 -0400 Subject: [Python-ideas] Support os.path.join for Windows paths on Posix References: <52740D7A.2030100@stoneleaf.us> Message-ID: <20131101172644.43f8e683@anarchist> On Nov 01, 2013, at 01:22 PM, Ethan Furman wrote: >On 11/01/2013 07:39 AM, Ryan Gonzalez wrote: >> I've got Ubuntu 12. Plus, it takes too long to type all that. 
>It takes too long to say "Python 3 doesn't come with Ubuntu 12"? Three extra >characters to be accurate and correct, but that was too big a burden for you? Well actually, I'm not even sure what "Ubuntu 12" is. Is that 12.04 or 12.10? My guess is 12.04 LTS, but it's just a guess. ;) In any case, a version of Python 3 has been available in Ubuntu since at least 2010: % rmadison python3 python3 | 3.1.2-0ubuntu1 | lucid | all python3 | 3.2.3-0ubuntu1 | precise | all python3 | 3.2.3-0ubuntu1.2 | precise-updates | amd64, armel, armhf, i386, powerpc python3 | 3.2.3-5ubuntu1 | quantal | all python3 | 3.2.3-5ubuntu1.2 | quantal-updates | amd64, armel, armhf, i386, powerpc python3 | 3.3.1-0ubuntu1 | raring | amd64, armhf, i386, powerpc python3 | 3.3.2-14ubuntu1 | saucy | amd64, arm64, armhf, i386, powerpc python3 | 3.3.2-14ubuntu1 | trusty | amd64, arm64, armhf, i386, powerpc Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From greg.ewing at canterbury.ac.nz Fri Nov 1 22:45:36 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 02 Nov 2013 10:45:36 +1300 Subject: [Python-ideas] Where did we go wrong with negative stride? In-Reply-To: References: <52719D47.1010507@mrabarnett.plus.com> <5271B415.8080607@mrabarnett.plus.com> <5271B5B2.7050209@stoneleaf.us> <20131101101752.GE18730@ando> Message-ID: <52742100.1050905@canterbury.ac.nz> ????? wrote: > x[-0] # intention: x[len(x)] > > is an error in the first place, which happens not to raise an > Exception in Python, but rather gives you a wrong result. Using x[End-n] would allow an exception to be properly raised when n is out of bounds instead of spuriously wrapping around. -- Greg From ethan at stoneleaf.us Fri Nov 1 22:29:16 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 01 Nov 2013 14:29:16 -0700 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: <52741103.5050201@mrabarnett.plus.com> References: <527405CF.3070309@stoneleaf.us> <52741103.5050201@mrabarnett.plus.com> Message-ID: <52741D2C.5040206@stoneleaf.us> On 11/01/2013 01:37 PM, MRAB wrote: > On 01/11/2013 19:49, Ethan Furman wrote: >> On 11/01/2013 11:15 AM, Skip Montanaro wrote: >>> >>> http://bugs.python.org/issue19475 >>> >>> Do others agree with me that consistency in this situation is better than >>> the current behavior? >> >> I haven't seen the arguments in favor of this awkward behavior, so I may change my mind, but at the moment I would >> certainly argue for consistency: either emit microseconds in __str__ or ignore remaining microseconds or ignore a >> trailing %f (or all three ;) . >> > Suppose there were, say, 900000 microseconds, i.e. 0.9 seconds? > > If the microseconds aren't shown by __str__, should it truncate or > round? My first (and preferred) option was to emit the microseconds (even if 0). At the heart of this issue is: If datetime.datetime is not going to always emit the microseconds, then there should be an easy roundtrip method to get the value back; currently there is not. Maybe the best fix is to make the csv module smarter about how it outputs datetime's (so it would always emit microseconds). 
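In the meantime the round trip is easy enough to fake with a small helper that just tries both patterns (a sketch of the workaround only, not a proposed API; the helper name is made up, and naive datetimes are assumed):

import datetime

def parse_default_str(s):
    # Accept str(datetime) output, with or without the trailing .%f part.
    for fmt in ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S"):
        try:
            return datetime.datetime.strptime(s, fmt)
        except ValueError:
            pass
    raise ValueError("not in str(datetime) format: %r" % (s,))

It works, but it's exactly the kind of boilerplate the class could provide itself.
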
-- ~Ethan~ From breamoreboy at yahoo.co.uk Fri Nov 1 23:14:56 2013 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Fri, 01 Nov 2013 22:14:56 +0000 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: <52741D2C.5040206@stoneleaf.us> References: <527405CF.3070309@stoneleaf.us> <52741103.5050201@mrabarnett.plus.com> <52741D2C.5040206@stoneleaf.us> Message-ID: On 01/11/2013 21:29, Ethan Furman wrote: > On 11/01/2013 01:37 PM, MRAB wrote: >> On 01/11/2013 19:49, Ethan Furman wrote: >>> On 11/01/2013 11:15 AM, Skip Montanaro wrote: >>>> >>>> http://bugs.python.org/issue19475 >>>> >>>> Do others agree with me that consistency in this situation is better >>>> than >>>> the current behavior? >>> >>> I haven't seen the arguments in favor of this awkward behavior, so I >>> may change my mind, but at the moment I would >>> certainly argue for consistency: either emit microseconds in __str__ >>> or ignore remaining microseconds or ignore a >>> trailing %f (or all three ;) . >>> >> Suppose there were, say, 900000 microseconds, i.e. 0.9 seconds? >> >> If the microseconds aren't shown by __str__, should it truncate or >> round? > > My first (and preferred) option was to emit the microseconds (even if 0). > > At the heart of this issue is: If datetime.datetime is not going to > always emit the microseconds, then there should be an easy roundtrip > method to get the value back; currently there is not. > > Maybe the best fix is to make the csv module smarter about how it > outputs datetime's (so it would always emit microseconds). > > -- > ~Ethan~ The first solution if anything please, the latter is the road to hell. "You've done it for datetime objects, so why not xyz?". "A similar thing was done to the csv module, so why not the ijk module?". -- Python is the second best programming language in the world. But the best has yet to be invented. Christian Tismer Mark Lawrence From tjreedy at udel.edu Fri Nov 1 23:16:07 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 01 Nov 2013 18:16:07 -0400 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: References: <527405CF.3070309@stoneleaf.us> Message-ID: On 11/1/2013 4:29 PM, Skip Montanaro wrote: > Thanks for the response. I relented after seeing comments from Guido > and Tim. It is highly unlikely that I'd be able to sway the major > devs. Modifying __str__ is a complete non-starter, and Guido pushed > back a bit on the notion of making isoformat() include microseconds. Having repr() always include microseconds makes more sense to me. I have no idea what it does since it was not discussed. > (He suggested maybe adding an "include microseconds" flag, which I > think would be as bad as the current behavior. Why? Seems like a good solution to me. -- Terry Jan Reedy From eric at trueblade.com Fri Nov 1 23:20:48 2013 From: eric at trueblade.com (Eric V. 
Smith) Date: Fri, 01 Nov 2013 18:20:48 -0400 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: References: <527405CF.3070309@stoneleaf.us> <52741103.5050201@mrabarnett.plus.com> <52741D2C.5040206@stoneleaf.us> Message-ID: <52742940.8000808@trueblade.com> On 11/1/2013 6:14 PM, Mark Lawrence wrote: > On 01/11/2013 21:29, Ethan Furman wrote: >> On 11/01/2013 01:37 PM, MRAB wrote: >>> On 01/11/2013 19:49, Ethan Furman wrote: >>>> On 11/01/2013 11:15 AM, Skip Montanaro wrote: >>>>> >>>>> http://bugs.python.org/issue19475 >>>>> >>>>> Do others agree with me that consistency in this situation is better >>>>> than >>>>> the current behavior? >>>> >>>> I haven't seen the arguments in favor of this awkward behavior, so I >>>> may change my mind, but at the moment I would >>>> certainly argue for consistency: either emit microseconds in __str__ >>>> or ignore remaining microseconds or ignore a >>>> trailing %f (or all three ;) . >>>> >>> Suppose there were, say, 900000 microseconds, i.e. 0.9 seconds? >>> >>> If the microseconds aren't shown by __str__, should it truncate or >>> round? >> >> My first (and preferred) option was to emit the microseconds (even if 0). >> >> At the heart of this issue is: If datetime.datetime is not going to >> always emit the microseconds, then there should be an easy roundtrip >> method to get the value back; currently there is not. >> >> Maybe the best fix is to make the csv module smarter about how it >> outputs datetime's (so it would always emit microseconds). >> >> -- >> ~Ethan~ > > The first solution if anything please, the latter is the road to hell. > "You've done it for datetime objects, so why not xyz?". "A similar > thing was done to the csv module, so why not the ijk module?". But since csv calls str(value), and __str__ can't be changed due to backward compatibility constraints, nothing can be done, right? Eric. From ethan at stoneleaf.us Fri Nov 1 22:51:38 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 01 Nov 2013 14:51:38 -0700 Subject: [Python-ideas] Round-tripping datetime's str Message-ID: <5274226A.10306@stoneleaf.us> Issue15873 [1] talks about adding the ability for datetime.datetime (and friends?) to accept a string to create the relevant object (and not just numbers). Thoughts? [1] http://bugs.python.org/issue15873 -- ~Ethan~ From abarnert at yahoo.com Sat Nov 2 01:28:09 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 1 Nov 2013 17:28:09 -0700 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: <52741D2C.5040206@stoneleaf.us> References: <527405CF.3070309@stoneleaf.us> <52741103.5050201@mrabarnett.plus.com> <52741D2C.5040206@stoneleaf.us> Message-ID: On Nov 1, 2013, at 14:29, Ethan Furman wrote: > On 11/01/2013 01:37 PM, MRAB wrote: >> On 01/11/2013 19:49, Ethan Furman wrote: >>> On 11/01/2013 11:15 AM, Skip Montanaro wrote: >>>> >>>> http://bugs.python.org/issue19475 >>>> >>>> Do others agree with me that consistency in this situation is better than >>>> the current behavior? >>> >>> I haven't seen the arguments in favor of this awkward behavior, so I may change my mind, but at the moment I would >>> certainly argue for consistency: either emit microseconds in __str__ or ignore remaining microseconds or ignore a >>> trailing %f (or all three ;) . >> Suppose there were, say, 900000 microseconds, i.e. 0.9 seconds? >> >> If the microseconds aren't shown by __str__, should it truncate or >> round? 
> > My first (and preferred) option was to emit the microseconds (even if 0). > > At the heart of this issue is: If datetime.datetime is not going to always emit the microseconds, then there should be an easy roundtrip method to get the value back; currently there is not. > > Maybe the best fix is to make the csv module smarter about how it outputs datetime's (so it would always emit microseconds). Why should the round trip involve strftime? What about just adding a fromisoformat classmethod constructor that's meant specifically for round tripping isoformat? That means if isoformat sometimes emits microseconds and sometimes doesn't, fromisoformat can take both strings with microseconds and those without. (In fact, there's no reason it couldn't be even more flexible and handle all of the valid variations of ISO format, not just the ones isoformat generates...) From abarnert at yahoo.com Sat Nov 2 01:33:37 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 1 Nov 2013 17:33:37 -0700 Subject: [Python-ideas] Round-tripping datetime's str In-Reply-To: <5274226A.10306@stoneleaf.us> References: <5274226A.10306@stoneleaf.us> Message-ID: On Nov 1, 2013, at 14:51, Ethan Furman wrote: > Issue15873 [1] talks about adding the ability for datetime.datetime (and friends?) to accept a string to create the relevant object (and not just numbers). > > Thoughts? It looks like this is exactly what I just suggested on your other thread, except that the original version at the top of the feature request suggested doing it automatically in __new__, and it's only farther down the comments that people suggested a fromisoformat classmethod instead. So, as you might expect, I like this idea, but I like it as a classmethod rather than magically in __new__. > > > > [1] http://bugs.python.org/issue15873 > > -- > ~Ethan~ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas From alexander.belopolsky at gmail.com Sat Nov 2 01:36:07 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 1 Nov 2013 20:36:07 -0400 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: References: <527405CF.3070309@stoneleaf.us> <52741103.5050201@mrabarnett.plus.com> <52741D2C.5040206@stoneleaf.us> Message-ID: On Fri, Nov 1, 2013 at 8:28 PM, Andrew Barnert wrote: > What about just adding a fromisoformat classmethod constructor that's > meant specifically for round tripping isoformat? What about following the lead of int(), float(), etc. and allow datetime() take a single string argument? -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Sat Nov 2 02:52:43 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 1 Nov 2013 18:52:43 -0700 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: References: <527405CF.3070309@stoneleaf.us> <52741103.5050201@mrabarnett.plus.com> <52741D2C.5040206@stoneleaf.us> Message-ID: <8008EEDA-85DF-4B38-B814-70836BB96925@yahoo.com> On Nov 1, 2013, at 17:36, Alexander Belopolsky wrote: > > On Fri, Nov 1, 2013 at 8:28 PM, Andrew Barnert wrote: >> What about just adding a fromisoformat classmethod constructor that's meant specifically for round tripping isoformat? > > What about following the lead of int(), float(), etc. and allow datetime() take a single string argument? I have three (not entirely separate) concerns. 
The first is that this would imply that ISO is _the_ format for datetimes, rather than just _a_ format. If you ask a normal person for an integer, unless he's got the Super Bowl in the brain, int will parse his input. Ask him for a date, and it's very unlikely it'll be in ISO format. The second problem is false positives. "2003" is a valid ISO date string, equivalent to "20030101" or "20030101T00:00:00". But in most contexts you wouldn't want that interpreted as a valid date or datetime. John Nagle's comment on the tracker (http://bugs.python.org/issue15873#msg169966) explains a similar concern. Finally, I'd be happy with a fromisoformat that _only_ handled the output from the isoformat function, but a general constructor would seem incomplete unless it handled all of ISO 8601, or at least all of RFC 3339. That isn't exactly _hard_ (the RFC was meant to be implementable, after all), but it's a higher bar than needed for the problem that started this thread. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sat Nov 2 03:17:18 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 1 Nov 2013 22:17:18 -0400 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: <8008EEDA-85DF-4B38-B814-70836BB96925@yahoo.com> References: <527405CF.3070309@stoneleaf.us> <52741103.5050201@mrabarnett.plus.com> <52741D2C.5040206@stoneleaf.us> <8008EEDA-85DF-4B38-B814-70836BB96925@yahoo.com> Message-ID: On Fri, Nov 1, 2013 at 9:52 PM, Andrew Barnert wrote: > The first is that this would imply that ISO is _the_ format for datetimes, > rather than just _a_ format. If you ask a normal person for an integer, > unless he's got the Super Bowl in the brain, int will parse his input. Ask > him for a date, and it's very unlikely it'll be in ISO format. > I don't see how this is a problem. *The* format for Python datetime is the output of print: >>> from datetime import datetime >>> print(datetime.now()) 2013-11-01 22:06:47.774767 It also happened to be ISO compliant. > The second problem is false positives. "2003" is a valid ISO date string, > equivalent to "20030101" or "20030101T00:00:00". But in most contexts you > wouldn't want that interpreted as a valid date or datetime. John Nagle's > comment on the tracker (http://bugs.python.org/issue15873#msg169966) > explains a similar concern. > I would start with date/datetime() just accepting str(x) as input for any date/datetime instance. Full-featured ISO-compliant date/time parsing is better left for third-party packages. > Finally, I'd be happy with a fromisoformat that _only_ handled the output > from the isoformat function, but a general constructor would seem > incomplete unless it handled all of ISO 8601, or at least all of RFC 3339. > That isn't exactly _hard_ (the RFC was meant to be implementable, after > all), but it's a higher bar than needed for the problem that started this > thread. > As long as the general constructor can handle str() output it seems to be complete to me. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ethan at stoneleaf.us Sat Nov 2 03:02:31 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 01 Nov 2013 19:02:31 -0700 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: <8008EEDA-85DF-4B38-B814-70836BB96925@yahoo.com> References: <527405CF.3070309@stoneleaf.us> <52741103.5050201@mrabarnett.plus.com> <52741D2C.5040206@stoneleaf.us> <8008EEDA-85DF-4B38-B814-70836BB96925@yahoo.com> Message-ID: <52745D37.5080806@stoneleaf.us> On 11/01/2013 06:52 PM, Andrew Barnert wrote: > On Nov 1, 2013, at 17:36, Alexander Belopolsky wrote: >> On Fri, Nov 1, 2013 at 8:28 PM, Andrew Barnert wrote: >>> >>> What about just adding a fromisoformat classmethod constructor that's meant specifically for round tripping isoformat? >> >> What about following the lead of int(), float(), etc. and allow datetime() take a single string argument? > > [snip] > > Finally, I'd be happy with a fromisoformat that _only_ handled the output from the isoformat function [snip] I'd be happy with having __new__ accept a string in whatever format __str__ outputs (not isoformat or anything else, just __str__). -- ~Ethan~ From ron3200 at gmail.com Sat Nov 2 03:44:47 2013 From: ron3200 at gmail.com (ron adam) Date: Fri, 01 Nov 2013 21:44:47 -0500 Subject: [Python-ideas] Where did we go wrong with negative stride? In-Reply-To: References: <526D4FF7.6010106@mrabarnett.plus.com> <526DA58B.7080504@canterbury.ac.nz> <526F2EE0.9010705@mrabarnett.plus.com> <52701846.2070604@mrabarnett.plus.com> Message-ID: On Wed, 30 Oct 2013 20:22:22 +1000, Nick Coghlan wrote: > Regardless, my main point is this: slices are just objects. The syntax: > s[i:j:k] > is just syntactic sugar for: > s[slice(i, j, k)] > That means that until people have fully explored exactly the semantics > they want in terms of the existing object model, just as I did for > rslice(), then there are *zero* grounds to be discussing syntax > changes that provide those new semantics. (Sending from gmane on a limited tablet) I think it's sugar for a function that takes *args and, depending on its contents, makes a slice, or multiple slices, plus whatever is left over. In the case of a simple index it's just what's left over. Slice syntax is simple on purpose so that they can pass through more than just ints. It's the responsibility of the object that uses them to make sense of the slices. So it may be possible to pass a callable index modifier through too. def reversed(obj, slice_obj): ... return reversed_slice_obj a[i:j:k, reversed] If it's also passed the object to be sliced (self, from the __getitem__ method), it could set end values and maybe do other things, like control whether exceptions are raised or not. It's kind of like decorating a slice. For example, it could do an infinite wrap-around slice by normalizing the indices of the slice object. (There's a rough sketch of what I mean in the P.S. below.) I can't test it on this tablet, unfortunately. Cheers, Ron Adam.
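P.S. Here's the kind of thing I have in mind, as an untested sketch (all of the names are made up for illustration):

class DecoratedSlicing:
    # Sequence wrapper whose __getitem__ accepts an optional callable
    # after the index and lets it rewrite the slice before indexing.
    def __init__(self, data):
        self.data = list(data)

    def __getitem__(self, item):
        if isinstance(item, tuple) and callable(item[-1]):
            *indices, modifier = item
            index = indices[0] if indices else slice(None)
            index = modifier(self.data, index)  # callable sees the data and the slice
        else:
            index = item
        return self.data[index]

def reversed_slice(obj, s):
    # Normalise the slice against len(obj), then flip its direction.
    # (Sketch only: assumes a forward, positive-step slice.)
    start, stop, step = s.indices(len(obj))
    last = start + (stop - start - 1) // step * step
    return slice(last, start - 1 if start > 0 else None, -step)

a = DecoratedSlicing("abcdef")
print(a[1:4, reversed_slice])  # -> ['d', 'c', 'b']
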
From python at mrabarnett.plus.com Sat Nov 2 03:49:17 2013 From: python at mrabarnett.plus.com (MRAB) Date: Sat, 02 Nov 2013 02:49:17 +0000 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: References: <527405CF.3070309@stoneleaf.us> <52741103.5050201@mrabarnett.plus.com> <52741D2C.5040206@stoneleaf.us> <8008EEDA-85DF-4B38-B814-70836BB96925@yahoo.com> Message-ID: <5274682D.3040401@mrabarnett.plus.com> On 02/11/2013 02:17, Alexander Belopolsky wrote: > > On Fri, Nov 1, 2013 at 9:52 PM, Andrew Barnert > wrote: > > The first is that this would imply that ISO is _the_ format for > datetimes, rather than just _a_ format. If you ask a normal person > for an integer, unless he's got the Super Bowl in the brain, int > will parse his input. Ask him for a date, and it's very unlikely > it'll be in ISO format. > > > I don't see how this is a problem. *The* format for Python datetime is > the output of print: > > >>> from datetime import datetime > >>> print(datetime.now()) > 2013-11-01 22:06:47.774767 > > It also happened to be ISO compliant. > > > > The second problem is false positives. "2003" is a valid ISO date > string, equivalent to "20030101" or "20030101T00:00:00". But in most > contexts you wouldn't want that interpreted as a valid date or > datetime. John Nagle's comment on the tracker > (http://bugs.python.org/issue15873#msg169966) explains a similar > concern. > > > I would start with date/datetime() just accepting str(x) as input for > any date/datetime instance. Full-featured ISO-compliant date/time > parsing is better left for third-party packages. > > Finally, I'd be happy with a fromisoformat that _only_ handled the > output from the isoformat function, but a general constructor would > seem incomplete unless it handled all of ISO 8601, or at least all > of RFC 3339. That isn't exactly _hard_ (the RFC was meant to be > implementable, after all), but it's a higher bar than needed for the > problem that started this thread. > > > As long as the general constructor can handle str() output it seems to > be complete to me. > +1 Guaranteeing round-trip does seem like a reasonable behaviour to me. For anything else, be explicit with strftime and strptime. From ncoghlan at gmail.com Sat Nov 2 10:25:29 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 2 Nov 2013 19:25:29 +1000 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: <5274682D.3040401@mrabarnett.plus.com> References: <527405CF.3070309@stoneleaf.us> <52741103.5050201@mrabarnett.plus.com> <52741D2C.5040206@stoneleaf.us> <8008EEDA-85DF-4B38-B814-70836BB96925@yahoo.com> <5274682D.3040401@mrabarnett.plus.com> Message-ID: On 2 November 2013 12:49, MRAB wrote: > On 02/11/2013 02:17, Alexander Belopolsky wrote: > +1 > > Guaranteeing round-trip does seem like a reasonable behaviour to me. It's been a while since I had to deal seriously with date parsing, but at the time, emitting microseconds was a fairly surefire way to break most utilities that nominally supported date parsing. Roundtripping is good, but interoperability is important too, and as far as I am aware, microsecond support when parsing is still sketchy with many date parsing tools. 
Ensuring that emitting and consuming microseconds is easy would definitely be a good thing, but unless general date parsing support (not just in Python, but in programming utilities in general) has improved more dramatically in recent years than I believe it has, emitting microseconds by default would be a backwards compatibility breach. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From masklinn at masklinn.net Sat Nov 2 14:13:01 2013 From: masklinn at masklinn.net (Masklinn) Date: Sat, 2 Nov 2013 14:13:01 +0100 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: <8008EEDA-85DF-4B38-B814-70836BB96925@yahoo.com> References: <527405CF.3070309@stoneleaf.us> <52741103.5050201@mrabarnett.plus.com> <52741D2C.5040206@stoneleaf.us> <8008EEDA-85DF-4B38-B814-70836BB96925@yahoo.com> Message-ID: On 2013-11-02, at 02:52 , Andrew Barnert wrote: > On Nov 1, 2013, at 17:36, Alexander Belopolsky wrote: > >> >> On Fri, Nov 1, 2013 at 8:28 PM, Andrew Barnert wrote: >> What about just adding a fromisoformat classmethod constructor that's meant specifically for round tripping isoformat? >> >> What about following the lead of int(), float(), etc. and allow datetime() take a single string argument? > > I have three (not entirely separate) concerns. > > The first is that this would imply that ISO is _the_ format for datetimes, rather than just _a_ format. If you ask a normal person for an integer, unless he?s got the Super Bowl in the brain, int will parse his input. Right, now try that with a float and watch `float(s)` blow up when most europeans give you `4,63` or something along those lines. Or ask for a large integer, and notice that int really does not like decimal separators. From masklinn at masklinn.net Sat Nov 2 16:20:42 2013 From: masklinn at masklinn.net (Masklinn) Date: Sat, 2 Nov 2013 16:20:42 +0100 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: References: <527405CF.3070309@stoneleaf.us> <52741103.5050201@mrabarnett.plus.com> <52741D2C.5040206@stoneleaf.us> <8008EEDA-85DF-4B38-B814-70836BB96925@yahoo.com> Message-ID: <77D26847-FD41-410C-8218-AF485446358E@masklinn.net> On 2013-11-02, at 14:13 , Masklinn wrote: > Or ask for a large integer, and notice that int really does not like > decimal separators. (and by ?decimal? I meant ?thousands?, sorry about that) From ncoghlan at gmail.com Sat Nov 2 16:53:22 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 3 Nov 2013 01:53:22 +1000 Subject: [Python-ideas] PEP 451 (import API refactoring) and C extension loading Message-ID: Eric Snow's PEP 451 that refactors the import plugin API to move more boilerplate responsibility to the import system has come along nicely, and is now almost certain to make the beta 1 deadline. One of the guiding elements of that design has been the discussion we had a while ago on this list about making more information available to C extensions about how they're being loaded - the ModuleSpec objects in PEP 451 are a key piece in fulfilling that goal. However, I'm *not* confident it's going to be possible to work through all the issues involved in actually refactoring the dynamic loading infrastructure to use the new plugin model in the time we have left before feature freeze, particularly since getting the code aspects of PEP 451 completely bedded down is going to take some time (e.g. 
it opens up additional possibilities for the runpy API), and there are several other things that still need to happen prior to the beta (like getting ensurepip committed and completing the follow-on work). So, here's my current thinking: the PEP 451 refactoring makes it far more feasible to experiment with alternative C extension loading techniques through cffi, so I'd like to postpone actually changing the default extension loading system to Python 3.5. The idea of adding cffi itself to the standard library has been discussed, in which case we might even be able to move all of that ugly extension management code out of C and into Python permanently, rather than just as part of third party experiments. Thoughts? (cc'ing Stefan directly, since Cython is the most likely venue for such experiments) Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From stefan_ml at behnel.de Sat Nov 2 17:44:42 2013 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sat, 02 Nov 2013 17:44:42 +0100 Subject: [Python-ideas] PEP 451 (import API refactoring) and C extension loading In-Reply-To: References: Message-ID: Nick Coghlan, 02.11.2013 16:53: > the PEP 451 refactoring makes it far > more feasible to experiment with alternative C extension loading > techniques through cffi That's a funny idea. Doesn't sound completely infeasible at first sight, cffi is all about shared library loading. > so I'd like to postpone actually changing the > default extension loading system to Python 3.5. The idea of adding > cffi itself to the standard library has been discussed, in which case > we might even be able to move all of that ugly extension management > code out of C and into Python permanently, rather than just as part of > third party experiments. cffi itself would then have to be statically linked in, obviously. And be available on all platforms that CPython currently supports. Stefan From rymg19 at gmail.com Sat Nov 2 18:24:15 2013 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Sat, 2 Nov 2013 12:24:15 -0500 Subject: [Python-ideas] Support os.path.join for Windows paths on Posix In-Reply-To: <20131101172644.43f8e683@anarchist> References: <52740D7A.2030100@stoneleaf.us> <20131101172644.43f8e683@anarchist> Message-ID: I know, I know, I just don't use Python 3 enough to make installing it useful. Besides, as I said earlier, I just installed Ubuntu on Sunday, and I've already filled up 38% of my partition. On Fri, Nov 1, 2013 at 4:26 PM, Barry Warsaw wrote: > On Nov 01, 2013, at 01:22 PM, Ethan Furman wrote: > > >On 11/01/2013 07:39 AM, Ryan Gonzalez wrote: > >> I've got Ubuntu 12. Plus, it takes too long to type all that. > > >It takes too long to say "Python 3 doesn't come with Ubuntu 12"? Three > extra > >characters to be accurate and correct, but that was too big a burden for > you? > > Well actually, I'm not even sure what "Ubuntu 12" is. Is that 12.04 or > 12.10? > My guess is 12.04 LTS, but it's just a guess. 
;) > > In any case, a version of Python 3 has been available in Ubuntu since at > least > 2010: > > % rmadison python3 > python3 | 3.1.2-0ubuntu1 | lucid | all > python3 | 3.2.3-0ubuntu1 | precise | all > python3 | 3.2.3-0ubuntu1.2 | precise-updates | amd64, armel, armhf, > i386, powerpc > python3 | 3.2.3-5ubuntu1 | quantal | all > python3 | 3.2.3-5ubuntu1.2 | quantal-updates | amd64, armel, armhf, > i386, powerpc > python3 | 3.3.1-0ubuntu1 | raring | amd64, armhf, i386, powerpc > python3 | 3.3.2-14ubuntu1 | saucy | amd64, arm64, armhf, i386, > powerpc > python3 | 3.3.2-14ubuntu1 | trusty | amd64, arm64, armhf, i386, > powerpc > > Cheers, > -Barry > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- Ryan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sat Nov 2 20:43:51 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 02 Nov 2013 12:43:51 -0700 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: References: <527405CF.3070309@stoneleaf.us> <52741103.5050201@mrabarnett.plus.com> <52741D2C.5040206@stoneleaf.us> <8008EEDA-85DF-4B38-B814-70836BB96925@yahoo.com> <5274682D.3040401@mrabarnett.plus.com> Message-ID: <527555F7.8040905@stoneleaf.us> On 11/02/2013 02:25 AM, Nick Coghlan wrote: > On 2 November 2013 12:49, MRAB wrote: >> On 02/11/2013 02:17, Alexander Belopolsky wrote: >> +1 >> >> Guaranteeing round-trip does seem like a reasonable behaviour to me. > > It's been a while since I had to deal seriously with date parsing, but > at the time, emitting microseconds was a fairly surefire way to break > most utilities that nominally supported date parsing. Roundtripping is > good, but interoperability is important too, and as far as I am aware, > microsecond support when parsing is still sketchy with many date > parsing tools. > > Ensuring that emitting and consuming microseconds is easy would > definitely be a good thing, but unless general date parsing support > (not just in Python, but in programming utilities in general) has > improved more dramatically in recent years than I believe it has, > emitting microseconds by default would be a backwards compatibility > breach. The thread seems to be leaning towards leaving the current __str__ behavior as-is, and simply adding enough smarts to __new__ to be able to reconvert back to a date/datetime instance (whether or not microseconds have been emitted). -- ~Ethan~ From python at mrabarnett.plus.com Sat Nov 2 22:19:04 2013 From: python at mrabarnett.plus.com (MRAB) Date: Sat, 02 Nov 2013 21:19:04 +0000 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: <527555F7.8040905@stoneleaf.us> References: <527405CF.3070309@stoneleaf.us> <52741103.5050201@mrabarnett.plus.com> <52741D2C.5040206@stoneleaf.us> <8008EEDA-85DF-4B38-B814-70836BB96925@yahoo.com> <5274682D.3040401@mrabarnett.plus.com> <527555F7.8040905@stoneleaf.us> Message-ID: <52756C48.7080007@mrabarnett.plus.com> On 02/11/2013 19:43, Ethan Furman wrote: > On 11/02/2013 02:25 AM, Nick Coghlan wrote: >> On 2 November 2013 12:49, MRAB wrote: >>> On 02/11/2013 02:17, Alexander Belopolsky wrote: +1 >>> >>> Guaranteeing round-trip does seem like a reasonable behaviour to >>> me. 
>> >> It's been a while since I had to deal seriously with date parsing, >> but at the time, emitting microseconds was a fairly surefire way to >> break most utilities that nominally supported date parsing. >> Roundtripping is good, but interoperability is important too, and >> as far as I am aware, microsecond support when parsing is still >> sketchy with many date parsing tools. >> >> Ensuring that emitting and consuming microseconds is easy would >> definitely be a good thing, but unless general date parsing >> support (not just in Python, but in programming utilities in >> general) has improved more dramatically in recent years than I >> believe it has, emitting microseconds by default would be a >> backwards compatibility breach. > > The thread seems to be leaning towards leaving the current __str__ > behavior as-is, and simply adding enough smarts to __new__ to be able > to reconvert back to a date/datetime instance (whether or not > microseconds have been emitted). > The OP was using strptime, so perhaps the simplest solution would be to allow its 'format' argument to default to None, which would mean ISO format with optional microseconds. From ethan at stoneleaf.us Sat Nov 2 23:01:38 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 02 Nov 2013 15:01:38 -0700 Subject: [Python-ideas] Making datetime __str__ and isoformat more consistent In-Reply-To: <52756C48.7080007@mrabarnett.plus.com> References: <527405CF.3070309@stoneleaf.us> <52741103.5050201@mrabarnett.plus.com> <52741D2C.5040206@stoneleaf.us> <8008EEDA-85DF-4B38-B814-70836BB96925@yahoo.com> <5274682D.3040401@mrabarnett.plus.com> <527555F7.8040905@stoneleaf.us> <52756C48.7080007@mrabarnett.plus.com> Message-ID: <52757642.4070500@stoneleaf.us> On 11/02/2013 02:19 PM, MRAB wrote: > On 02/11/2013 19:43, Ethan Furman wrote: >> On 11/02/2013 02:25 AM, Nick Coghlan wrote: >>> >>> Ensuring that emitting and consuming microseconds is easy would >>> definitely be a good thing, but unless general date parsing >>> support (not just in Python, but in programming utilities in >>> general) has improved more dramatically in recent years than I >>> believe it has, emitting microseconds by default would be a >>> backwards compatibility breach. >> >> The thread seems to be leaning towards leaving the current __str__ >> behavior as-is, and simply adding enough smarts to __new__ to be able >> to reconvert back to a date/datetime instance (whether or not >> microseconds have been emitted). >> > The OP was using strptime, so perhaps the simplest solution would be to > allow its 'format' argument to default to None, which would mean ISO > format with optional microseconds. +1 with the caveat that None would mean whatever __str__ outputs (which at this time is coincidentally an ISO format). -- ~Ethan~ From ncoghlan at gmail.com Sun Nov 3 02:14:23 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 3 Nov 2013 11:14:23 +1000 Subject: [Python-ideas] PEP 451 (import API refactoring) and C extension loading In-Reply-To: References: Message-ID: On 3 Nov 2013 02:45, "Stefan Behnel" wrote: > > Nick Coghlan, 02.11.2013 16:53: > > the PEP 451 refactoring makes it far > > more feasible to experiment with alternative C extension loading > > techniques through cffi > > That's a funny idea. Doesn't sound completely infeasible at first sight, > cffi is all about shared library loading. > > > > so I'd like to postpone actually changing the > > default extension loading system to Python 3.5. 
The idea of adding > > cffi itself to the standard library has been discussed, in which case > > we might even be able to move all of that ugly extension management > > code out of C and into Python permanently, rather than just as part of > > third party experiments. > > cffi itself would then have to be statically linked in, obviously. And be > available on all platforms that CPython currently supports. Ah, true. Oh well, it's still a useful option for prototyping purposes :) Cheers, Nick. > > Stefan > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From random832 at fastmail.us Sun Nov 3 03:15:20 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Sat, 02 Nov 2013 22:15:20 -0400 Subject: [Python-ideas] os.path.join In-Reply-To: References: Message-ID: <1383444920.23282.42141045.2588B009@webmail.messagingengine.com> On Wed, Oct 30, 2013, at 13:06, Bruce Leban wrote: > >>> os.path.join(r'c:\abc', r'\def\g') # Windows paths > '\\def\\g' > > On Windows \def\g is a drive-relative path not an absolute path. To get > the > right result you need to do: > > >>> drive, path = os.path.splitdrive(r'c:\abc') > >>> drive + os.path.join(path, r'/def/g') > 'c:/def/g' What should it do in the opposite case: where the path is a relative path on another drive? I.e. os.path.join('c:\\abc','d:efg'). This is archaic, sure, but if we're going to say it's always equivalent to "as if the first argument were the cwd", this needs to be handled too. Fortunately, I don't think there's any way to specify this with a UNC share. From ryan at ryanhiebert.com Sun Nov 3 04:12:19 2013 From: ryan at ryanhiebert.com (Ryan Hiebert) Date: Sat, 2 Nov 2013 22:12:19 -0500 Subject: [Python-ideas] os.path.join In-Reply-To: <1383444920.23282.42141045.2588B009@webmail.messagingengine.com> References: <1383444920.23282.42141045.2588B009@webmail.messagingengine.com> Message-ID: IIRC correctly the form 'd:efg' refers to a relative path on the d: drive, based on the current working directory _of that drive_. Since it requires a default base path to be relative on, that form wouldn't work cross drives. However if we used the same drive, we might be willing to allow a relative path prefixed with a drive letter. os.path.join(r'c:\abc\foo', 'd:efg') # ERROR os.path.join(r'd:\abc\foo', 'd:efg') # returns r'd:\abc\efg' On Sat, Nov 2, 2013 at 9:15 PM, wrote: > On Wed, Oct 30, 2013, at 13:06, Bruce Leban wrote: >> >>> os.path.join(r'c:\abc', r'\def\g') # Windows paths >> '\\def\\g' >> >> On Windows \def\g is a drive-relative path not an absolute path. To get >> the >> right result you need to do: >> >> >>> drive, path = os.path.splitdrive(r'c:\abc') >> >>> drive + os.path.join(path, r'/def/g') >> 'c:/def/g' > > What should it do in the opposite case: where the path is a relative > path on another drive? I.e. os.path.join('c:\\abc','d:efg'). This is > archaic, sure, but if we're going to say it's always equivalent to "as > if the first argument were the cwd", this needs to be handled too. > Fortunately, I don't think there's any way to specify this with a UNC > share. 
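
Just as a sketch (not an existing API, and making the cross-drive case an error is only one possible policy), the pieces can be pulled apart with ntpath.splitdrive (which applies the Windows rules even when running on Posix), and a small hypothetical wrapper could refuse the ambiguous combination:

    import ntpath

    def join_same_drive(base, other):
        # Hypothetical helper: accept a drive-relative path only when it
        # names the same drive as the base; refuse the cross-drive case.
        # For reference: ntpath.splitdrive('d:efg') -> ('d:', 'efg')
        #                ntpath.splitdrive(r'c:\abc') -> ('c:', '\\abc')
        base_drive, _ = ntpath.splitdrive(base)
        other_drive, other_path = ntpath.splitdrive(other)
        if (other_drive and not other_path.startswith(('\\', '/'))
                and other_drive.lower() != base_drive.lower()):
            raise ValueError('drive-relative path on a different drive: %r' % other)
        return ntpath.join(base, other)
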
> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas From ethan at stoneleaf.us Sun Nov 3 04:40:36 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 02 Nov 2013 20:40:36 -0700 Subject: [Python-ideas] os.path.join In-Reply-To: References: <1383444920.23282.42141045.2588B009@webmail.messagingengine.com> Message-ID: <5275C5B4.4030700@stoneleaf.us> On 11/02/2013 08:12 PM, Ryan Hiebert wrote: > IIRC correctly the form 'd:efg' refers to a relative path on the d: > drive, based on the current working directory _of that drive_. Since > it requires a default base path to be relative on, that form wouldn't > work cross drives. However if we used the same drive, we might be > willing to allow a relative path prefixed with a drive letter. > > os.path.join(r'c:\abc\foo', 'd:efg') # ERROR > os.path.join(r'd:\abc\foo', 'd:efg') # returns r'd:\abc\efg' Surely you meant r'd:\abc\foo\efg'. -- ~Ethan~ From bruce at leapyear.org Sun Nov 3 07:32:27 2013 From: bruce at leapyear.org (Bruce Leban) Date: Sat, 2 Nov 2013 23:32:27 -0700 Subject: [Python-ideas] os.path.join In-Reply-To: References: <1383444920.23282.42141045.2588B009@webmail.messagingengine.com> Message-ID: On Sat, Nov 2, 2013 at 8:12 PM, Ryan Hiebert wrote: > IIRC correctly the form 'd:efg' refers to a relative path on the d: > drive, based on the current working directory _of that drive_. > Correct > Since > it requires a default base path to be relative on, that form wouldn't > work cross drives. > os.path.join doesn't require a full base path and isn't required to return an absolute path. > However if we used the same drive, we might be > willing to allow a relative path prefixed with a drive letter. > > os.path.join(r'c:\abc\foo', 'd:efg') # ERROR > This shouldn't be an error. The correct result is 'd:efg', i.e., relative to the current path on the D drive at the time the path is used, just as '\abc' is relative to the current drive at the time the path is used. --- Bruce I'm hiring: http://www.cadencemd.com/info/jobs Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... URL: From mcepl at redhat.com Mon Nov 4 15:51:39 2013 From: mcepl at redhat.com (=?utf-8?B?TWF0xJtq?= Cepl) Date: Mon, 4 Nov 2013 15:51:39 +0100 Subject: [Python-ideas] requests in the stdlib? Message-ID: <20131104145138.GA1553@wycliff.ceplovi.cz> Afternoon (or whatever else), gentlemen and gentle ladies! I guess I am not the first one who experienced a profound life-changing epiphany of meeting with the requests package. Yesterday after fighting for hours with urllib2 (and GitHub API) and failing, I have succeeded with requests in five minutes. Despite how much I dislike using libraries outside of the stdlib (yes, unittest is the way to go, please, no nosetests and py.test for me), I am now persuaded that requests API is the best thing since ... I don't know, API of feedparser (RIP, Aaron Swartz)? So, I guess, somebody had to already suggest pulling requests into stdlib. Could somebody here point me to the resulting discussion, please? Thank you, Mat?j Cepl -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 190 bytes Desc: not available URL: From senthil at uthcode.com Mon Nov 4 16:14:33 2013 From: senthil at uthcode.com (Senthil Kumaran) Date: Mon, 4 Nov 2013 07:14:33 -0800 Subject: [Python-ideas] requests in the stdlib? In-Reply-To: <20131104145138.GA1553@wycliff.ceplovi.cz> References: <20131104145138.GA1553@wycliff.ceplovi.cz> Message-ID: On Mon, Nov 4, 2013 at 6:51 AM, Mat?j Cepl wrote: > Yesterday after fighting for hours with urllib2 (and GitHub API) > and failing, I have succeeded with requests in five minutes. > That was the bug, which can be fixed. Thanks for raising the bug > > Despite how much I dislike using libraries outside of the stdlib > (yes, unittest is the way to go, please, no nosetests and > py.test for me), I am now persuaded that requests API is the > best thing since ... I don't know, API of feedparser (RIP, Aaron > Swartz)? > > So, I guess, somebody had to already suggest pulling requests > into stdlib. Could somebody here point me to the resulting > discussion, please? > I think, the plan is to our http handling library enhanced with the request's style API. It is discussed at python-dev last year and at pycon. But it still will need to bring in fixes which requests has been doing for handing latest issues. -- Senthil -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Mon Nov 4 16:25:11 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Mon, 4 Nov 2013 18:25:11 +0300 Subject: [Python-ideas] os.path.join In-Reply-To: References: Message-ID: On Thu, Oct 31, 2013 at 5:07 PM, Mark Lawrence wrote: > On 31/10/2013 10:31, anatoly techtonik wrote: >> >> On Wed, Oct 30, 2013 at 10:41 PM, Mark Lawrence >> wrote: >>> >>> On 30/10/2013 16:34, anatoly techtonik wrote: >>>> >>>> >>>> >>> os.path.join('/static', '/styles/largestyles.css') >>>> '/styles/largestyles.css' >>>> >>>> Is it only me who thinks that the code above is wrong? >>>> >>> >>> Is this the appropriate place for such a question? What is wrong with >>> the >>> main Python mailing list, Stackoverflow...? >>> >>> -- >>> Python is the second best programming language in the world. >>> But the best has yet to be invented. Christian Tismer >> >> >> Both Python ML and SO are bad for inventing new languages. >> -- >> anatoly t. >> > > I'm completely baffled by your comment, so please explain yourself. I mean that if you're going to invent new language or improve existing one, you need a place to keep notes about things that need to be improved, so that you can find them when the time comes. Neither ML nor StackOverflow (SO) are such place. From techtonik at gmail.com Mon Nov 4 16:30:06 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Mon, 4 Nov 2013 18:30:06 +0300 Subject: [Python-ideas] os.path.join In-Reply-To: <52727462.3050101@stoneleaf.us> References: <52727462.3050101@stoneleaf.us> Message-ID: On Thu, Oct 31, 2013 at 6:16 PM, Ethan Furman wrote: > On 10/31/2013 03:31 AM, anatoly techtonik wrote: >> >> On Wed, Oct 30, 2013 at 10:41 PM, Mark Lawrence >> wrote: >>> >>> >>> Is this the appropriate place for such a question? What is wrong with >>> the >>> main Python mailing list, Stackoverflow...? >> >> >> Both Python ML and SO are bad for inventing new languages. > > > If you're inventing a new language, why are you wasting time on a Python > venue? NIH + stand on the shoulders of giants. 
From breamoreboy at yahoo.co.uk Mon Nov 4 16:37:16 2013 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Mon, 04 Nov 2013 15:37:16 +0000 Subject: [Python-ideas] os.path.join In-Reply-To: References: Message-ID: On 04/11/2013 15:25, anatoly techtonik wrote: > On Thu, Oct 31, 2013 at 5:07 PM, Mark Lawrence wrote: >> On 31/10/2013 10:31, anatoly techtonik wrote: >>> >>> On Wed, Oct 30, 2013 at 10:41 PM, Mark Lawrence >>> wrote: >>>> >>>> On 30/10/2013 16:34, anatoly techtonik wrote: >>>>> >>>>> >>>>> >>> os.path.join('/static', '/styles/largestyles.css') >>>>> '/styles/largestyles.css' >>>>> >>>>> Is it only me who thinks that the code above is wrong? >>>>> >>>> >>>> Is this the appropriate place for such a question? What is wrong with >>>> the >>>> main Python mailing list, Stackoverflow...? >>>> >>>> -- >>>> Python is the second best programming language in the world. >>>> But the best has yet to be invented. Christian Tismer >>> >>> >>> Both Python ML and SO are bad for inventing new languages. >>> -- >>> anatoly t. >>> >> >> I'm completely baffled by your comment, so please explain yourself. > > I mean that if you're going to invent new language or improve existing > one, you need a place to keep notes about things that need to be > improved, so that you can find them when the time comes. Neither ML > nor StackOverflow (SO) are such place. > What on earth has this got to do with your original question, a trivial thing about os.path.join? I'd say this is definitely not the place to be asking this particular question, in fact I'd say anywhere but here. -- Python is the second best programming language in the world. But the best has yet to be invented. Christian Tismer Mark Lawrence From jeanpierreda at gmail.com Mon Nov 4 16:40:20 2013 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Mon, 4 Nov 2013 07:40:20 -0800 Subject: [Python-ideas] os.path.join In-Reply-To: References: Message-ID: On Thu, Oct 31, 2013 at 3:30 AM, anatoly techtonik wrote: > On Wed, Oct 30, 2013 at 8:06 PM, Bruce Leban wrote: >> I don't know if the code is wrong but if you're asking if the *result* of >> join is wrong, I don't think it is. It references the same file as these >> commands: >> >> cd /static >> cat /styles/largestyles,css >> >> I agree it might be confusing but it's pretty explicitly documented. > > Yes. It is confusing. > > 1. How often the operations to join absolute paths is needed? Whenever user-provided paths are given relative to another directory, but you'd like them to be allowed to give an absolute path instead, the current behaviour is quite convenient. Otherwise, as you say, it won't come up. > 2. What is expected result of this operation? The expected result is the current behaviour. If it helps, I consider it this way: os.path.join(a, b, c) gives you the directory you'd be in if you cd into a, b, and c in turn. -- Devin From techtonik at gmail.com Mon Nov 4 16:29:27 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Mon, 4 Nov 2013 18:29:27 +0300 Subject: [Python-ideas] os.path.join In-Reply-To: <527273FE.2060104@stoneleaf.us> References: <527273FE.2060104@stoneleaf.us> Message-ID: On Thu, Oct 31, 2013 at 6:15 PM, Ethan Furman wrote: > On 10/31/2013 03:30 AM, anatoly techtonik wrote: >> >> On Wed, Oct 30, 2013 at 8:06 PM, Bruce Leban wrote: >>> >>> I don't know if the code is wrong but if you're asking if the *result* of >>> join is wrong, I don't think it is. 
It references the same file as these >>> commands: >>> >>> cd /static >>> cat /styles/largestyles,css >> >> >> 2. What is expected result of this operation? >> >> for 2 I'd expect 2nd path to be treated as relative one. > > > If 2 is a relative path, it shouldn't be leading with a slash. Right. But I am working more with URL paths nowadays. In there if I want to join two paths, no matter if 2nd starts with slash or not, I don't really expect the 2nd to rewrite the first. From techtonik at gmail.com Mon Nov 4 17:13:52 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Mon, 4 Nov 2013 19:13:52 +0300 Subject: [Python-ideas] os.path.join In-Reply-To: References: Message-ID: On Mon, Nov 4, 2013 at 6:37 PM, Mark Lawrence wrote: > On 04/11/2013 15:25, anatoly techtonik wrote: >> >> On Thu, Oct 31, 2013 at 5:07 PM, Mark Lawrence >> wrote: >>> >>> On 31/10/2013 10:31, anatoly techtonik wrote: >>>> >>>> >>>> On Wed, Oct 30, 2013 at 10:41 PM, Mark Lawrence >>>> >>>> wrote: >>>>> >>>>> >>>>> On 30/10/2013 16:34, anatoly techtonik wrote: >>>>>> >>>>>> >>>>>> >>>>>> >>> os.path.join('/static', '/styles/largestyles.css') >>>>>> '/styles/largestyles.css' >>>>>> >>>>>> Is it only me who thinks that the code above is wrong? >>>>>> >>>>> >>>>> Is this the appropriate place for such a question? What is wrong with >>>>> the >>>>> main Python mailing list, Stackoverflow...? >>>>> >>>>> -- >>>>> Python is the second best programming language in the world. >>>>> But the best has yet to be invented. Christian Tismer >>>> >>>> >>>> >>>> Both Python ML and SO are bad for inventing new languages. >>>> -- >>>> anatoly t. >>>> >>> >>> I'm completely baffled by your comment, so please explain yourself. >> >> >> I mean that if you're going to invent new language or improve existing >> one, you need a place to keep notes about things that need to be >> improved, so that you can find them when the time comes. Neither ML >> nor StackOverflow (SO) are such place. >> > > What on earth has this got to do with your original question, a trivial > thing about os.path.join? I'd say this is definitely not the place to be > asking this particular question, in fact I'd say anywhere but here. The question was: 1. if this specific behavior for joining paths is still actual 2. if there are other people who don't expect it to work in this way I find Ruby approach more useful, and it's quite valuable info that the call can not be directly converted between stdlibs. From mcepl at redhat.com Mon Nov 4 17:14:37 2013 From: mcepl at redhat.com (=?UTF-8?B?TWF0xJtqIENlcGw=?=) Date: Mon, 04 Nov 2013 17:14:37 +0100 Subject: [Python-ideas] requests in the stdlib? In-Reply-To: References: <20131104145138.GA1553@wycliff.ceplovi.cz> Message-ID: <5277C7ED.1090601@redhat.com> On 04/11/13 16:14, Senthil Kumaran wrote: > I think, the plan is to our http handling library enhanced with the > request's style API. > It is discussed at python-dev last year and at pycon. > But it still will need to bring in fixes which requests has been doing > for handing latest issues. Is there some bug or thread I could follow? Thank you very much for the reply, Mat?j -- http://www.ceplovi.cz/matej/, Jabber: mcepl at ceplovi.cz GPG Finger: 89EF 4BC6 288A BF43 1BAB 25C3 E09F EF25 D964 84AC ? ????? ????????? ?? ????? ??????? ?? ?????. -- Russian proverb (this time actually checked by a native Russian) -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 255 bytes Desc: OpenPGP digital signature URL: From ron3200 at gmail.com Mon Nov 4 18:34:48 2013 From: ron3200 at gmail.com (Ron Adam) Date: Mon, 04 Nov 2013 11:34:48 -0600 Subject: [Python-ideas] Possible new slice behaviour? Was ( Negative slice discussion.) Message-ID: This is one solution to what we can do to make slices both easier to understand and work in a much more consistent and flexible way. This matches the slice semantics that Guido likes. (possibly for python 4.) The ability to pass callables along with the slice creates an easy and clean way to add new indexing modes. (As Nick suggested.) Overall, this makes everything simpler and easier to do. :-) Cheers, Ron """ An improved slice implementation. Even the C source says: "It's harder to get right than you might think." both in code and in understanding This requires changing slice indexing so that the following relationships are true. s[i:j:k] == s[i:j][k] both in code and in understanding s[i:j:-1] == s[i:j:1][::-1] And it also adds the ability to apply callables to slices and index's using the existing slice syntax. These alterations would need to be made to the __getitem__ and __setitem__, methods of built-in types. Possibly in Python 4.0. *(I was not able to get a clean version of this behaviour with the existing slice semantics. But the slice and index behaviour of this implementation is much simpler and makes using callables to adjust index's very easy. That seems like a good indication that changing slices to match the above relationships is worth doing. It may be possible to get the current behaviour by applying a callable to the slice like the open, closed, and ones index examples below. ) """ # A string sub-class for testing. class Str(str): def _fix_slice_indexes(self, slc): # Replace Nones and check step value. if isinstance(slc, int): return slc i, j, k = slc.start, slc.stop, slc.step if k == 0: raise ValueError("slice step cannot be zero") if i == None: i = 0 if j == None: j = len(self) if k == None: k = 1 return slice(i, j, k) def __getitem__(self, args): """ Gets a item from a string with either an index, or slice. Apply any callables to the slice if they are pressent. Valid inputes... i (i, callables ...) slice() (slice(), callables ...) """ # Apply callables if any. if isinstance(args, tuple): slc, *callables = args slc = self._fix_slice_indexes(slc) for fn in callables: slc = fn(self, slc) else: slc = self._fix_slice_indexes(args) # Just an index if isinstance(slc, int): return str.__getitem__(self, slc) # Handle slice. rval = [] i, j, k = slc.start, slc.stop, slc.step ix = i if k > 0 else j-1 while i <= ix < j: rval.append(str.__getitem__(self, ix)) ix += k return type(self)('').join(rval) """ These end with 'i' to indicate they make index adjustments, and also to make them less likely to clash with other functions. Some of these are so simple, you'd probably just adjust the index directly, ie.. reversei. But they make good examples of what is possible. And possible There are other uses as well. Because they are just objects passed in, the names aren't important. They can be called anything and still work, and the programmer is free to create new alternatives. 
""" def reversei(obj, slc): """Return a new slice with reversed step.""" if isinstance(slc, slice): i, j, k = slc.start, slc.stop, slc.step return slice(i, j, -k) return slc def trimi(obj, slc): """Trim left and right so an IndexError is not produced.""" if isinstance(slc, slice): ln = len(obj) i, j, k = slc.start, slc.stop, slc.step if i<0: i = 0 if j>ln: j = ln return slice(i, j, k) return slc def openi(obj, slc): """Open interval - Does not include end points.""" if isinstance(slc, slice): i, j, k = slc.start, slc.stop, slc.step return slice(i+1, j, k) return slc def closedi(obj, slc): """Closed interval - Includes end points.""" if isinstance(slc, slice): i, j, k = slc.start, slc.stop, slc.step return slice(i, j+1, k) return slc def onei(obj, slc): """First element is 1 instead of zero.""" if isinstance(slc, slice): i, j, k = slc.start, slc.stop, slc.step return slice(i-1, j-1, k) return slc - 1 def _test_cases1(): """ # test string >>> s = Str('0123456789') # |0|1|2|3|4|5|6|7|8|9| # 0 1 2 3 4 5 6 7 8 9 10 # 10 9 8 7 6 5 4 3 2 1 0 >>> s[:] '0123456789' >>> s[:, trimi] '0123456789' >>> s[:, reversei] '9876543210' >>> s[:, reversei, trimi] '9876543210' >>> s[::, trimi, reversei] '9876543210' # Right side bigger than len(s) >>> s[:100] Traceback (most recent call last): IndexError: string index out of range >>> s[:100, trimi] '0123456789' >>> s[:100, trimi, reversei] '9876543210' >>> s[:100, reversei] Traceback (most recent call last): IndexError: string index out of range >>> s[:100, reversei, trimi] '9876543210' # Left side smaller than 0. >>> s[-100:] Traceback (most recent call last): IndexError: string index out of range >>> s[-100:, trimi] '0123456789' >>> s[-100:, trimi, reversei] '9876543210' # Slice bigger than s. >>> s[-100:100] Traceback (most recent call last): IndexError: string index out of range >>> s[-100:100, trimi] '0123456789' # Slice smaller than s. >>> s[3:7] '3456' >>> s[3:7, reversei] '6543' # From left With negative step. >>> s[::-1] '9876543210' >>> s[::-1, reversei] '0123456789' >>> s[:100:-1, trimi] # j past right side '9876543210' >>> s[-100::-1, trimi] # i before left side '9876543210' >>> s[-100:100:-1, trimi, reversei] # slice is bigger '0123456789' # Null results >>> s[7:3:1, trimi] '' >>> s[7:3:-1, trimi] '' # Check None values. >>> s[:] '0123456789' >>> s[None:None] '0123456789' >>> s[None:None:None] '0123456789' >>> s[:: 1] '0123456789' >>> s[::-1] '9876543210' >>> s[None:None:1] '0123456789' >>> s[None:None:-1] '9876543210' # Check error messages. >>> s[0:0:0:0] Traceback (most recent call last): SyntaxError: invalid syntax >>> s[5:5:0] Traceback (most recent call last): ValueError: slice step cannot be zero # And various other combinations. >>> s = Str('123456789') >>> s[3, onei] '3' >>> s[4:8, onei] '4567' >>> s[4:8, onei, openi] '567' >>> s[4:8, onei, closedi] '45678' """ def _test(): import doctest print(doctest.testmod(verbose=False)) if __name__=="__main__": _test() From bruce at leapyear.org Mon Nov 4 18:45:24 2013 From: bruce at leapyear.org (Bruce Leban) Date: Mon, 4 Nov 2013 09:45:24 -0800 Subject: [Python-ideas] os.path.join In-Reply-To: References: <527273FE.2060104@stoneleaf.us> Message-ID: On Mon, Nov 4, 2013 at 7:29 AM, anatoly techtonik wrote: > Right. But I am working more with URL paths nowadays. In there if I > want to join two paths, no matter if 2nd starts with slash or not, I > don't really expect the 2nd to rewrite the first. > Joining url paths is different from joining file system paths. 
I wouldn't suggest using a function designed for one to do the other. urljoin('https://s/a/b/', 'x') => 'https://s/a/b/x') urljoin('https://s/a/b/', '/x/y') => 'https://s/x/y') urljoin('https://s/a/b/', '//t/x/y') => 'https://t/x/y') urljoin('https://s/a/b/', '//t') => 'https://t/a/b') urljoin('https://s/a/b/', 'http:') => 'http://s/a/b') urljoin('http:', '//s', 'x/y') => 'http://s/x/y') Note that I'm ignoring the issue of whether or not the last part of the url on the left should be stripped off. --- Bruce I'm hiring: http://www.cadencemd.com/info/jobs Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... URL: From brett at python.org Mon Nov 4 19:15:35 2013 From: brett at python.org (Brett Cannon) Date: Mon, 4 Nov 2013 13:15:35 -0500 Subject: [Python-ideas] requests in the stdlib? In-Reply-To: <5277C7ED.1090601@redhat.com> References: <20131104145138.GA1553@wycliff.ceplovi.cz> <5277C7ED.1090601@redhat.com> Message-ID: On Mon, Nov 4, 2013 at 11:14 AM, Mat?j Cepl wrote: > On 04/11/13 16:14, Senthil Kumaran wrote: > > I think, the plan is to our http handling library enhanced with the > > request's style API. > > It is discussed at python-dev last year and at pycon. > > But it still will need to bring in fixes which requests has been doing > > for handing latest issues. > > Is there some bug or thread I could follow? > Nope as the bulk of the discussion happened at the language summit at PyCon 2013, so all in-person discussion. Basically that discussion said a requests-like API (or even requests if the right things happened) could go in the stdlib, but only if they support asyncio/tulip as a first-class way of using the library. -------------- next part -------------- An HTML attachment was scrubbed... URL: From barry at python.org Mon Nov 4 21:14:18 2013 From: barry at python.org (Barry Warsaw) Date: Mon, 4 Nov 2013 15:14:18 -0500 Subject: [Python-ideas] requests in the stdlib? References: <20131104145138.GA1553@wycliff.ceplovi.cz> Message-ID: <20131104151418.6cd36468@anarchist> On Nov 04, 2013, at 03:51 PM, Mat?j Cepl wrote: >I guess I am not the first one who experienced a profound life-changing >epiphany of meeting with the requests package. Yesterday after fighting for >hours with urllib2 (and GitHub API) and failing, I have succeeded with >requests in five minutes. > >Despite how much I dislike using libraries outside of the stdlib (yes, >unittest is the way to go, please, no nosetests and py.test for me), I am now >persuaded that requests API is the best thing since ... I don't know, API of >feedparser (RIP, Aaron Swartz)? > >So, I guess, somebody had to already suggest pulling requests into >stdlib. Could somebody here point me to the resulting discussion, please? requests is nice, and I like the API as well. I wouldn't be opposed to pulling it into the 3.4 stdlib if upstream were amenable. However, the specific implementation would have to stop vendoring various packages such as chardet and urllib3, and it would have to make it easy to use system provided certificates instead of bundling its own (there's been discussion on this issue for Python as well). We have to carry deltas in Debian to prevent this vendorization, and I think for Python stdlib they would be unacceptable. That probably means at least also pulling in urllib3 and chardet. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From ben+python at benfinney.id.au Mon Nov 4 22:22:35 2013 From: ben+python at benfinney.id.au (Ben Finney) Date: Tue, 05 Nov 2013 08:22:35 +1100 Subject: [Python-ideas] os.path.join References: <527273FE.2060104@stoneleaf.us> Message-ID: <7wa9hjzwtg.fsf@benfinney.id.au> anatoly techtonik writes: > Right. But I am working more with URL paths nowadays. In there if I > want to join two paths, no matter if 2nd starts with slash or not, I > don't really expect the 2nd to rewrite the first. Then you're not using the right tool: ?os.path? is specifically about filesystem paths. URLs follow different rules, as you say; you should be using ?urllib.parse? . -- \ ?Our task must be to free ourselves from our prison by widening | `\ our circle of compassion to embrace all humanity and the whole | _o__) of nature in its beauty.? ?Albert Einstein | Ben Finney From ben+python at benfinney.id.au Mon Nov 4 22:25:27 2013 From: ben+python at benfinney.id.au (Ben Finney) Date: Tue, 05 Nov 2013 08:25:27 +1100 Subject: [Python-ideas] os.path.join References: <527273FE.2060104@stoneleaf.us> Message-ID: <7w61s7zwoo.fsf@benfinney.id.au> anatoly techtonik writes: > Right. But I am working more with URL paths nowadays. In there if I > want to join two paths, no matter if 2nd starts with slash or not, I > don't really expect the 2nd to rewrite the first. Then you are using the wrong tool for the job: ?os.path? is specifically for manipulating OS filesystem paths. URLs follow different rules, as you say. For those, use the standard library ?urllib.parse? module. -- \ ?When I was born I was so surprised I couldn't talk for a year | `\ and a half.? ?Gracie Allen | _o__) | Ben Finney From ncoghlan at gmail.com Mon Nov 4 23:40:37 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 5 Nov 2013 08:40:37 +1000 Subject: [Python-ideas] Possible new slice behaviour? Was ( Negative slice discussion.) In-Reply-To: References: Message-ID: On 5 Nov 2013 03:35, "Ron Adam" wrote: > > > This is one solution to what we can do to make slices both easier to understand and work in a much more consistent and flexible way. > > This matches the slice semantics that Guido likes. > (possibly for python 4.) > > The ability to pass callables along with the slice creates an easy and clean way to add new indexing modes. (As Nick suggested.) Tuples can't really be used for this purpose, since that's incompatible with multi-dimensional indexing. However, I also agree containment would be a better way to go than subclassing. I'm currently thinking that a fourth "adjust" argument to the slice constructor may work, and call that from the indices method as: def adjust_indices(start, stop, step, length): ... The values passed in would be those from the slice constructor plus the length passed to the indices method. The only preconditioning would be the check for a non-zero step. The result would be used as the result of the indices method. Cheers, Nick. > > Overall, this makes everything simpler and easier to do. :-) > > > Cheers, > Ron > > > """ > > An improved slice implementation. > > Even the C source says: > > "It's harder to get right than you might think." > > both in code and in understanding > This requires changing slice indexing so that > the following relationships are true. 
> > s[i:j:k] == s[i:j][k] both in code and in understanding > > s[i:j:-1] == s[i:j:1][::-1] > > > And it also adds the ability to apply callables to > slices and index's using the existing slice syntax. > > These alterations would need to be made to the __getitem__ > and __setitem__, methods of built-in types. Possibly in > Python 4.0. > > *(I was not able to get a clean version of this behaviour > with the existing slice semantics. But the slice and index > behaviour of this implementation is much simpler and makes > using callables to adjust index's very easy. That seems > like a good indication that changing slices to match the > above relationships is worth doing. > > It may be possible to get the current behaviour by applying > a callable to the slice like the open, closed, and ones index > examples below. > ) > > """ > > # A string sub-class for testing. > > class Str(str): > > def _fix_slice_indexes(self, slc): > # Replace Nones and check step value. > if isinstance(slc, int): > return slc > i, j, k = slc.start, slc.stop, slc.step > if k == 0: > raise ValueError("slice step cannot be zero") > if i == None: i = 0 > if j == None: j = len(self) > if k == None: k = 1 > return slice(i, j, k) > > def __getitem__(self, args): > """ > Gets a item from a string with either an > index, or slice. Apply any callables to the > slice if they are pressent. > > Valid inputes... > i > (i, callables ...) > slice() > (slice(), callables ...) > > """ > # Apply callables if any. > if isinstance(args, tuple): > slc, *callables = args > slc = self._fix_slice_indexes(slc) > for fn in callables: > slc = fn(self, slc) > else: > slc = self._fix_slice_indexes(args) > > # Just an index > if isinstance(slc, int): > return str.__getitem__(self, slc) > > # Handle slice. > rval = [] > i, j, k = slc.start, slc.stop, slc.step > ix = i if k > 0 else j-1 > while i <= ix < j: > rval.append(str.__getitem__(self, ix)) > ix += k > return type(self)('').join(rval) > > > """ > These end with 'i' to indicate they make index adjustments, > and also to make them less likely to clash with other > functions. > > Some of these are so simple, you'd probably just > adjust the index directly, ie.. reversei. But > they make good examples of what is possible. And possible > There are other uses as well. > > Because they are just objects passed in, the names aren't > important. They can be called anything and still work, and > the programmer is free to create new alternatives. 
> """ > > def reversei(obj, slc): > """Return a new slice with reversed step.""" > if isinstance(slc, slice): > i, j, k = slc.start, slc.stop, slc.step > return slice(i, j, -k) > return slc > > def trimi(obj, slc): > """Trim left and right so an IndexError is not produced.""" > if isinstance(slc, slice): > ln = len(obj) > i, j, k = slc.start, slc.stop, slc.step > if i<0: i = 0 > if j>ln: j = ln > return slice(i, j, k) > return slc > > def openi(obj, slc): > """Open interval - Does not include end points.""" > if isinstance(slc, slice): > i, j, k = slc.start, slc.stop, slc.step > return slice(i+1, j, k) > return slc > > def closedi(obj, slc): > """Closed interval - Includes end points.""" > if isinstance(slc, slice): > i, j, k = slc.start, slc.stop, slc.step > return slice(i, j+1, k) > return slc > > def onei(obj, slc): > """First element is 1 instead of zero.""" > if isinstance(slc, slice): > i, j, k = slc.start, slc.stop, slc.step > return slice(i-1, j-1, k) > return slc - 1 > > > > def _test_cases1(): > """ > > # test string > >>> s = Str('0123456789') > > # |0|1|2|3|4|5|6|7|8|9| > # 0 1 2 3 4 5 6 7 8 9 10 > # 10 9 8 7 6 5 4 3 2 1 0 > > > >>> s[:] > '0123456789' > > >>> s[:, trimi] > '0123456789' > > >>> s[:, reversei] > '9876543210' > > >>> s[:, reversei, trimi] > '9876543210' > > >>> s[::, trimi, reversei] > '9876543210' > > > # Right side bigger than len(s) > > >>> s[:100] > Traceback (most recent call last): > IndexError: string index out of range > > >>> s[:100, trimi] > '0123456789' > > >>> s[:100, trimi, reversei] > '9876543210' > > >>> s[:100, reversei] > Traceback (most recent call last): > IndexError: string index out of range > > >>> s[:100, reversei, trimi] > '9876543210' > > > # Left side smaller than 0. > > >>> s[-100:] > Traceback (most recent call last): > IndexError: string index out of range > > >>> s[-100:, trimi] > '0123456789' > > >>> s[-100:, trimi, reversei] > '9876543210' > > > # Slice bigger than s. > > >>> s[-100:100] > Traceback (most recent call last): > IndexError: string index out of range > > >>> s[-100:100, trimi] > '0123456789' > > > # Slice smaller than s. > > >>> s[3:7] > '3456' > > >>> s[3:7, reversei] > '6543' > > > > # From left With negative step. > > >>> s[::-1] > '9876543210' > > >>> s[::-1, reversei] > '0123456789' > > >>> s[:100:-1, trimi] # j past right side > '9876543210' > > >>> s[-100::-1, trimi] # i before left side > '9876543210' > > >>> s[-100:100:-1, trimi, reversei] # slice is bigger > '0123456789' > > > > # Null results > > >>> s[7:3:1, trimi] > '' > > >>> s[7:3:-1, trimi] > '' > > > # Check None values. > > >>> s[:] > '0123456789' > > >>> s[None:None] > '0123456789' > > >>> s[None:None:None] > '0123456789' > > >>> s[:: 1] > '0123456789' > > >>> s[::-1] > '9876543210' > > >>> s[None:None:1] > '0123456789' > > >>> s[None:None:-1] > '9876543210' > > > # Check error messages. > > >>> s[0:0:0:0] > Traceback (most recent call last): > SyntaxError: invalid syntax > > >>> s[5:5:0] > Traceback (most recent call last): > ValueError: slice step cannot be zero > > > > # And various other combinations. 
> > >>> s = Str('123456789') > > >>> s[3, onei] > '3' > > >>> s[4:8, onei] > '4567' > > >>> s[4:8, onei, openi] > '567' > > >>> s[4:8, onei, closedi] > '45678' > > > """ > > > def _test(): > import doctest > print(doctest.testmod(verbose=False)) > > > if __name__=="__main__": > _test() > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Nov 4 23:52:55 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 5 Nov 2013 08:52:55 +1000 Subject: [Python-ideas] requests in the stdlib? In-Reply-To: <20131104151418.6cd36468@anarchist> References: <20131104145138.GA1553@wycliff.ceplovi.cz> <20131104151418.6cd36468@anarchist> Message-ID: On 5 Nov 2013 06:15, "Barry Warsaw" wrote: > > On Nov 04, 2013, at 03:51 PM, Mat?j Cepl wrote: > > >I guess I am not the first one who experienced a profound life-changing > >epiphany of meeting with the requests package. Yesterday after fighting for > >hours with urllib2 (and GitHub API) and failing, I have succeeded with > >requests in five minutes. > > > >Despite how much I dislike using libraries outside of the stdlib (yes, > >unittest is the way to go, please, no nosetests and py.test for me), I am now > >persuaded that requests API is the best thing since ... I don't know, API of > >feedparser (RIP, Aaron Swartz)? > > > >So, I guess, somebody had to already suggest pulling requests into > >stdlib. Could somebody here point me to the resulting discussion, please? > > requests is nice, and I like the API as well. I wouldn't be opposed to > pulling it into the 3.4 stdlib if upstream were amenable. > > However, the specific implementation would have to stop vendoring various > packages such as chardet and urllib3, and it would have to make it easy to use > system provided certificates instead of bundling its own (there's been > discussion on this issue for Python as well). > > We have to carry deltas in Debian to prevent this vendorization, and I think > for Python stdlib they would be unacceptable. That probably means at least > also pulling in urllib3 and chardet. PEP 453 vendors the whole thing as part of pip :) (but that's just one of the many ways that PEP makes life interesting for Linux distros). Anyway, yes, the requests *API* does likely represent the future standard HTTP(S) interface for Python, but more likely through an alternative asyncio compatible implementation rather than the current synchronous-only implementation. For now "pip install requests" (or the platform specific equivalent) is the way to go. Reference: http://python-notes.curiousefficiency.org/en/latest/conferences/pyconus2013/20130313-language-summit.html#requests Cheers, Nick. > > -Barry > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From graffatcolmingov at gmail.com Mon Nov 4 23:53:29 2013 From: graffatcolmingov at gmail.com (Ian Cordasco) Date: Mon, 4 Nov 2013 16:53:29 -0600 Subject: [Python-ideas] requests in the stdlib? 
In-Reply-To: <20131104151418.6cd36468@anarchist> References: <20131104145138.GA1553@wycliff.ceplovi.cz> <20131104151418.6cd36468@anarchist> Message-ID: On Nov 4, 2013 2:14 PM, "Barry Warsaw" wrote: > > On Nov 04, 2013, at 03:51 PM, Mat?j Cepl wrote: > > >I guess I am not the first one who experienced a profound life-changing > >epiphany of meeting with the requests package. Yesterday after fighting for > >hours with urllib2 (and GitHub API) and failing, I have succeeded with > >requests in five minutes. > > > >Despite how much I dislike using libraries outside of the stdlib (yes, > >unittest is the way to go, please, no nosetests and py.test for me), I am now > >persuaded that requests API is the best thing since ... I don't know, API of > >feedparser (RIP, Aaron Swartz)? > > > >So, I guess, somebody had to already suggest pulling requests into > >stdlib. Could somebody here point me to the resulting discussion, please? > > requests is nice, and I like the API as well. I wouldn't be opposed to > pulling it into the 3.4 stdlib if upstream were amenable. > > However, the specific implementation would have to stop vendoring various > packages such as chardet and urllib3, and it would have to make it easy to use > system provided certificates instead of bundling its own (there's been > discussion on this issue for Python as well). > > We have to carry deltas in Debian to prevent this vendorization, and I think > for Python stdlib they would be unacceptable. That probably means at least > also pulling in urllib3 and chardet. > > -Barry It actually isn't chardet anymore but a separate package called charade which is sort of chardet. The caveat is that it runs on python 2.6-3.3 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ron3200 at gmail.com Tue Nov 5 02:48:06 2013 From: ron3200 at gmail.com (Ron Adam) Date: Mon, 04 Nov 2013 19:48:06 -0600 Subject: [Python-ideas] Possible new slice behaviour? Was ( Negative slice discussion.) In-Reply-To: References: Message-ID: <52784E56.907@gmail.com> On 11/04/2013 04:40 PM, Nick Coghlan wrote: > > > On 5 Nov 2013 03:35, "Ron Adam" > wrote: > > > > > > This is one solution to what we can do to make slices both easier to > understand and work in a much more consistent and flexible way. > > > > This matches the slice semantics that Guido likes. > > (possibly for python 4.) > > > > The ability to pass callables along with the slice creates an easy and > clean way to add new indexing modes. (As Nick suggested.) > > Tuples can't really be used for this purpose, since that's incompatible > with multi-dimensional indexing. > Are there plans for pythons builtin types to use multidimensional indexing? I don't think what I'm suggesting would create an issue with it in either. It may even be complementary. Either I'm missing something, or you aren't quite understanding where the changes I'm suggesting are to be made. As long as the change is made local to the object that uses it, it won't effect any other types uses of slices. And what is passed in a tuple is different from specifying the meaning of a tuple. There may be other reasons this may not be a bad idea, but I can't think of any myself at the moment. Possibly because a callable passed with a slice may alter the object, but that could be limited by giving the callable a length instead of of the object itself. But personally I like that it's open ended and not limited by the syntax. Consider this... 
>>> a = list(range(10)) >>> a [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> b = list([a] * 3) >>> b [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]] >>> a[2:5, 1:2] Traceback (most recent call last): File "", line 1, in TypeError: list indices must be integers, not tuple Python lists currently don't know what to do with a tuple. In order to do anything else, the __getitem__ and __setitem__ methods need to be overridden. For that reason, it can't cause an issue with anything as long as the change is kept *local to the object(s)* that use it. Making changes at the syntax level, or even slice level could be disruptive though. (This doesn't do that.) >>> class Foo: ... def __getitem__(self, args): ... print(args) ... >>> foo = Foo() >>> foo[1,2,3,4] (1, 2, 3, 4) >>> foo[1:2:3, 4:5:6, 7, 8, 9] (slice(1, 2, 3), slice(4, 5, 6), 7, 8, 9) The slice syntax already constructs a tuple if it gets a complex set of argument. That isn't being changed. The only thing that it does is expand what builtin types can accept through the existing syntax. It does not restrict, or make any change, at a level that will prevent anything else from using that same syntax in other ways. As a way to allow new-slices and the current slices together/overlap in a transition period, we could just require one extra value to be passed, which would cause a tuple to be created and the __getitem__ method could then use the newer indexing on the slice. s[i:j:k] # current indexing. s[i:j:k, ''] # new indexing... Null string or None causes tuple to be created. (or a callable that takes a slice.) > However, I also agree containment would be a better way to go than > subclassing. > > I'm currently thinking that a fourth "adjust" argument to the slice > constructor may work, and call that from the indices method as: > > def adjust_indices(start, stop, step, length): > ... > Currently the length adjustment is made by the __getitem__ method calling the indices method as in this example. >>> class Foo(list): ... def __getitem__(self, slc): ... print(slc.indices(len(self))) ... >>> foo = Foo(range(10)) >>> foo [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] >>> foo[:] (0, 10, 1) >>> foo[::-1] (9, -1, -1) # The altered indices we don't like from the indices method. So you don't need to add the fourth length argument if the change is made in __getitem__ and __setitem__. Or possibly you can do it just in the slices, indices method. > The values passed in would be those from the slice constructor plus the > length passed to the indices method. The only preconditioning would be > the check for a non-zero step. > > The result would be used as the result of the indices method. > Did you see this part of the tests? > # And various other combinations. > > >>> s = Str('123456789') > > >>> s[3, onei] # ones indexing > '3' > > >>> s[4:8, onei] # ones indexing with slice > '4567' > > >>> s[4:8, onei, openi] # open interval > '567' > > >>> s[4:8, onei, closedi] # closed interval > '45678' These all were very easy to implement, and did not require any extra logic added to the underlying __getitem__ code other than calling the passed functions in the tuple. It moves these cases out of the object being sliced in a nice way. Other ways of doing it would require keywords and logic for each case to be included in the objects. Cheers,adjustment Ron -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ron3200 at gmail.com Tue Nov 5 03:25:27 2013 From: ron3200 at gmail.com (Ron Adam) Date: Mon, 04 Nov 2013 20:25:27 -0600 Subject: [Python-ideas] Possible new slice behaviour? Was ( Negative slice discussion.) In-Reply-To: <52784E56.907@gmail.com> References: <52784E56.907@gmail.com> Message-ID: On 11/04/2013 07:48 PM, Ron Adam wrote: > > Cheers,adjustment > Ron I seem to be getting random inserts of pieces I've cut from other places in my thunderbird email client. 'adjustment' wasn't there when I posted. I hope this will be fixed in a Ubuntu update soon. Ron From greg.ewing at canterbury.ac.nz Tue Nov 5 11:51:12 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 05 Nov 2013 23:51:12 +1300 Subject: [Python-ideas] Possible new slice behaviour? Was ( Negative slice discussion.) In-Reply-To: References: Message-ID: <5278CDA0.90307@canterbury.ac.nz> Ron Adam wrote: > >>> s[:100, trimi, reversei] -1. This conflicts with the convention of indexing multi-dimensional data structures using comma-separated expressions. -- Greg From ncoghlan at gmail.com Tue Nov 5 12:02:02 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 5 Nov 2013 21:02:02 +1000 Subject: [Python-ideas] Possible new slice behaviour? Was ( Negative slice discussion.) In-Reply-To: <52784E56.907@gmail.com> References: <52784E56.907@gmail.com> Message-ID: On 5 Nov 2013 11:48, "Ron Adam" wrote: > > On 11/04/2013 04:40 PM, Nick Coghlan wrote: >> >> >> On 5 Nov 2013 03:35, "Ron Adam" wrote: >> > >> > >> > This is one solution to what we can do to make slices both easier to understand and work in a much more consistent and flexible way. >> > >> > This matches the slice semantics that Guido likes. >> > (possibly for python 4.) >> > >> > The ability to pass callables along with the slice creates an easy and clean way to add new indexing modes. (As Nick suggested.) >> >> Tuples can't really be used for this purpose, since that's incompatible with multi-dimensional indexing. > > Are there plans for pythons builtin types to use multidimensional indexing? Yes, memoryview already has some support for multidimensional array shapes, and that's likely to be enhanced further in 3.5. > I don't think what I'm suggesting would create an issue with it in either. It may even be complementary. > > Either I'm missing something, or you aren't quite understanding where the changes I'm suggesting are to be made. As long as the change is made local to the object that uses it, it won't effect any other types uses of slices. And what is passed in a tuple is different from specifying the meaning of a tuple. You're proposing a mechanism for slice index customisation that would be ambiguous and thoroughly confusing when used to define a slice as part of a multidimensional array access. Remember, the Ellipsis was first added specifically as part of multi-dimensional indexing notation for the scientific community. Even though the stdlib only partially supports it today, the conventions for multidimensional slicing are defined by NumPy and need to be taken into account in future design changes. > There may be other reasons this may not be a bad idea, but I can't think of any myself at the moment. Possibly because a callable passed with a slice may alter the object, but that could be limited by giving the callable a length instead of of the object itself. But personally I like that it's open ended and not limited by the syntax. I like the idea of a passing a callable. 
I just think it should be an extra optional argument to slice that is used to customise the result of calling indices() rather than changing the type seen by the underlying container. > > > Consider this... > > >>> a = list(range(10)) > >>> a > [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] > >>> b = list([a] * 3) > >>> b > [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]] > >>> a[2:5, 1:2] > > Traceback (most recent call last): > File "", line 1, in > TypeError: list indices must be integers, not tuple > > Python lists currently don't know what to do with a tuple. In order to do anything else, the __getitem__ and __setitem__ methods need to be overridden. For that reason, it can't cause an issue with anything as long as the change is kept *local to the object(s)* that use it. Except for all the humans that will have to read it, and the confusion of applying it to multidimensional array operations. > > Making changes at the syntax level, or even slice level could be disruptive though. (This doesn't do that.) And hence only works with types that have been updated to support it. We already did that once for extended slicing support, so let's not do it again when there are other alternatives available. However, using a custom container type is a good way to experiment, so I've gone back to not wanting to permit slice subclasses at this point (since containment is sufficient when experimenting with a custom container). > > >>> class Foo: > ... def __getitem__(self, args): > ... print(args) > ... > >>> foo = Foo() > >>> foo[1,2,3,4] > (1, 2, 3, 4) > >>> foo[1:2:3, 4:5:6, 7, 8, 9] > (slice(1, 2, 3), slice(4, 5, 6), 7, 8, 9) > > The slice syntax already constructs a tuple if it gets a complex set of argument. That isn't being changed. > > The only thing that it does is expand what builtin types can accept through the existing syntax. It does not restrict, or make any change, at a level that will prevent anything else from using that same syntax in other ways. Yes, I realise it requires changes to all the container types. That's one of the problems with the idea. So, yes, I did understand your proposal, and definitely like the general idea of passing in a callable to customise the index calculations. I just don't like the specifics of it, both because of the visual confusion with multidimensional indexing and because we've been through this before with extended slicing and requiring changes to every single container type is a truly painful way to make a transition. Implementing a change through the slice object instead would mean that any transition problems would be isolated to using the new feature with containers that didn't call slice.indices or the C API equivalent when calculating slice indices. Cheers, Nick. > > > As a way to allow new-slices and the current slices together/overlap in a transition period, we could just require one extra value to be passed, which would cause a tuple to be created and the __getitem__ method could then use the newer indexing on the slice. > > s[i:j:k] # current indexing. > s[i:j:k, ''] # new indexing... Null string or None causes tuple to be created. (or a callable that takes a slice.) > > > >> However, I also agree containment would be a better way to go than subclassing. >> >> I'm currently thinking that a fourth "adjust" argument to the slice constructor may work, and call that from the indices method as: >> >> def adjust_indices(start, stop, step, length): >> ... 
> > Currently the length adjustment is made by the __getitem__ method calling the indices method as in this example. > > >>> class Foo(list): > ... def __getitem__(self, slc): > ... print(slc.indices(len(self))) > ... > >>> foo = Foo(range(10)) > >>> foo > [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] > >>> foo[:] > (0, 10, 1) > >>> foo[::-1] > (9, -1, -1) # The altered indices we don't like from the indices method. > > > So you don't need to add the fourth length argument if the change is made in __getitem__ and __setitem__. > Or possibly you can do it just in the slices, indices method. > > > >> The values passed in would be those from the slice constructor plus the length passed to the indices method. The only preconditioning would be the check for a non-zero step. > > >> The result would be used as the result of the indices method. > > > Did you see this part of the tests? > > > > # And various other combinations. > > > > >>> s = Str('123456789') > > > > >>> s[3, onei] # ones indexing > > '3' > > > > >>> s[4:8, onei] # ones indexing with slice > > '4567' > > > > >>> s[4:8, onei, openi] # open interval > > '567' > > > > >>> s[4:8, onei, closedi] # closed interval > > '45678' > > > These all were very easy to implement, and did not require any extra logic added to the underlying __getitem__ code other than calling the passed functions in the tuple. It moves these cases out of the object being sliced in a nice way. Other ways of doing it would require keywords and logic for each case to be included in the objects. > > Cheers,adjustment > Ron > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From oscar.j.benjamin at gmail.com Tue Nov 5 14:32:46 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Tue, 5 Nov 2013 13:32:46 +0000 Subject: [Python-ideas] Possible new slice behaviour? Was ( Negative slice discussion.) In-Reply-To: <5278CDA0.90307@canterbury.ac.nz> References: <5278CDA0.90307@canterbury.ac.nz> Message-ID: On 5 November 2013 10:51, Greg Ewing wrote: > Ron Adam wrote: >> >> >>> s[:100, trimi, reversei] > > > -1. This conflicts with the convention of indexing multi-dimensional > data structures using comma-separated expressions. Agreed. Numpy users are the biggest consumers of slicing. Any proposal to improve slicing had better improve it for numpy as well which means it should work in multidimensional slicing context - regardless of whether numpy is in the stdlib. Oscar From ron3200 at gmail.com Tue Nov 5 19:13:50 2013 From: ron3200 at gmail.com (Ron Adam) Date: Tue, 05 Nov 2013 12:13:50 -0600 Subject: [Python-ideas] Possible new slice behaviour? Was ( Negative slice discussion.) In-Reply-To: References: <52784E56.907@gmail.com> Message-ID: On 11/05/2013 05:02 AM, Nick Coghlan wrote: > > Are there plans for pythons builtin types to use multidimensional indexing? > > Yes, memoryview already has some support for multidimensional array shapes, > and that's likely to be enhanced further in 3.5. Ok, I suppose other builtin types may (or may not) follow that pattern. But in this light, I agree, it's best not to create a more complex pattern to handle for those cases. (... and it worked so nicely, Oh Well.) The alternative is to have a function that does what the slice syntax does. And then extend that. It seems to me it's a good idea to have function equivalents of syntax when possible in any case. do_slice(obj, slices, *callables) Where slices is either a single slice or a tuple of slices or indices. 
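Roughly, the skeleton I have in mind looks something like this (only a sketch: the adjustment
callables are assumed to take a (start, stop, step, length) tuple, plain indices are passed
through untouched, and the range()/wrapi cases further down would need one extra branch):

def do_slice(obj, slices, *callables):
    """Apply an index, a slice, or a tuple of them to obj, letting each
    callable adjust the (start, stop, step) triple before indexing."""
    def one(item):
        if isinstance(item, slice):
            start, stop, step = item.indices(len(obj))
            for adjust in callables:
                start, stop, step = adjust(start, stop, step, len(obj))
            return obj[start:stop:step]
        return obj[item]
    if isinstance(slices, tuple):
        return [one(item) for item in slices]
    return one(slices)

# One possible adjustment callable: treat the slice as an open interval,
# i.e. also exclude the element at 'start'.
def openi(start, stop, step, length):
    return start + step, stop, step
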
# Examples class GetSlice: """Return a slices from slice syntax.""" def __getitem__(self, slc): return slc gs = GetSlice() seq = list(range(10)) print(do_slice(seq, gs[1, 5, 7])) print(do_slice(seq, gs[3:7], openi)) print(do_slice(seq, gs[3:7], closedi)) print(do_slice(seq, gs[3:7], closedi, onei)) print(do_slice(seq, gs[3:5, 7:8, 9], reversei)) print(do_slice(seq, gs[:], reversei)) print(do_slice(seq, range(-5, 15), wrapi)) print(do_slice(seq, range(15, -5, -1), wrapi)) """ [1, 5, 7] [4, 5, 6] [3, 4, 5, 6, 7] [2, 3, 4, 5, 6] [[4, 3], [7], 9] [9, 8, 7, 6, 5, 4, 3, 2, 1, 0] [5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4] [5, 4, 3, 2, 1, 0, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0, 9, 8, 7, 6] """ Cheers, Ron From antony.lee at berkeley.edu Sun Nov 10 09:41:10 2013 From: antony.lee at berkeley.edu (Antony Lee) Date: Sun, 10 Nov 2013 00:41:10 -0800 Subject: [Python-ideas] Issues with inspect.Parameter Message-ID: The docstring of inspect.Parameter indicates the "default" and "annotation" attributes are not set if the parameter does not have, respectively, a default value and an annotation, and that the "kind" attribute is a string. But in fact, the "default" and "annotation" attributes are set to "inspect._empty (== Parameter.empty)" in that case, and the "kind" attribute has type "_ParameterKind" (essentially a hand-written equivalent of IntEnum). I suggest to correct the docstring accordingly, and to replace the implementation of _ParameterKind by a proper IntEnum (if full backwards compatibility is required), or even just by Enum (which makes a bit more sense, as the fact that _ParameterKind is a subclass of int doesn't seem to be documented anywhere). Antony -------------- next part -------------- An HTML attachment was scrubbed... URL: From tarek at ziade.org Sun Nov 10 10:55:16 2013 From: tarek at ziade.org (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sun, 10 Nov 2013 10:55:16 +0100 Subject: [Python-ideas] where statement in Pyret Message-ID: <527F5804.9010605@ziade.org> Hey I've read about Pyret on hackernews: http://www.pyret.org/ and found the 'where' statement very compeling. Functions can end with a where that contains small unit tests. >From the documentation example: fun sum(l): cases(List) l: | empty => 0 | link(first, rest) => first + sum(rest) end where: sum([]) is 0 sum([1, 2, 3]) is 6 end It's quite similar to the doctests ideas I guess - but not intended to be documentation like them. I ended up disliking docttests because of this doc+test duality by the way: it often ends up as a not so good documentation and not so good tests. Anyways, having a dedicated keyword to append after a function some tests as part of the language has benefits imho: - the scope is reduced to the function - so it helps making 'real' isolated unit tests. - we do have the unittest conventions, but here it make tests a first class citizen in the language. Cheers Tarek -------------- next part -------------- An HTML attachment was scrubbed... URL: From tarek at ziade.org Sun Nov 10 11:06:33 2013 From: tarek at ziade.org (=?UTF-8?B?VGFyZWsgWmlhZMOp?=) Date: Sun, 10 Nov 2013 11:06:33 +0100 Subject: [Python-ideas] where statement in Pyret In-Reply-To: References: <527F5804.9010605@ziade.org> Message-ID: <527F5AA9.2000405@ziade.org> Le 11/10/13 11:00 AM, Markus Unterwaditzer a ?crit : > While it is a nice idea, i don't think this feature deserves its own > syntax. 
Besides doctests, another way of achieving this in Python > might be: > > def sum(l): > # implementation of sum > > def test_sum(): > assert sum([]) == 0 > assert sum([1, 2, 3]) == 6 > > which IMO is nice enough. yes, that's what we all already do: tests in test_xxx functions. And the adopted convention is to have the tests in dedicated tests modules, which defeats the benefit of having real isolated unit tests just by the function code. And even if we place the test function just besides the tested function, Python will not make any distinction : they are both just functions. Having the ability to distinguish tests and regular code at the language level has benefits like the ability to ignore tests when you run the app in production etc. Cheers Tarek > > -- Markus > > "Tarek Ziad?" wrote: > > Hey > > I've read about Pyret on hackernews: http://www.pyret.org/ > > and found the 'where' statement very compeling. Functions can end > with a where that contains small unit > tests. > > From the documentation example: > > fun sum(l): > cases(List) l: > | empty => 0 > | link(first, rest) => first + sum(rest) > end > where: > sum([]) is 0 > sum([1, 2, 3]) is 6 > end > > > It's quite similar to the doctests ideas I guess - but not > intended to be documentation like them. > > I ended up disliking docttests because of this doc+test duality by > the way: it often ends up as a > not so good documentation and not so good tests. > > Anyways, having a dedicated keyword to append after a function > some tests as part of the language > has benefits imho: > > - the scope is reduced to the function - so it helps making 'real' > isolated unit tests. > - we do have the unittest conventions, but here it make tests a > first class citizen in the language. > > Cheers > Tarek > > ------------------------------------------------------------------------ > > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From markus at unterwaditzer.net Sun Nov 10 11:00:03 2013 From: markus at unterwaditzer.net (Markus Unterwaditzer) Date: Sun, 10 Nov 2013 11:00:03 +0100 Subject: [Python-ideas] where statement in Pyret In-Reply-To: <527F5804.9010605@ziade.org> References: <527F5804.9010605@ziade.org> Message-ID: While it is a nice idea, i don't think this feature deserves its own syntax. Besides doctests, another way of achieving this in Python might be: def sum(l): # implementation of sum def test_sum(): assert sum([]) == 0 assert sum([1, 2, 3]) == 6 which IMO is nice enough. -- Markus "Tarek Ziad?" wrote: >Hey > >I've read about Pyret on hackernews: http://www.pyret.org/ > >and found the 'where' statement very compeling. Functions can end with >a >where that contains small unit >tests. > >>From the documentation example: > >fun sum(l): > cases(List) l: > | empty => 0 > | link(first, rest) => first + sum(rest) > end >where: > sum([]) is 0 > sum([1, 2, 3]) is 6 >end > > >It's quite similar to the doctests ideas I guess - but not intended to >be documentation like them. > >I ended up disliking docttests because of this doc+test duality by the >way: it often ends up as a >not so good documentation and not so good tests. > >Anyways, having a dedicated keyword to append after a function some >tests as part of the language >has benefits imho: > >- the scope is reduced to the function - so it helps making 'real' >isolated unit tests. 
>- we do have the unittest conventions, but here it make tests a first >class citizen in the language. > >Cheers >Tarek > > > >------------------------------------------------------------------------ > >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From markus at unterwaditzer.net Sun Nov 10 11:26:06 2013 From: markus at unterwaditzer.net (Markus Unterwaditzer) Date: Sun, 10 Nov 2013 11:26:06 +0100 Subject: [Python-ideas] where statement in Pyret In-Reply-To: <527F5AA9.2000405@ziade.org> References: <527F5804.9010605@ziade.org> <527F5AA9.2000405@ziade.org> Message-ID: I agree that such hints about the tests would be nice for performance, but i don't think considerations about performance were ever given a high priority during the design of Python. How about this: def sum(): # implementation @sum.test def test_sum(): assert sum([]) == 0 This potentially could do the same things you expected from the where-statement, without introducing new keywords. Maybe this could also be buried into some stdlib module in order to avoid "polluting" the core language if this way of writing tests doesn't gain any traction. import unittest # or whatever def sum(): # implementation @unittest.test_func(sum): def test_sum(): assert sum([]) == 0 -- Markus On 2013-11-10 11:06, Tarek Ziad? wrote: > Le 11/10/13 11:00 AM, Markus Unterwaditzer a ?crit?: > >> While it is a nice idea, i don't think this feature deserves its own >> syntax. Besides doctests, another way of achieving this in Python >> might be: >> >> def sum(l): >> # implementation of sum >> >> def test_sum(): >> assert sum([]) == 0 >> assert sum([1, 2, 3]) == 6 >> >> which IMO is nice enough. > yes, that's what we all already do: tests in test_xxx functions. And > the adopted convention > is to have the tests in dedicated tests modules, which defeats the > benefit of having > real isolated unit tests just by the function code. > > And even if we place the test function just besides the tested > function, Python will not > make any distinction : they are both just functions. > > Having the ability to distinguish tests and regular code at the > language level > has benefits like the ability to ignore tests when you run the app > in production etc. > > Cheers > Tarek > >> -- Markus >> >> "Tarek Ziad?" wrote: >> >>> Hey >>> >>> I've read about Pyret on hackernews: http://www.pyret.org/ [2] >>> >>> and found the 'where' statement very compeling. Functions can end >>> with a where that contains small unit >>> tests. >>> >>> From the documentation example: >>> >>> fun sum(l): >>> cases(List) l: >>> | empty => 0 >>> | link(first, rest) => first + sum(rest) >>> end >>> where: >>> sum([]) is 0 >>> sum([1, 2, 3]) is 6 >>> end >>> ? >>> It's quite similar to the doctests ideas I guess - but not intended >>> to be documentation like them. >>> >>> I ended up disliking docttests because of this doc+test duality by >>> the way: it often ends up as a >>> not so good documentation and not so good tests. >>> >>> Anyways, having a dedicated keyword to append after a function some >>> tests as part of the language >>> has benefits imho: >>> >>> - the scope is reduced to the function - so it helps making 'real' >>> isolated unit tests. >>> - we do have the unittest conventions, but here it make tests a >>> first class citizen in the language. 
>>> >>> Cheers >>> Tarek >>> >>> ------------------------- >>> >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas [1] > > > > Links: > ------ > [1] https://mail.python.org/mailman/listinfo/python-ideas > [2] http://www.pyret.org/ From steve at pearwood.info Sun Nov 10 13:01:12 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 10 Nov 2013 23:01:12 +1100 Subject: [Python-ideas] where statement in Pyret In-Reply-To: <527F5804.9010605@ziade.org> References: <527F5804.9010605@ziade.org> Message-ID: <20131110120111.GB2085@ando> On Sun, Nov 10, 2013 at 10:55:16AM +0100, Tarek Ziad? wrote: > Hey > > I've read about Pyret on hackernews: http://www.pyret.org/ Looks very interesting. > and found the 'where' statement very compeling. Functions can end with a > where that contains small unit tests. > > From the documentation example: > > fun sum(l): > cases(List) l: > | empty => 0 > | link(first, rest) => first + sum(rest) > end > where: > sum([]) is 0 > sum([1, 2, 3]) is 6 > end Sadly, I *really* dislike that. To me, "where" has absolutely nothing to do with testing. I see that Pyret also includes a "check" keyword which also does testing. That seems like a more sensible keyword. I would prefer to see some variation on Nick Coglan's ideas about a "where" keyword for local scoping of temporary variables. This is an idea that's been floating around for a long time: https://mail.python.org/pipermail/python-list/2005-January/329539.html Quite frankly, although I believe that tests are of vital importance, I remain to be convinced whether they should be "a first-class citizen of the language" as you put it. In my experience, for every line of code you write, you'll probably need anything from 3-10 lines of test code. I don't think that much test code belongs in the same module as the function itself -- that's an invitation to have people stint on their testing. Look at the example given -- do you really feel that this is sufficient testing for the sum() function? I don't object to having a few, simple, fast tests in the main module, but for "real" unit testing and regression testing, they ought to be moved out into a separate file. Tests are often, usually, boring code which just distracts from the code you care about. I would hate to see it become common to have something like this: def some_function(): # 20 lines of code where: # 170 lines of tests become standard. But in any case, I think it is far too early to be thinking about stealing an idea from Pyret. The language isn't even stable yet, let alone at the stage where this idea has proven itself in the real world. Pyret and the where clause is still experimental. > It's quite similar to the doctests ideas I guess - but not intended to > be documentation like them. > > I ended up disliking docttests because of this doc+test duality by the > way: it often ends up as a > not so good documentation and not so good tests. In my experience, people who write poor doctests would write equally poor unit tests, or "where" tests if Python had this feature. > Anyways, having a dedicated keyword to append after a function some > tests as part of the language > has benefits imho: > > - the scope is reduced to the function - so it helps making 'real' > isolated unit tests. I don't understand this comment. In what way are unit tests from the unittest module, or doctests, not "real" unit tests? 
-- Steven From ncoghlan at gmail.com Sun Nov 10 15:48:07 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 11 Nov 2013 00:48:07 +1000 Subject: [Python-ideas] where statement in Pyret In-Reply-To: <527F5804.9010605@ziade.org> References: <527F5804.9010605@ziade.org> Message-ID: On 10 November 2013 19:55, Tarek Ziad? wrote: > From the documentation example: > > fun sum(l): > cases(List) l: > | empty => 0 > | link(first, rest) => first + sum(rest) > end > where: > sum([]) is 0 > sum([1, 2, 3]) is 6 > end It would make more sense to just bake py.test style rich assertions into the language in some way and let people write: def sum(iterable): # implementation of sum assert sum([]) == 0 assert sum([1, 2, 3]) == 6 A mechanism to say "always execute assert statements in this module regardless of optimisation level" could also be useful. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From skip at pobox.com Sun Nov 10 16:21:38 2013 From: skip at pobox.com (Skip Montanaro) Date: Sun, 10 Nov 2013 09:21:38 -0600 Subject: [Python-ideas] where statement in Pyret In-Reply-To: References: Message-ID: > Sadly, I *really* dislike that. To me, "where" has absolutely nothing to > do with testing. I see that Pyret also includes a "check" keyword which > also does testing. That seems like a more sensible keyword. "where" seems like it would implement argument constraints. I don't think you want the (necessary) mess of comprehensive unit test code embedded in the source, so nix on the "check" keyword too. It would just obscure the structure of the code. Skip -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sun Nov 10 17:12:19 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 10 Nov 2013 17:12:19 +0100 Subject: [Python-ideas] where statement in Pyret In-Reply-To: References: <527F5804.9010605@ziade.org> Message-ID: Le 10/11/2013 15:48, Nick Coghlan a ?crit : > It would make more sense to just bake py.test style rich assertions > into the language in some way and let people write: > > def sum(iterable): > # implementation of sum > > assert sum([]) == 0 > assert sum([1, 2, 3]) == 6 This has the same problem as doctests: it works well for trivial tests like the above, but will be difficult to scale towards more complicated testing. unittest-like structuration is really what works best for most testing situations, IMO. Alternative testing schemes for "easier" or "more intuitive" testing have generally failed as general-purpose tools. Regards Antoine. From ron3200 at gmail.com Sun Nov 10 19:37:24 2013 From: ron3200 at gmail.com (Ron Adam) Date: Sun, 10 Nov 2013 12:37:24 -0600 Subject: [Python-ideas] where statement in Pyret In-Reply-To: References: <527F5804.9010605@ziade.org> Message-ID: On 11/10/2013 08:48 AM, Nick Coghlan wrote: > On 10 November 2013 19:55, Tarek Ziad? wrote: >> > From the documentation example: >> > >> >fun sum(l): >> > cases(List) l: >> > | empty => 0 >> > | link(first, rest) => first + sum(rest) >> > end >> >where: >> > sum([]) is 0 >> > sum([1, 2, 3]) is 6 >> >end > It would make more sense to just bake py.test style rich assertions > into the language in some way and let people write: > > def sum(iterable): > # implementation of sum > > assert sum([]) == 0 > assert sum([1, 2, 3]) == 6 > A mechanism to say "always execute assert statements in this module > regardless of optimisation level" could also be useful. 
+ 1 Currently assert statements are removed if the -O flag is used, but somehow that never seemed quite right to me. It makes more sense to have a -A option to turn them on, rather than using -O to turn them off. That change *along with* a way to "always execute asserts in this scope" would make asserts more useful. Cheers, Ron From mertz at gnosis.cx Sun Nov 10 19:55:08 2013 From: mertz at gnosis.cx (David Mertz) Date: Sun, 10 Nov 2013 10:55:08 -0800 Subject: [Python-ideas] where statement in Pyret In-Reply-To: References: <527F5804.9010605@ziade.org> Message-ID: On Sun, Nov 10, 2013 at 10:37 AM, Ron Adam wrote: > def sum(iterable): > >> # implementation of sum >> >> assert sum([]) == 0 >> assert sum([1, 2, 3]) == 6 >> > > A mechanism to say "always execute assert statements in this module >> regardless of optimisation level" could also be useful. >> > > That change *along with* a way to "always execute asserts in this scope" > would make asserts more useful. > assert not test_mode or sum([]) == 0 assert not test_mode or sum([1, 2, 3]) == 6 -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Nov 10 22:57:21 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 11 Nov 2013 07:57:21 +1000 Subject: [Python-ideas] where statement in Pyret In-Reply-To: References: <527F5804.9010605@ziade.org> Message-ID: On 11 Nov 2013 02:12, "Antoine Pitrou" wrote: > > Le 10/11/2013 15:48, Nick Coghlan a ?crit : > >> It would make more sense to just bake py.test style rich assertions >> into the language in some way and let people write: >> >> def sum(iterable): >> # implementation of sum >> >> assert sum([]) == 0 >> assert sum([1, 2, 3]) == 6 > > > This has the same problem as doctests: it works well for trivial tests like the above, but will be difficult to scale towards more complicated testing. Yeah, part of my point was actually that module level assertions allow this kind of thing today, and there are good reasons people don't do it in practice. > unittest-like structuration is really what works best for most testing situations, IMO. Alternative testing schemes for "easier" or "more intuitive" testing have generally failed as general-purpose tools. Agreed, but the spelling of test assertions as methods on test cases isn't an essential part of that structure. For 3.5, it would be nice to offer either testtools style "matcher" objects or py.test style always-on-in-test-modules rich assertions. Either approach has a real world base to draw from, but would still involve a fair bit of work (they're also not mutually exclusive - a matcher based approach could be used as the back end for rich assertions). Cheers, Nick. > > Regards > > Antoine. > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From greg.ewing at canterbury.ac.nz Sun Nov 10 23:10:51 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 11 Nov 2013 11:10:51 +1300 Subject: [Python-ideas] where statement in Pyret In-Reply-To: <20131110120111.GB2085@ando> References: <527F5804.9010605@ziade.org> <20131110120111.GB2085@ando> Message-ID: <5280046B.3080108@canterbury.ac.nz> Steven D'Aprano wrote: > I would prefer to see some variation on Nick Coglan's ideas about a > "where" keyword for local scoping of temporary variables. I second that. This is the way mathematicians use the word "where", and it would be a much better way of spending a keyword. > In my experience, for every line of code > you write, you'll probably need anything from 3-10 lines of test code. Indeed. Also, the situations where you can meaningfully test each function on its own using a few concisely- expressed test cases are relatively rare, IMO. It looks good for the kind of exercises you find in programming courses, but it doesn't scale up to real-life code that requires complex data structures and test harnesses to be set up. -- Greg From tjreedy at udel.edu Mon Nov 11 02:21:33 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 10 Nov 2013 20:21:33 -0500 Subject: [Python-ideas] where statement in Pyret In-Reply-To: <527F5AA9.2000405@ziade.org> References: <527F5804.9010605@ziade.org> <527F5AA9.2000405@ziade.org> Message-ID: On 11/10/2013 5:06 AM, Tarek Ziad? wrote: > Le 11/10/13 11:00 AM, Markus Unterwaditzer a ?crit : >> While it is a nice idea, i don't think this feature deserves its own >> syntax. To put it another way, we already have syntax support for testing: assert, which is sugar for conditional stateements, but with extra features; and functions, which are executed when called. Like most suggested new keywords, 'where' is certain to by in use already as an identifier. So we need a really good reason, with no better alternative, to make it a keyword. >> [test_xxx functions] > yes, that's what we all already do: tests in test_xxx functions. > And the adopted convention is to have the tests in dedicated > tests modules, which defeats the benefit of having > real isolated unit tests just by the function code. As others have noted, one big reason for the convention is that testing class methods usually requires big chunks of code that are better isolated in another file. However, the convention is not a rule, and for pure module-level functions that run in isolation, one is free to include tests in the doc string or just after. See example below. > And even if we place the test function just besides the tested function, > Python will not make any distinction : they are both just functions. Functions are fine. Distinction is easily done by an obvious name convention. > Having the ability to distinguish tests and regular code at the language > level has benefits like the ability to ignore tests when you run the app in > production etc. Functions only run when called. For my book, most example code consists of classical functions. For these, I am doing the following, adapted for your sum example. For didactic reasons, I like having the tests immediately follow the function code; the input-output pairs serve as testable documentation. (This code does not run at the moment as I am midstream in changing the test module.) The first and last statements are boilerplate that is part of a _template.py file. 
---- from xploro.test import main, ftest def sum_rec(seq): if seq: return seq[0] + sum_rec(seq[1:]) else: return 0 def sum_for(seq): ret = 0 for num in seq: ret += num return ret def test_sum(): ftest((sum_rec, sum_for), (([], 0), ([1], 1), ([1,2,3], 6),) ) if __name__ == '__main__': main() ---- ftest calls each function with each input of the input-output pairs and checks that the function output matches the output given. main scans globals() for functions named 'test_xxx' and calls them. Anyway, I prefer the above to the 'where' suggestion. -- Terry Jan Reedy From tarek at ziade.org Mon Nov 11 09:58:06 2013 From: tarek at ziade.org (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Mon, 11 Nov 2013 09:58:06 +0100 Subject: [Python-ideas] where statement in Pyret In-Reply-To: References: <527F5804.9010605@ziade.org> Message-ID: <52809C1E.80505@ziade.org> Le 11/10/13 10:57 PM, Nick Coghlan a ?crit : > > > On 11 Nov 2013 02:12, "Antoine Pitrou" > wrote: > > > > Le 10/11/2013 15:48, Nick Coghlan a ?crit : > > > >> It would make more sense to just bake py.test style rich assertions > >> into the language in some way and let people write: > >> > >> def sum(iterable): > >> # implementation of sum > >> > >> assert sum([]) == 0 > >> assert sum([1, 2, 3]) == 6 > > > > > > This has the same problem as doctests: it works well for trivial > tests like the above, but will be difficult to scale towards more > complicated testing. > > Yeah, part of my point was actually that module level assertions allow > this kind of thing today, and there are good reasons people don't do > it in practice. > Interesting... what if we could mark a block of tests like this : def sum(iterable): # implementation of sum asserts: assert sum([]) == 0 assert sum([1, 2, 3]) == 6 ... ..more complicated testing here... Cheers Tarek -------------- next part -------------- An HTML attachment was scrubbed... URL: From tarek at ziade.org Mon Nov 11 09:59:05 2013 From: tarek at ziade.org (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Mon, 11 Nov 2013 09:59:05 +0100 Subject: [Python-ideas] where statement in Pyret In-Reply-To: <5280046B.3080108@canterbury.ac.nz> References: <527F5804.9010605@ziade.org> <20131110120111.GB2085@ando> <5280046B.3080108@canterbury.ac.nz> Message-ID: <52809C59.10004@ziade.org> Le 11/10/13 11:10 PM, Greg Ewing a ?crit : > Steven D'Aprano wrote: >> I would prefer to see some variation on Nick Coglan's ideas about a >> "where" keyword for local scoping of temporary variables. > > I second that. This is the way mathematicians use the > word "where", and it would be a much better way of > spending a keyword. I agree that 'where' is not a good name for tests. From tarek at ziade.org Mon Nov 11 10:10:26 2013 From: tarek at ziade.org (=?UTF-8?B?VGFyZWsgWmlhZMOp?=) Date: Mon, 11 Nov 2013 10:10:26 +0100 Subject: [Python-ideas] where statement in Pyret In-Reply-To: References: <527F5804.9010605@ziade.org> <527F5AA9.2000405@ziade.org> Message-ID: <52809F02.4060701@ziade.org> Le 11/11/13 2:21 AM, Terry Reedy a ?crit : > .. > >> Having the ability to distinguish tests and regular code at the language >> level has benefits like the ability to ignore tests when you run the >> app in >> production etc. > > Functions only run when called. I would not say that. You often have meta-level code like class-level bits or decorators, the interpreter will read -- that will load stuff. In other words, a program with tests functions in your code will be different in memory that the same program without tests functions. 
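Today the closest thing is to guard the definitions by hand, something like this (just a
sketch, with a made-up RUN_SELF_TESTS switch):

import os

def sum_for(seq):
    ret = 0
    for num in seq:
        ret += num
    return ret

if os.environ.get('RUN_SELF_TESTS'):
    # only defined when the switch is set, so production never pays for it
    def test_sum():
        assert sum_for([]) == 0
        assert sum_for([1, 2, 3]) == 6

It works, but it is ad hoc: the language itself still makes no distinction between the test
code and the regular code.
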
> > For my book, most example code consists of classical functions. For > these, I am doing the following, adapted for your sum example. For > didactic reasons, I like having the tests immediately follow the > function code; the input-output pairs serve as testable documentation. > (This code does not run at the moment as I am midstream in changing > the test module.) The first and last statements are boilerplate that > is part of a _template.py file. > > ---- > from xploro.test import main, ftest > > def sum_rec(seq): > if seq: > return seq[0] + sum_rec(seq[1:]) > else: > return 0 > > def sum_for(seq): > ret = 0 > for num in seq: > ret += num > return ret > > def test_sum(): > ftest((sum_rec, sum_for), (([], 0), ([1], 1), ([1,2,3], 6),) ) > > if __name__ == '__main__': > main() > ---- > > ftest calls each function with each input of the input-output pairs > and checks that the function output matches the output given. > main scans globals() for functions named 'test_xxx' and calls them. > > Anyway, I prefer the above to the 'where' suggestion. That's a good template indeed, I have 3 remarks though: 1/ you are importing your test framework even if you don't run the tests. 2/ those are not pure unit tests, since test_sum() tests 2 separate functions - but I guess this rule can be broken in this case. 3/ main() is an optional artifact from unittest - most developers your a script that takes care of the test discovery (unittest2, nosetests, etc) So I would rather write in plain python: ---- def sum_rec(seq): if seq: return seq[0] + sum_rec(seq[1:]) else: return 0 def sum_for(seq): ret = 0 for num in seq: ret += num return ret def test_sum(): from xploro.test import main, ftest ftest((sum_for, sum_rec), (([], 0), ([1], 1), ([1,2,3], 6),) ) ---- Cheers Tarek From steve at pearwood.info Mon Nov 11 11:39:31 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 11 Nov 2013 21:39:31 +1100 Subject: [Python-ideas] where statement in Pyret In-Reply-To: <52809C1E.80505@ziade.org> References: <527F5804.9010605@ziade.org> <52809C1E.80505@ziade.org> Message-ID: <20131111103931.GR15615@ando> On Mon, Nov 11, 2013 at 09:58:06AM +0100, Tarek Ziad? wrote: > Interesting... what if we could mark a block of tests like this : > > def sum(iterable): > # implementation of sum > > asserts: > assert sum([]) == 0 > assert sum([1, 2, 3]) == 6 > ... > ..more complicated testing here... I often spell that: if __debug__: assert sum([]) == 0 # more complicated testing here I've been playing around with doctest, and I think it may be useful to have a "check" decorator that you use like this: @check def spam(): """Return spam. >>> spam() 'spam spam spam' """ return ' '.join(['spam']*3) If the doctests pass, the function is returned unchanged, otherwise an exception is raised. Note that this will encourage a specific style of more limited tests focused on just one function at a time. Normally, doctests aren't run until after the entire module is loaded, which lets you do things like this: def spam(): """Return spam. >>> spam() 'spam spam spam' Like ham, but more tasty: >>> taste(spam()) > taste(ham()) True """ return ' '.join(['spam']*3) # definitions of taste and ham follow later in the module. That *won't work* with a check decorator, since at the time the decorator runs, the functions taste and ham don't exist. The same would apply to a "where" clause (or whatever name it is given). This forces the doc tests to be more tightly focused on the function in isolation, which would be both good and bad. 
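A minimal version of check might be spelled something like this (a rough sketch on top of
the doctest introspection machinery; real error reporting would need more care):

import doctest
import sys

def check(func):
    # Gather the doctests from func's docstring and run them immediately,
    # against the module namespace as it exists at this point.
    globs = vars(sys.modules[func.__module__])
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner()
    for test in finder.find(func, func.__name__, globs=globs):
        runner.run(test)
    if runner.failures:
        raise AssertionError('doctests failed for %s' % func.__name__)
    return func
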
The good is that it would discourage overloading the docstring with too many too complex tests. The bad is that it would limit what can be tested this way, and the errors would no doubt be confusing to beginners. "What do you mean NameError? taste is defined right there..." On balance, I think that such a check() decorator would be useful enough that it's worth my time writing it. (Perhaps not useful enough to include in doctest, but I'll certainly stick it in my own personal toolbox.) But I don't think it would be useful enough to make it syntax. -- Steven From apieum at gmail.com Mon Nov 11 12:20:57 2013 From: apieum at gmail.com (Gregory Salvan) Date: Mon, 11 Nov 2013 12:20:57 +0100 Subject: [Python-ideas] Extending language syntax Message-ID: Hi all, I've just joined you, I hope I'll not make you loose your time with questions already covered. Sorry, if my suggest represents lot of work (certainly too much) and about my lack of knowledge about internal implementation, but I hope opening a discussion can help. I would suggest to introduce a new statement to provide a way to extend language syntax. Abstract: While reading this article about None historyI realised I was not alone to wonder about some aspect of the language. Some of these are: - constant and immutability - null object and "None is a singleton" - readability of code *Constant:* It's a convention, no matter with that. My question is more if there is no memory overuse to have fixed values that behave as variable. *Null Object:* NoneType is not overridable. Having null objects instance of NoneType wouldn't make sense ? Can't assign keywords so "None is a singleton". Wouldn't it be more simple to have "keyword" type and be able to assign a keyword to a variable ? With an object hierarchy like: keyword <- ReservedKeyword I find it can make clearer why some keywords are reserved without changing actual behaviour. *Readability of code:* I have shouldDsl use case in mind. The introduction illustrate my thought: "*The goal of Should-DSL is to write should expectations in Python as clear and readable as possible, using an ?almost? natural language (limited by some Python language?s constraints).*" I find ShouldDsl perturbing, as the aim is laudable but the implementation stays a hack of the language. I see the influence of functional paradigm in this syntax and agreed with the fact it becomes a "must have". I love Python syntax and find, it has very few limits, but maybe it can be improved by reducing some constraints. One concerns in extending syntax is that the language can evolve on users usage. I see it like an open laboratory where developpers experiments new keywords like "should" implemented in their favorite language and when this keyword is largely accepted by the community it can finally integrate the core. *A solution*: I was first thinking of a statement like "Instance" to declare an immutable object with a single instance (like None), but I then consider how it would be great to be able to extend the syntax. I thought the keyword "Keyword", which exists in clojure for example, can do the job. 
A first implementation can be: Keyword (): field = # fixed value: self.field = "something" will raise an error def behaviour(self, *args, **kwargs): # do something def __get__(self, left, right, block_content): # what to do when accessing keyword Inline implementation: Keyword () *Examples of use:* *Constant*: Keyword Pi(3.14159) *Null Object* and immutable objects: Class Person(object): def __init__(self, name): self.name = name JohnDoe = Person(name="anonymous") Keyword Anonymous(None, JohnDoe) Anonymous.name = "JohnDoe" # raise an error assert isinstance(Anonymous, type(None)) # is true As Anonymous is immutable , it is also thread safe. The mecanism which consist of freezing an object can be applied in other situation. An interesting approach can be to define keywords identity throught a hash of their value, like value object. *Should keyword*: Keyword ShouldEqual(keyword): def __get__(self, left, right, block): if left != right: raise AssertError() ### weird examples: *Multiline Lambdas*: Keyword MultilineLambda(keyword): def __get__(self, left, right, block_content): def multiline_lambda(*args, **kwargs): self.assert_args_valid(right, args, kwargs) # compare "right" (tuple) with args and raise exception in case not valid return self.run(block_content, args, kwargs) # "run" is a shortcut because I don't know which type is the most appropriate for block content return multiline_lambda divide = MultilineLambda a, b: if b == 0: raise MyDivideByZeroError() return a/b *Reimplementing else*: Keyword otherwise(keyword): def __get__(self, left, right, block): if left: return # left is the result of "if" block self.run(block) if something: # do something otherwise: # other Whereas this example is weird, it seems to me that we can better figure out what language do. Thanks all to have kept python open and give us opportunities to freely discuss about its future implementation. Have a nice day, Gr?gory -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Mon Nov 11 20:03:18 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 11 Nov 2013 11:03:18 -0800 Subject: [Python-ideas] Extending language syntax In-Reply-To: References: Message-ID: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> I could go through this point by point, but I think I can head it all off by explaining what I think is a fundamental misconception that's misleading you. In Python, variables aren't "memory locations" where values live, they're just names that are bound to values that live somewhere on their own. Rebinding a variable to a different value doesn't mutate anything. You can't make variables refer to other variables, and making two variables refer to the same value doesn't have the same referential transparency, thread safety, etc. issues as making a variable refer to another variable. For example, in the following code: me = Person('Andrew Barnert') andrew = me me = Person('Dr. Sam Beckett') ... nothing has been mutated. In particular, the andrew variable is unchanged; it's still referring to the same Person('Andrew Barnert') object as before, not the new one. If I make the Person type mutable, and call a mutating method on it (or do so implicitly, by setting an attribute or a keyed or indexed member), then of course I can change the value, which will be visible to all variables referring to that value. But assignment does not do that. So, you already have almost everything you want. 
You can create new immutable types and global singleton values of those types. Or create Enums. Or just create singleton objects whose type doesn't matter, which are immutable and compared by identity, just by calling object(). All of this is trivial. Taking one of your examples: Pi = 3.14159 ... gives you everything you wanted, except for the fact that you can accidentally or maliciously rebind it to a new value. It's already an immutable, thread-safe constant. You can even make a variable name un-rebindable with a module import hook that produces a custom globals, if you really want to, although I can't imagine that ever being worth doing. The only additional thing a keyword gives you is that the compiler can prevent you from rebinding the name to a different value, at compile time rather than run time. That's it. Meanwhile, the disadvantage of allowing new keywords to be defined for immutable constants is huge. You could no longer compile, or even parse, any code without checking each token against the current runtime environment. That makes the parser slower and more complicated, introduces a dependency that makes it hard to keep the components separate, makes pyc files and marshal/pickle and so on useless, makes it much harder for code tools to use the ast module, etc. All for a very tiny benefit. Sent from a random iPhone On Nov 11, 2013, at 3:20, Gregory Salvan wrote: > Hi all, > I've just joined you, I hope I'll not make you loose your time with questions already covered. > Sorry, if my suggest represents lot of work (certainly too much) and about my lack of knowledge about internal implementation, but I hope opening a discussion can help. > > I would suggest to introduce a new statement to provide a way to extend language syntax. > > Abstract: > While reading this article about None history I realised I was not alone to wonder about some aspect of the language. > > Some of these are: > - constant and immutability > - null object and "None is a singleton" > - readability of code > > Constant: > It's a convention, no matter with that. > My question is more if there is no memory overuse to have fixed values that behave as variable. > > Null Object: > NoneType is not overridable. > Having null objects instance of NoneType wouldn't make sense ? > Can't assign keywords so "None is a singleton". > Wouldn't it be more simple to have "keyword" type and be able to assign a keyword to a variable ? > With an object hierarchy like: keyword <- ReservedKeyword > I find it can make clearer why some keywords are reserved without changing actual behaviour. > > Readability of code: > I have shouldDsl use case in mind. > The introduction illustrate my thought: > "The goal of Should-DSL is to write should expectations in Python as clear and readable as possible, using an ?almost? natural language (limited by some Python language?s constraints)." > > I find ShouldDsl perturbing, as the aim is laudable but the implementation stays a hack of the language. > I see the influence of functional paradigm in this syntax and agreed with the fact it becomes a "must have". > > I love Python syntax and find, it has very few limits, but maybe it can be improved by reducing some constraints. > > One concerns in extending syntax is that the language can evolve on users usage. > I see it like an open laboratory where developpers experiments new keywords like "should" implemented in their favorite language and when this keyword is largely accepted by the community it can finally integrate the core. 
> > A solution: > I was first thinking of a statement like "Instance" to declare an immutable object with a single instance (like None), but I then consider how it would be great to be able to extend the syntax. > I thought the keyword "Keyword", which exists in clojure for example, can do the job. > > A first implementation can be: > > Keyword (): > field = # fixed value: self.field = "something" will raise an error > def behaviour(self, *args, **kwargs): > # do something > def __get__(self, left, right, block_content): > # what to do when accessing keyword > > Inline implementation: > Keyword () > > > Examples of use: > > Constant: > Keyword Pi(3.14159) > > Null Object and immutable objects: > Class Person(object): > def __init__(self, name): > self.name = name > > JohnDoe = Person(name="anonymous") > > Keyword Anonymous(None, JohnDoe) > > Anonymous.name = "JohnDoe" # raise an error > assert isinstance(Anonymous, type(None)) # is true > > > As Anonymous is immutable , it is also thread safe. > The mecanism which consist of freezing an object can be applied in other situation. > > An interesting approach can be to define keywords identity throught a hash of their value, like value object. > > Should keyword: > > Keyword ShouldEqual(keyword): > def __get__(self, left, right, block): > if left != right: > raise AssertError() > > ### weird examples: > > Multiline Lambdas: > Keyword MultilineLambda(keyword): > def __get__(self, left, right, block_content): > def multiline_lambda(*args, **kwargs): > self.assert_args_valid(right, args, kwargs) # compare "right" (tuple) with args and raise exception in case not valid > return self.run(block_content, args, kwargs) # "run" is a shortcut because I don't know which type is the most appropriate for block content > return multiline_lambda > > divide = MultilineLambda a, b: > if b == 0: > raise MyDivideByZeroError() > return a/b > > Reimplementing else: > Keyword otherwise(keyword): > def __get__(self, left, right, block): > if left: return # left is the result of "if" block > self.run(block) > > if something: > # do something > otherwise: > # other > > Whereas this example is weird, it seems to me that we can better figure out what language do. > > > Thanks all to have kept python open and give us opportunities to freely discuss about its future implementation. > > Have a nice day, > Gr?gory > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From xuancong84 at gmail.com Tue Nov 12 04:45:30 2013 From: xuancong84 at gmail.com (Xuancong Wang) Date: Tue, 12 Nov 2013 11:45:30 +0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 Message-ID: Hi python developers, I notice that one major change in python 3 is that it makes 'print' as a standard function, and it will require typing (). I do understand that it makes python language more consistent because most of the python functionalities are implemented as function calls. As you know, reading from and writing to IO is a high frequency operation. By entropy coding theorem (e.g. Huffman coding), an efficient language should assign shorter language code to more frequent tasks. Typing a '(' requires holding SHIFT and pressing 9, the input effort is much higher than that in Python 2. Also, specifying IO has changed from >>* to file=*, which also becomes more inconvenient. 
I hope you we can take a look at user's code and see what are the most commonly used functions and try to shorten the language codes for those functions. Assigning shortest language codes to most frequently used functions will make python the best programming language in the world. What I suggest is that either we can treat those most common functions like print as a special command to avoid typing (), or we add alias to these most common functions, so that if the user types: print >>fp1, 'hello world' Internally, it is equivalent to calling function print('hello world', file=fp1) Another suggestion is that 'enumerate' is also frequently used, hopefully we can shorten the command as well. Any comments? Wang Xuancong National University of Singapore -------------- next part -------------- An HTML attachment was scrubbed... URL: From dreamingforward at gmail.com Tue Nov 12 04:56:31 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 11 Nov 2013 19:56:31 -0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: Message-ID: On Mon, Nov 11, 2013 at 7:45 PM, Xuancong Wang wrote: > As you know, reading from and writing to IO is a high frequency operation. > By entropy coding theorem (e.g. Huffman coding), an efficient language > should assign shorter language code to more frequent tasks. Typing a '(' > requires holding SHIFT and pressing 9, the input effort is much higher than > that in Python 2. Also, specifying IO has changed from >>* to file=*, which > also becomes more inconvenient. An interesting methodology; however, I think PERL already has conquered this corner of the language world. -- MarkJ Tacoma, Washington From rosuav at gmail.com Tue Nov 12 05:07:21 2013 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 12 Nov 2013 15:07:21 +1100 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: Message-ID: On Tue, Nov 12, 2013 at 2:45 PM, Xuancong Wang wrote: > As you know, reading from and writing to IO is a high frequency operation. > By entropy coding theorem (e.g. Huffman coding), an efficient language > should assign shorter language code to more frequent tasks. Typing a '(' > requires holding SHIFT and pressing 9, the input effort is much higher than > that in Python 2. Also, specifying IO has changed from >>* to file=*, which > also becomes more inconvenient. Yes, I/O is a very common operation; but not all I/O gets its own statement. Python doesn't have a keyword for reading input (REXX, for instance, has 'say' for output and 'pull' for input), and admittedly output IS more common than input, but I still don't see that having a keyword for one of them is really beneficial. Advantages of print being a function: * You can override it. Can't do that with a special language element. * It can be used in map, lambda, and other expression contexts. * Precedence etc follows the normal rules of functions - the arguments are all tidily enclosed. * Keyword arguments, rather than magical syntax, handle the oddities like end=" ". * You can easily alias it: "p = print; p('Hello, world!')" Advantages of print being a statement: * You don't have to hit Shift to create console output. Parsimony is important in design. Why have a statement when it can just be a built-in? Every keyword needs to justify itself, and far more than just "make this faster to type". Console output is common, but not half as common as, say, assignment, which is why assignment gets an extremely short and easy-to-type token. 
Basic arithmetic gets operators; "square root" doesn't, it just gets a named function. The less the language has to do (and the more the standard library does), the easier the language is to grok. >From the Zen of Python: Special cases aren't special enough to break the rules. The rule is that function-like operations get called as functions. Maybe this would be a small benefit, but the cost of special-casing print is high. Anyway, you already get a single function name for it - it's not "sys.stdout.write(...)" - so it's already been given quite a boost in readability :) ChrisA From apieum at gmail.com Tue Nov 12 06:03:12 2013 From: apieum at gmail.com (Gregory Salvan) Date: Tue, 12 Nov 2013 06:03:12 +0100 Subject: [Python-ideas] Extending language syntax In-Reply-To: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> Message-ID: Thank you. You're almost right, but there is misunderstanding. I suggested generating value object by freezing object states and to have object identities defined on their values instead of their memory allocation. so : pi1 = keyword(3.14) pi2 = keyword(3.14) assert pi1 is pi2 # is true and: you = Person('Andrew Barnert') andrew = keyword(you) andrew.name = 'Dr. Sam Beckett' # will raise an error. It is value object not singleton because pi1 and pi2 are not of the same instance. This matter when doing concurrency but this is not the point as you can cheat with "repr" or "hash". I would focus mainly on how to extend language syntax and permit writing code like: 'abc' should_equal 'abc' Value object seems to me necessary to avoid side effects. I miss your point about marshal/pickle and will dig it to understand. I hope in future I can be more relevant. Sorry for the inconvenience, thanks for taking the time to answer, good continuation. 2013/11/11 Andrew Barnert > I could go through this point by point, but I think I can head it all off > by explaining what I think is a fundamental misconception that's misleading > you. > > In Python, variables aren't "memory locations" where values live, they're > just names that are bound to values that live somewhere on their own. > Rebinding a variable to a different value doesn't mutate anything. You > can't make variables refer to other variables, and making two variables > refer to the same value doesn't have the same referential > transparency, thread safety, etc. issues as making a variable refer to > another variable. For example, in the following code: > > me = Person('Andrew Barnert') > andrew = me > me = Person('Dr. Sam Beckett') > > ... nothing has been mutated. In particular, the andrew variable is > unchanged; it's still referring to the same Person('Andrew Barnert') object > as before, not the new one. > > If I make the Person type mutable, and call a mutating method on it (or do > so implicitly, by setting an attribute or a keyed or indexed member), then > of course I can change the value, which will be visible to all variables > referring to that value. But assignment does not do that. > > So, you already have almost everything you want. You can create new > immutable types and global singleton values of those types. Or create > Enums. Or just create singleton objects whose type doesn't matter, which > are immutable and compared by identity, just by calling object(). All of > this is trivial. > > Taking one of your examples: > > Pi = 3.14159 > > ... 
gives you everything you wanted, except for the fact that you can > accidentally or maliciously rebind it to a new value. It's already an > immutable, thread-safe constant. > > You can even make a variable name un-rebindable with a module import hook > that produces a custom globals, if you really want to, although I can't > imagine that ever being worth doing. > > The only additional thing a keyword gives you is that the compiler can > prevent you from rebinding the name to a different value, at compile time > rather than run time. That's it. > > Meanwhile, the disadvantage of allowing new keywords to be defined for > immutable constants is huge. You could no longer compile, or even parse, > any code without checking each token against the current runtime > environment. That makes the parser slower and more complicated, introduces > a dependency that makes it hard to keep the components separate, makes pyc > files and marshal/pickle and so on useless, makes it much harder for code > tools to use the ast module, etc. All for a very tiny benefit. > > Sent from a random iPhone > > On Nov 11, 2013, at 3:20, Gregory Salvan wrote: > > Hi all, > I've just joined you, I hope I'll not make you loose your time with > questions already covered. > Sorry, if my suggest represents lot of work (certainly too much) and about > my lack of knowledge about internal implementation, but I hope opening a > discussion can help. > > I would suggest to introduce a new statement to provide a way to extend > language syntax. > > Abstract: > While reading this article about None historyI realised I was not alone to wonder about some aspect of the language. > > Some of these are: > - constant and immutability > - null object and "None is a singleton" > - readability of code > > *Constant:* > It's a convention, no matter with that. > My question is more if there is no memory overuse to have fixed values > that behave as variable. > > > *Null Object:* > NoneType is not overridable. > Having null objects instance of NoneType wouldn't make sense ? > Can't assign keywords so "None is a singleton". > Wouldn't it be more simple to have "keyword" type and be able to assign a > keyword to a variable ? > With an object hierarchy like: keyword <- ReservedKeyword > I find it can make clearer why some keywords are reserved without changing > actual behaviour. > > > *Readability of code:* > I have shouldDsl use case in mind. > The introduction illustrate my thought: > "*The goal of Should-DSL is to write should expectations in Python as > clear and readable as possible, using an ?almost? natural language (limited > by some Python language?s constraints).*" > > I find ShouldDsl perturbing, as the aim is laudable but the implementation > stays a hack of the language. > I see the influence of functional paradigm in this syntax and agreed with > the fact it becomes a "must have". > > I love Python syntax and find, it has very few limits, but maybe it can be > improved by reducing some constraints. > > One concerns in extending syntax is that the language can evolve on users > usage. > I see it like an open laboratory where developpers experiments new > keywords like "should" implemented in their favorite language and when this > keyword is largely accepted by the community it can finally integrate the > core. > > *A solution*: > I was first thinking of a statement like "Instance" to declare an > immutable object with a single instance (like None), but I then consider > how it would be great to be able to extend the syntax. 
> I thought the keyword "Keyword", which exists in clojure for example, > can do the job. > > A first implementation can be: > > Keyword (): > field = # fixed value: self.field = "something" will raise > an error > def behaviour(self, *args, **kwargs): > # do something > def __get__(self, left, right, block_content): > # what to do when accessing keyword > > Inline implementation: > Keyword () > > > *Examples of use:* > > *Constant*: > Keyword Pi(3.14159) > > *Null Object* and immutable objects: > Class Person(object): > def __init__(self, name): > self.name = name > > JohnDoe = Person(name="anonymous") > > Keyword Anonymous(None, JohnDoe) > > Anonymous.name = "JohnDoe" # raise an error > assert isinstance(Anonymous, type(None)) # is true > > > As Anonymous is immutable , it is also thread safe. > The mecanism which consist of freezing an object can be applied in other > situation. > > An interesting approach can be to define keywords identity throught a hash > of their value, like value object. > > *Should keyword*: > > Keyword ShouldEqual(keyword): > def __get__(self, left, right, block): > if left != right: > raise AssertError() > > ### weird examples: > > *Multiline Lambdas*: > Keyword MultilineLambda(keyword): > def __get__(self, left, right, block_content): > def multiline_lambda(*args, **kwargs): > self.assert_args_valid(right, args, kwargs) # compare "right" > (tuple) with args and raise exception in case not valid > return self.run(block_content, args, kwargs) # "run" is a shortcut > because I don't know which type is the most appropriate for block content > return multiline_lambda > > divide = MultilineLambda a, b: > if b == 0: > raise MyDivideByZeroError() > return a/b > > *Reimplementing else*: > Keyword otherwise(keyword): > def __get__(self, left, right, block): > if left: return # left is the result of "if" block > self.run(block) > > if something: > # do something > otherwise: > # other > > Whereas this example is weird, it seems to me that we can better figure > out what language do. > > > Thanks all to have kept python open and give us opportunities to freely > discuss about its future implementation. > > Have a nice day, > Gr?gory > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xuancong84 at gmail.com Tue Nov 12 06:03:40 2013 From: xuancong84 at gmail.com (Xuancong Wang) Date: Tue, 12 Nov 2013 13:03:40 +0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 Message-ID: >An interesting methodology; however, I think PERL already has >conquered this corner of the language world. I think Perl has much lower efficiency than python, especially you need to type a $ before every variable. The input effort of $ is very high because you need to press shift. Also, you need to type {} for every function/for/while structure. In terms of language efficiency, I think Perl is no comparison to Python. We can roughly estimate the input effort of every key in the following way: Normal alphabet keys: effort=1 Numbers 0~9: effort=1.2 Shift/Tab: effort=0.6 Ctrl/Alt: effort=0.8 {}[];'\,./-=: effort=1.2 (effort measures how difficult it is to press the key) Therefore, any composed keys like shift+9='(', the input effort is 0.6+1.2=1.8 that's why we should try to avoid composed keys if it's not necessary. >>"Advantages of print being a function: * You can override it. 
Can't do that with a special language element. * It can be used in map, lambda, and other expression contexts. * Precedence etc follows the normal rules of functions - the arguments are all tidily enclosed. * Keyword arguments, rather than magical syntax, handle the oddities like end=" ". * You can easily alias it: "p = print; p('Hello, world!')" I do agree that print should remain as a function logically. But is there a way to make it as simple as in python 2, or even simpler, for example: pr >>sys.stderr, 'hello world' xuancong On Tue, Nov 12, 2013 at 11:56 AM, Mark Janssen wrote: > On Mon, Nov 11, 2013 at 7:45 PM, Xuancong Wang > wrote: > > As you know, reading from and writing to IO is a high frequency > operation. > > By entropy coding theorem (e.g. Huffman coding), an efficient language > > should assign shorter language code to more frequent tasks. Typing a '(' > > requires holding SHIFT and pressing 9, the input effort is much higher > than > > that in Python 2. Also, specifying IO has changed from >>* to file=*, > which > > also becomes more inconvenient. > > An interesting methodology; however, I think PERL already has > conquered this corner of the language world. > -- > MarkJ > Tacoma, Washington > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian at python.org Tue Nov 12 06:23:19 2013 From: brian at python.org (Brian Curtin) Date: Mon, 11 Nov 2013 23:23:19 -0600 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: Message-ID: On Mon, Nov 11, 2013 at 11:03 PM, Xuancong Wang wrote: > I do agree that print should remain as a function logically. But is there a > way to make it as simple as in python 2, or even simpler, for example: > pr >>sys.stderr, 'hello world' It cannot get any more simple than doing what it is named to do. print prints. Anything else is not readable. From rosuav at gmail.com Tue Nov 12 06:27:20 2013 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 12 Nov 2013 16:27:20 +1100 Subject: [Python-ideas] Extending language syntax In-Reply-To: References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> Message-ID: On Tue, Nov 12, 2013 at 4:03 PM, Gregory Salvan wrote: > I suggested generating value object by freezing object states and to have > object identities defined on their values instead of their memory > allocation. I can see some potential in this if there were a way to say "This will never change, don't refcount it or GC-check it"; that might improve performance across a fork (or across threads), but it'd take a lot of language support. Effectively, you would be forfeiting the usual GC memory saving "this isn't needed, get rid of it" and fixing it in memory someplace. The question would be: Is the saving of not writing to that memory (updating refcounts, or marking for a mark/sweep GC, or whatever the equivalent is for each Python implementation) worth the complexity of checking every object to see if it's a frozen one? ChrisA From guido at python.org Tue Nov 12 06:27:55 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 11 Nov 2013 21:27:55 -0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: Message-ID: You probably ought to weigh the ease of typing against readability. Which is more important? By how much? If 10% more effort typing makes for 5% better readability, but the code is read 5 times as often as it is edited, is that then worth it? Separately, have you measured typing speed rigorously? 
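As a concrete illustration of the weights proposed above, they can be folded into a tiny scoring function. This is only a sketch, not a rigorous measurement: the function itself is made up, and the cost assumed for the space key (and for any character the posted table does not mention) is an assumption, not part of the original proposal.

# Illustrative only: per-key "effort" weights as posted in this thread.
def effort(snippet):
    total = 0.0
    for ch in snippet:
        if ch in '()':                      # composed key: Shift + a number key
            total += 0.6 + 1.2
        elif ch.isdigit():
            total += 1.2
        elif ch in "{}[];'\\,./-=":
            total += 1.2
        elif ch.isalpha() and ch.islower():
            total += 1.0
        else:
            total += 1.0                    # assumption: space and anything else cost 1
    return total

print(effort("print x"))    # 7.0
print(effort("print(x)"))   # 9.6

By this crude measure the parenthesised call costs roughly a third more keystroke effort than the old statement for a one-argument print, which is the gap being weighed against readability in this thread.
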
On Monday, November 11, 2013, Xuancong Wang wrote: > >An interesting methodology; however, I think PERL already has > >conquered this corner of the language world. > > I think Perl has much lower efficiency than python, especially you need to > type a $ before every variable. The input effort of $ is very high because > you need to press shift. > > Also, you need to type {} for every function/for/while structure. > > In terms of language efficiency, I think Perl is no comparison to Python. > > We can roughly estimate the input effort of every key in the following way: > Normal alphabet keys: effort=1 > Numbers 0~9: effort=1.2 > Shift/Tab: effort=0.6 > Ctrl/Alt: effort=0.8 > {}[];'\,./-=: effort=1.2 > (effort measures how difficult it is to press the key) > Therefore, any composed keys like shift+9='(', the input effort is > 0.6+1.2=1.8 > that's why we should try to avoid composed keys if it's not necessary. > > >>"Advantages of print being a function: > * You can override it. Can't do that with a special language element. > * It can be used in map, lambda, and other expression contexts. > * Precedence etc follows the normal rules of functions - the arguments > are all tidily enclosed. > * Keyword arguments, rather than magical syntax, handle the oddities > like end=" ". > * You can easily alias it: "p = print; p('Hello, world!')" > > I do agree that print should remain as a function logically. But is there > a way to make it as simple as in python 2, or even simpler, for example: > pr >>sys.stderr, 'hello world' > > xuancong > > > On Tue, Nov 12, 2013 at 11:56 AM, Mark Janssen > > wrote: > >> On Mon, Nov 11, 2013 at 7:45 PM, Xuancong Wang > >> wrote: >> > As you know, reading from and writing to IO is a high frequency >> operation. >> > By entropy coding theorem (e.g. Huffman coding), an efficient language >> > should assign shorter language code to more frequent tasks. Typing a '(' >> > requires holding SHIFT and pressing 9, the input effort is much higher >> than >> > that in Python 2. Also, specifying IO has changed from >>* to file=*, >> which >> > also becomes more inconvenient. >> >> An interesting methodology; however, I think PERL already has >> conquered this corner of the language world. >> -- >> MarkJ >> Tacoma, Washington >> > > -- --Guido van Rossum (on iPad) -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Tue Nov 12 06:31:56 2013 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 12 Nov 2013 16:31:56 +1100 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: Message-ID: On Tue, Nov 12, 2013 at 4:23 PM, Brian Curtin wrote: > On Mon, Nov 11, 2013 at 11:03 PM, Xuancong Wang wrote: >> I do agree that print should remain as a function logically. But is there a >> way to make it as simple as in python 2, or even simpler, for example: >> pr >>sys.stderr, 'hello world' > > It cannot get any more simple than doing what it is named to do. print > prints. Anything else is not readable. It's funny how many words we have that all quite validly describe the concept of producing console output. C calls it "print" (printf), or "write"; Python calls it "print"; REXX opts for "say"; C++ goes for "cout" (console output)... 
yet all of them are very clear, in their own way, and nobody expects Python to create hard copies or REXX to use the speaker :) ChrisA From xuancong84 at gmail.com Tue Nov 12 06:52:51 2013 From: xuancong84 at gmail.com (Xuancong Wang) Date: Tue, 12 Nov 2013 13:52:51 +0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 Message-ID: >You probably ought to weigh the ease of typing against readability. Which is more important? By how much? As people are used to python2, I don't think changing print to print() will improve code readability by how much, especially now VIM already highlights the print command in python. Probably you don't use print in python quite often, so you may not sense the difference in typing brackets. If you ever used C/C++/perl frequently, you should sense the inconvenience caused by brackets in loop structures. >Separately, have you measured typing speed rigorously? The typing effort varies from person to person, from beginners to experts, maybe you are too expert to sense the difference which is sensitive to beginners. There's no rule-of-thumb and it's not just a speed issue. It's also related to energy spent by your fingers. So obviously, pressing 2 keys, you need to spend at least twice the amount of energy. And even more if you need to hold down 1 key. Anyway, I am just introducing an alternative form to the print() function. As long as this does not reduce the performance of the interpreter significantly, it should not harm. And it should also improve the downward compatibility. On Tue, Nov 12, 2013 at 1:27 PM, Guido van Rossum wrote: > You probably ought to weigh the ease of typing against readability. Which > is more important? By how much? If 10% more effort typing makes for 5% > better readability, but the code is read 5 times as often as it is edited, > is that then worth it? > > Separately, have you measured typing speed rigorously? > > On Monday, November 11, 2013, Xuancong Wang wrote: > >> >An interesting methodology; however, I think PERL already has >> >conquered this corner of the language world. >> >> I think Perl has much lower efficiency than python, especially you need >> to type a $ before every variable. The input effort of $ is very high >> because you need to press shift. >> >> Also, you need to type {} for every function/for/while structure. >> >> In terms of language efficiency, I think Perl is no comparison to Python. >> >> We can roughly estimate the input effort of every key in the following >> way: >> Normal alphabet keys: effort=1 >> Numbers 0~9: effort=1.2 >> Shift/Tab: effort=0.6 >> Ctrl/Alt: effort=0.8 >> {}[];'\,./-=: effort=1.2 >> (effort measures how difficult it is to press the key) >> Therefore, any composed keys like shift+9='(', the input effort is >> 0.6+1.2=1.8 >> that's why we should try to avoid composed keys if it's not necessary. >> >> >>"Advantages of print being a function: >> * You can override it. Can't do that with a special language element. >> * It can be used in map, lambda, and other expression contexts. >> * Precedence etc follows the normal rules of functions - the arguments >> are all tidily enclosed. >> * Keyword arguments, rather than magical syntax, handle the oddities >> like end=" ". >> * You can easily alias it: "p = print; p('Hello, world!')" >> >> I do agree that print should remain as a function logically. 
But is there >> a way to make it as simple as in python 2, or even simpler, for example: >> pr >>sys.stderr, 'hello world' >> >> xuancong >> >> >> On Tue, Nov 12, 2013 at 11:56 AM, Mark Janssen > > wrote: >> >>> On Mon, Nov 11, 2013 at 7:45 PM, Xuancong Wang >>> wrote: >>> > As you know, reading from and writing to IO is a high frequency >>> operation. >>> > By entropy coding theorem (e.g. Huffman coding), an efficient language >>> > should assign shorter language code to more frequent tasks. Typing a >>> '(' >>> > requires holding SHIFT and pressing 9, the input effort is much higher >>> than >>> > that in Python 2. Also, specifying IO has changed from >>* to file=*, >>> which >>> > also becomes more inconvenient. >>> >>> An interesting methodology; however, I think PERL already has >>> conquered this corner of the language world. >>> -- >>> MarkJ >>> Tacoma, Washington >>> >> >> > > -- > --Guido van Rossum (on iPad) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Tue Nov 12 07:44:03 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 12 Nov 2013 19:44:03 +1300 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: Message-ID: <5281CE33.3070203@canterbury.ac.nz> Xuancong Wang wrote: > As you know, reading from and writing to IO is a high frequency > operation. Actually, no, I don't know that. Thinking about the code I write, the proportion of it devoted to textual output is probably less than 1%, often far less. So by your argument it should have a rather verbose encoding! -- Greg From steve at pearwood.info Tue Nov 12 08:14:04 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 12 Nov 2013 18:14:04 +1100 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: Message-ID: <20131112071404.GA18971@ando> On Tue, Nov 12, 2013 at 04:31:56PM +1100, Chris Angelico wrote: > It's funny how many words we have that all quite validly describe the > concept of producing console output. C calls it "print" (printf), or > "write"; Python calls it "print"; REXX opts for "say"; C++ goes for > "cout" (console output)... yet all of them are very clear, in their > own way, and nobody expects Python to create hard copies or REXX to > use the speaker :) I don't know about that. Expecting print to generate hardcopy output is something which beginners to programming often need to unlearn. If print doesn't print, which command do you use to actually print? As for REXX, I expected "say" to use the speaker. That's what the "say" command does in Applescript, for example. -- Steven From rosuav at gmail.com Tue Nov 12 08:27:37 2013 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 12 Nov 2013 18:27:37 +1100 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: <20131112071404.GA18971@ando> References: <20131112071404.GA18971@ando> Message-ID: On Tue, Nov 12, 2013 at 6:14 PM, Steven D'Aprano wrote: > I don't know about that. Expecting print to generate hardcopy output is > something which beginners to programming often need to unlearn. If print > doesn't print, which command do you use to actually print? Why, LPRINT of course! Yes, I was tainted by BASIC in my early days. 
ChrisA From steve at pearwood.info Tue Nov 12 08:30:57 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 12 Nov 2013 18:30:57 +1100 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: <5281CE33.3070203@canterbury.ac.nz> References: <5281CE33.3070203@canterbury.ac.nz> Message-ID: <20131112073057.GB18971@ando> On Tue, Nov 12, 2013 at 07:44:03PM +1300, Greg Ewing wrote: > Xuancong Wang wrote: > >As you know, reading from and writing to IO is a high frequency > >operation. > > Actually, no, I don't know that. Thinking about the code I write, > the proportion of it devoted to textual output is probably less > than 1%, often far less. So by your argument it should have > a rather verbose encoding! I agree strongly! print is simply not important enough to dedicated compiler magic to have it behave differently from every other function. I just picked one random project of mine, and found 23 instances of print in 1644 lines of code, or just under 1.4% of lines. In another project, I had zero instances of print in 2149 lines. In another project, I had 11 instances of the string print, but only 1 was the print statement (the rest were in documentation), out of 791 lines, approximately 0.1%. Consistency is far more important than saving a few key strokes. If print magically treats parentheses as optional, so that these are the same: print a, b, c, sep="**" print(a, b, c, sep="**") people will be confused why print is so special and they can't do the same with any arbitrary function: result = 1 + my_function a, b, c, keyword_argument=42 And of course, there is the problem that leaving out the parentheses makes it ambiguous in Python 3. Does this: my_tuple = (100, 200, print 42, "hello") print "42" and generate the tuple (100, 200, None, "hello"), or does it print "42 hello" and generate the tuple (100, 200, None)? -- Steven From stephen at xemacs.org Tue Nov 12 08:38:13 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 12 Nov 2013 16:38:13 +0900 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: Message-ID: <878uwuhxy2.fsf@uwakimon.sk.tsukuba.ac.jp> Xuancong Wang writes: > I notice that one major change in python 3 is that it makes > 'print' as a standard function, and it will require typing (). What makes you think typing "()" is required? True, it needs to be present in the file, but typing is not necessarily required. Eg, if you use Emacs, you can define an abbreviation such that pr SPC automatically produces "print()"; with a little more effort you could define a "skeleton" which positions the cursor between the parentheses. I don't know if IDLE or WingIDE will do this for you, but certainly you could add such a feature to IDLE. I imagine other editors have similar possibilities. If there's really so much demand for it, I imagine Emacs's python-mode and IDLE at least will add those features. This kind of technology is getting quite advanced; predictive input methods are essential in Japanese, and I would suppose in Chinese as well. Spoken input is now a proven technology. I really don't think it's a good idea to argue about typing efficiency in this day and age; the changes you suggest are likely to be obsolete within a few years. 
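There is also a pure-library answer to the same wish, since print is now an ordinary first-class object: most of the old brevity can be recovered with plain bindings. A minimal sketch using only the standard library; the names p and err are arbitrary examples, not a proposed convention:

import sys
from functools import partial

p = print                               # short alias, handy in the REPL
err = partial(print, file=sys.stderr)   # rough stand-in for "print >>sys.stderr, ..."

p('hello world')      # goes to stdout
err('hello world')    # goes to stderr, no special syntax required
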
From techtonik at gmail.com Tue Nov 12 09:45:08 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Tue, 12 Nov 2013 11:45:08 +0300 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: <20131112073057.GB18971@ando> References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> Message-ID: On Tue, Nov 12, 2013 at 10:30 AM, Steven D'Aprano wrote: > On Tue, Nov 12, 2013 at 07:44:03PM +1300, Greg Ewing wrote: >> Xuancong Wang wrote: >> >As you know, reading from and writing to IO is a high frequency >> >operation. >> >> Actually, no, I don't know that. Thinking about the code I write, >> the proportion of it devoted to textual output is probably less >> than 1%, often far less. So by your argument it should have >> a rather verbose encoding! > > I agree strongly! print is simply not important enough to dedicated > compiler magic to have it behave differently from every other function. ... Although practicality beats purity. ... I bet that most people here started with Python at time when UX as a definition was unknown, but things that resulted in good UX were called 'pythonic'. http://www.youtube.com/watch?v=Ovj4hFxko7c Now about strong agreement from Steven. I believe this agreement is strictly theoretical. You know - "In theory, practice and theory are the same, but in practice..." - of this type. If we leave the discussion of "dedicated compiler magic" and "proper editor" aside and discuss only the User eXperience - don't you think that fast-typing "print var" is more convenient than "print(var)"? I believe that nobody objects against a better UX, but Steven pinpointed the conflicting point, which I'd like to stress - "dedicated compiler magic". I dislike enforced () in Py3k print as much as two other authors who raised questions about it here in the past week, but I kind of accept it existence for two reasons: 1. it is a clear mark of Python 3 compatible code 2. it makes Python interpreter interpret code faster [1], more reliable [1], less resource consuming [1] on low level platforms, such as Raspberry PI [1]: reference needed - I haven't seen any proof, except maybe a picture of difference in a parse tree (which probably got simplified, because print is now a function, and more complicated, because yield is not) Solution. My position is that I'd like to have both: 1. simple "printsomething" statement as a development aid for quick troubleshooting/debugging/hacking/writing (which may map to a function with single argument call) 2. print() function, for overriding, sending to other functions as param etc. > I just picked one random project of mine, and found 23 instances of > print in 1644 lines of code, or just under 1.4% of lines. In another > project, I had zero instances of print in 2149 lines. In another > project, I had 11 instances of the string print, but only 1 was the > print statement (the rest were in documentation), out of 791 lines, > approximately 0.1%. When researching programs, 90% of code I type are prints, which are never committed anywhere. When developing, I believe it is about 30%, because it is several orders faster to check that all inputs/outputs work correctly than to write proper test suite. So, your metrics is mostly relevant for the primary "print typing" use case that I personally find very annoying if applied to Python 3. > Consistency is far more important than saving a few key strokes. 
If > print magically treats parentheses as optional, so that these are the > same: > > print a, b, c, sep="**" > print(a, b, c, sep="**") > > people will be confused why print is so special and they can't do the > same with any arbitrary function: > > result = 1 + my_function a, b, c, keyword_argument=42 The first case is "print as statement". The second is "print as expression". People won't be confused, because 'result = print a,b,a' syntax is incorrect in the scenario I proposed. > And of course, there is the problem that leaving out the parentheses > makes it ambiguous in Python 3. Does this: > > my_tuple = (100, 200, print 42, "hello") > > print "42" and generate the tuple (100, 200, None, "hello"), or does it > print "42 hello" and generate the tuple (100, 200, None)? Will it be good if only 'my_tuple = (100, 200, print(42), "hello")' syntax is supported in expression? -- anatoly t. From techtonik at gmail.com Tue Nov 12 09:49:14 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Tue, 12 Nov 2013 11:49:14 +0300 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> Message-ID: On Tue, Nov 12, 2013 at 11:45 AM, anatoly techtonik wrote: > On Tue, Nov 12, 2013 at 10:30 AM, Steven D'Aprano wrote: >> On Tue, Nov 12, 2013 at 07:44:03PM +1300, Greg Ewing wrote: >>> Xuancong Wang wrote: >>> >As you know, reading from and writing to IO is a high frequency >>> >operation. >>> >>> Actually, no, I don't know that. Thinking about the code I write, >>> the proportion of it devoted to textual output is probably less >>> than 1%, often far less. So by your argument it should have >>> a rather verbose encoding! >> >> I agree strongly! print is simply not important enough to dedicated >> compiler magic to have it behave differently from every other function. > ... > Although practicality beats purity. > ... > > I bet that most people here started with Python at time when UX as a > definition was unknown, but things that resulted in good UX were called > 'pythonic'. http://www.youtube.com/watch?v=Ovj4hFxko7c > > Now about strong agreement from Steven. I believe this agreement is > strictly theoretical. You know - "In theory, practice and theory are the > same, but in practice..." - of this type. If we leave the discussion of > "dedicated compiler magic" and "proper editor" aside and discuss only > the User eXperience - don't you think that fast-typing "print var" is more > convenient than "print(var)"? > > I believe that nobody objects against a better UX, but Steven pinpointed > the conflicting point, which I'd like to stress - "dedicated compiler magic". > I dislike enforced () in Py3k print as much as two other authors who > raised questions about it here in the past week, but I kind of accept it > existence for two reasons: > > 1. it is a clear mark of Python 3 compatible code > 2. it makes Python interpreter interpret code faster [1], more reliable [1], > less resource consuming [1] on low level platforms, such as > Raspberry PI > > [1]: reference needed - I haven't seen any proof, except maybe a picture > of difference in a parse tree (which probably got simplified, because > print is now a function, and more complicated, because yield is not) s/yield is/yield, yield from are/ > Solution. My position is that I'd like to have both: > 1. 
simple "printsomething" statement as a development > aid for quick troubleshooting/debugging/hacking/writing (which may map > to a function with single argument call) > 2. print() function, for overriding, sending to other functions as param etc. > >> I just picked one random project of mine, and found 23 instances of >> print in 1644 lines of code, or just under 1.4% of lines. In another >> project, I had zero instances of print in 2149 lines. In another >> project, I had 11 instances of the string print, but only 1 was the >> print statement (the rest were in documentation), out of 791 lines, >> approximately 0.1%. > > When researching programs, 90% of code I type are prints, which are > never committed anywhere. When developing, I believe it is about 30%, > because it is several orders faster to check that all inputs/outputs work > correctly than to write proper test suite. > > So, your metrics is mostly relevant for the primary "print typing" use > case that I personally find very annoying if applied to Python 3. s/relevant/irrelevant/ >> Consistency is far more important than saving a few key strokes. If >> print magically treats parentheses as optional, so that these are the >> same: >> >> print a, b, c, sep="**" >> print(a, b, c, sep="**") >> >> people will be confused why print is so special and they can't do the >> same with any arbitrary function: >> >> result = 1 + my_function a, b, c, keyword_argument=42 > > The first case is "print as statement". > The second is "print as expression". > > People won't be confused, because 'result = print a,b,a' syntax is > incorrect in the scenario I proposed. > >> And of course, there is the problem that leaving out the parentheses >> makes it ambiguous in Python 3. Does this: >> >> my_tuple = (100, 200, print 42, "hello") >> >> print "42" and generate the tuple (100, 200, None, "hello"), or does it >> print "42 hello" and generate the tuple (100, 200, None)? > > Will it be good if only 'my_tuple = (100, 200, print(42), "hello")' syntax > is supported in expression? From techtonik at gmail.com Tue Nov 12 10:12:41 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Tue, 12 Nov 2013 12:12:41 +0300 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: Message-ID: On Tue, Nov 12, 2013 at 8:03 AM, Xuancong Wang wrote: > > I think Perl has much lower efficiency than python, especially you need to > type a $ before every variable. The input effort of $ is very high because > you need to press shift. > > Also, you need to type {} for every function/for/while structure. > > In terms of language efficiency, I think Perl is no comparison to Python. And even in Python you can write inefficient. One of the reasons Mercurial got awesome without all the community support that Git had, is efficient coding style that enables typing at the speed of thought. Quoting http://mercurial.selenic.com/wiki/CodingStyle?action=recall&rev=17 "in general, don't make mpm use his shift key any more than he has to" I really like the hack with applying Huffman code to human brain behavior. Speaking of Mercurial vs Git, I hate Git in command line, because of notoriously long subcommand syntax, where HG uses adaptive shortcuts. 
> We can roughly estimate the input effort of every key in the following way: > Normal alphabet keys: effort=1 > Numbers 0~9: effort=1.2 > Shift/Tab: effort=0.6 > Ctrl/Alt: effort=0.8 > {}[];'\,./-=: effort=1.2 > (effort measures how difficult it is to press the key) > Therefore, any composed keys like shift+9='(', the input effort is > 0.6+1.2=1.8 > that's why we should try to avoid composed keys if it's not necessary. The methodology metrics rocks! =) But for me it would be differently: Numbers 0~9: effort=1 Left Shift/Ctrl: effort=0.8 Alt: effort=0.5 Tab: effort=0.5 Right Shift: effort=1.2 Right Ctrl: effort=0.6 And in terms of print frequency occurrences it needs to be applied to interaction Python sessions, editor commands, not just to final source code. From abarnert at yahoo.com Tue Nov 12 10:51:52 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 12 Nov 2013 01:51:52 -0800 Subject: [Python-ideas] Extending language syntax In-Reply-To: References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> Message-ID: <564F1CCE-13D3-4192-A572-ADA058649A9E@yahoo.com> On Nov 11, 2013, at 21:03, Gregory Salvan wrote: > Thank you. > > You're almost right, but there is misunderstanding. > > I suggested generating value object by freezing object states and to have object identities defined on their values instead of their memory allocation. > so : > pi1 = keyword(3.14) > pi2 = keyword(3.14) > assert pi1 is pi2 # is true So effectively you want to add interning for arbitrary types. So... why? You won't get any direct performance benefits from being able to use is instead of == here--comparing two floats is as fast as comparing two pointers on most platforms. And why would you want users to compare with is instead of ==? It means everyone who uses pi has to know that it's a keyword, even though he gets no real benefit from doing so. There might be some indirect benefits in that we could skip refcounting on objects that are guaranteed to live forever. But you could get the same benefits with a less drastic change--basically, just a way to declare an object as "permanent", without changing its semantics in any other way. That sub-idea might be worth exploring. > and: > you = Person('Andrew Barnert') > andrew = keyword(you) > andrew.name = 'Dr. Sam Beckett' # will raise an error. That's already easy today. For example, you can declare name as a @property with no setter, or inherit from namedtuple. > It is value object not singleton because pi1 and pi2 are not of the same instance. Why not? In fact, that seems to lose most of the benefits of interning. Effectively you've just created normal objects that can override the is operator, which I think is a bad idea. > This matter when doing concurrency but this is not the point as you can cheat with "repr" or "hash". > > I would focus mainly on how to extend language syntax and permit writing code like: > 'abc' should_equal 'abc' I'm not sure how this fits in with the rest of your idea at all. That requires changing the parser at runtime, which has nothing to do with objects being permanent or their is operators being overridable to mean equality or anything else. Meanwhile, you should take a look at how parsing and compiling works in Python (the dev guide has a great section on it). Modules are parsed and compiled, and then the bytecode is run at import time. If that code can change the grammar, how can we use compiled bytecode at all? 
Without an implementation that compiles and interprets statement by statement on the fly, it seems like this would be very difficult. You may want to look at MacroPy, which allows making (less dramatic) changes to the language syntax via import hooks. > Value object seems to me necessary to avoid side effects. I don't think I understand why, if by "value object" you mean "object with a custom is operator". If you just mean "immutable object", that sounds more reasonable (although I'm still not sure i get it), but again, we already have those. A float, for example, is already immutable today. > I miss your point about marshal/pickle and will dig it to understand. > I hope in future I can be more relevant. > > Sorry for the inconvenience, thanks for taking the time to answer, > good continuation. > > > > > 2013/11/11 Andrew Barnert >> I could go through this point by point, but I think I can head it all off by explaining what I think is a fundamental misconception that's misleading you. >> >> In Python, variables aren't "memory locations" where values live, they're just names that are bound to values that live somewhere on their own. Rebinding a variable to a different value doesn't mutate anything. You can't make variables refer to other variables, and making two variables refer to the same value doesn't have the same referential transparency, thread safety, etc. issues as making a variable refer to another variable. For example, in the following code: >> >> me = Person('Andrew Barnert') >> andrew = me >> me = Person('Dr. Sam Beckett') >> >> ... nothing has been mutated. In particular, the andrew variable is unchanged; it's still referring to the same Person('Andrew Barnert') object as before, not the new one. >> >> If I make the Person type mutable, and call a mutating method on it (or do so implicitly, by setting an attribute or a keyed or indexed member), then of course I can change the value, which will be visible to all variables referring to that value. But assignment does not do that. >> >> So, you already have almost everything you want. You can create new immutable types and global singleton values of those types. Or create Enums. Or just create singleton objects whose type doesn't matter, which are immutable and compared by identity, just by calling object(). All of this is trivial. >> >> Taking one of your examples: >> >> Pi = 3.14159 >> >> ... gives you everything you wanted, except for the fact that you can accidentally or maliciously rebind it to a new value. It's already an immutable, thread-safe constant. >> >> You can even make a variable name un-rebindable with a module import hook that produces a custom globals, if you really want to, although I can't imagine that ever being worth doing. >> >> The only additional thing a keyword gives you is that the compiler can prevent you from rebinding the name to a different value, at compile time rather than run time. That's it. >> >> Meanwhile, the disadvantage of allowing new keywords to be defined for immutable constants is huge. You could no longer compile, or even parse, any code without checking each token against the current runtime environment. That makes the parser slower and more complicated, introduces a dependency that makes it hard to keep the components separate, makes pyc files and marshal/pickle and so on useless, makes it much harder for code tools to use the ast module, etc. All for a very tiny benefit. 
>> >> Sent from a random iPhone >> >> On Nov 11, 2013, at 3:20, Gregory Salvan wrote: >> >>> Hi all, >>> I've just joined you, I hope I'll not make you loose your time with questions already covered. >>> Sorry, if my suggest represents lot of work (certainly too much) and about my lack of knowledge about internal implementation, but I hope opening a discussion can help. >>> >>> I would suggest to introduce a new statement to provide a way to extend language syntax. >>> >>> Abstract: >>> While reading this article about None history I realised I was not alone to wonder about some aspect of the language. >>> >>> Some of these are: >>> - constant and immutability >>> - null object and "None is a singleton" >>> - readability of code >>> >>> Constant: >>> It's a convention, no matter with that. >>> My question is more if there is no memory overuse to have fixed values that behave as variable. >>> >>> Null Object: >>> NoneType is not overridable. >>> Having null objects instance of NoneType wouldn't make sense ? >>> Can't assign keywords so "None is a singleton". >>> Wouldn't it be more simple to have "keyword" type and be able to assign a keyword to a variable ? >>> With an object hierarchy like: keyword <- ReservedKeyword >>> I find it can make clearer why some keywords are reserved without changing actual behaviour. >>> >>> Readability of code: >>> I have shouldDsl use case in mind. >>> The introduction illustrate my thought: >>> "The goal of Should-DSL is to write should expectations in Python as clear and readable as possible, using an ?almost? natural language (limited by some Python language?s constraints)." >>> >>> I find ShouldDsl perturbing, as the aim is laudable but the implementation stays a hack of the language. >>> I see the influence of functional paradigm in this syntax and agreed with the fact it becomes a "must have". >>> >>> I love Python syntax and find, it has very few limits, but maybe it can be improved by reducing some constraints. >>> >>> One concerns in extending syntax is that the language can evolve on users usage. >>> I see it like an open laboratory where developpers experiments new keywords like "should" implemented in their favorite language and when this keyword is largely accepted by the community it can finally integrate the core. >>> >>> A solution: >>> I was first thinking of a statement like "Instance" to declare an immutable object with a single instance (like None), but I then consider how it would be great to be able to extend the syntax. >>> I thought the keyword "Keyword", which exists in clojure for example, can do the job. >>> >>> A first implementation can be: >>> >>> Keyword (): >>> field = # fixed value: self.field = "something" will raise an error >>> def behaviour(self, *args, **kwargs): >>> # do something >>> def __get__(self, left, right, block_content): >>> # what to do when accessing keyword >>> >>> Inline implementation: >>> Keyword () >>> >>> >>> Examples of use: >>> >>> Constant: >>> Keyword Pi(3.14159) >>> >>> Null Object and immutable objects: >>> Class Person(object): >>> def __init__(self, name): >>> self.name = name >>> >>> JohnDoe = Person(name="anonymous") >>> >>> Keyword Anonymous(None, JohnDoe) >>> >>> Anonymous.name = "JohnDoe" # raise an error >>> assert isinstance(Anonymous, type(None)) # is true >>> >>> >>> As Anonymous is immutable , it is also thread safe. >>> The mecanism which consist of freezing an object can be applied in other situation. 
>>> >>> An interesting approach can be to define keywords identity throught a hash of their value, like value object. >>> >>> Should keyword: >>> >>> Keyword ShouldEqual(keyword): >>> def __get__(self, left, right, block): >>> if left != right: >>> raise AssertError() >>> >>> ### weird examples: >>> >>> Multiline Lambdas: >>> Keyword MultilineLambda(keyword): >>> def __get__(self, left, right, block_content): >>> def multiline_lambda(*args, **kwargs): >>> self.assert_args_valid(right, args, kwargs) # compare "right" (tuple) with args and raise exception in case not valid >>> return self.run(block_content, args, kwargs) # "run" is a shortcut because I don't know which type is the most appropriate for block content >>> return multiline_lambda >>> >>> divide = MultilineLambda a, b: >>> if b == 0: >>> raise MyDivideByZeroError() >>> return a/b >>> >>> Reimplementing else: >>> Keyword otherwise(keyword): >>> def __get__(self, left, right, block): >>> if left: return # left is the result of "if" block >>> self.run(block) >>> >>> if something: >>> # do something >>> otherwise: >>> # other >>> >>> Whereas this example is weird, it seems to me that we can better figure out what language do. >>> >>> >>> Thanks all to have kept python open and give us opportunities to freely discuss about its future implementation. >>> >>> Have a nice day, >>> Gr?gory >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Tue Nov 12 11:00:44 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 12 Nov 2013 02:00:44 -0800 Subject: [Python-ideas] Extending language syntax In-Reply-To: References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> Message-ID: <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> On Nov 11, 2013, at 21:27, Chris Angelico wrote: > On Tue, Nov 12, 2013 at 4:03 PM, Gregory Salvan wrote: >> I suggested generating value object by freezing object states and to have >> object identities defined on their values instead of their memory >> allocation. > > I can see some potential in this if there were a way to say "This will > never change, don't refcount it or GC-check it"; that might improve > performance across a fork (or across threads), but it'd take a lot of > language support. Effectively, you would be forfeiting the usual GC > memory saving "this isn't needed, get rid of it" and fixing it in > memory someplace. The question would be: Is the saving of not writing > to that memory (updating refcounts, or marking for a mark/sweep GC, or > whatever the equivalent is for each Python implementation) worth the > complexity of checking every object to see if it's a frozen one? Is there any implementation (like one of the PyPy sub projects) that uses refcounting, with interlocked increments if two interpreter threads are live but plain adds otherwise? In such an implementation, I think the cost of checking a second flag to avoid the interlocked increment would, at least on many platforms (including x86, x86_64, and arm9), be comparatively very cheap, and if used widely could provide big benefits. 
I believe CPython and standard PyPy just use plain adds under the GIL, and Jython and IronPython leave all the gc up to the underlying VM, so it would probably be a lot harder to get enough benefit there without a lot more effort. Also, as I said in my previous message, I don't think this "permanent value" idea is in any way dependent on any of the other stuff suggested, and in fact would work better without them. (For example, being able to have multiple separate copies of the permanent object that act as if they're identical, and can't be distinguished at the Python level.) > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas From abarnert at yahoo.com Tue Nov 12 11:11:57 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 12 Nov 2013 02:11:57 -0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: Message-ID: <78581161-763A-4E0F-8352-6A4CE438D162@yahoo.com> On Nov 11, 2013, at 19:45, Xuancong Wang wrote: > Another suggestion is that 'enumerate' is also frequently used, hopefully we can shorten the command as well. One huge advantage of everything being regular functions is that it's ridiculously easy to experiment with this. Want to see what it's like to use "en" or "ix" or whatever instead of enumerate? Just do "en = enumerate", and you can start using it. See how it affects your typing speed, and the readability of your code. (Obviously it will make your code less readable to the general Python community, but ignore that; the interesting question is whether you--or, better, a small group you work in--find it readable once you get used to it). Meanwhile, I personally vastly prefer print as a function to a statement. I can pass print to a function instead of having to write an out-of-line wrapper with def. I can do quick joining without spaces, and no-newline-ing without having to mess with magic commas. But then, like most of the others who prefer print as a function, I don't actually use it nearly as much as the people who are complaining, so maybe that doesn't mean much. From ncoghlan at gmail.com Tue Nov 12 11:23:17 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 12 Nov 2013 20:23:17 +1000 Subject: [Python-ideas] Extending language syntax In-Reply-To: <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> Message-ID: On 12 Nov 2013 20:03, "Andrew Barnert" wrote: > > On Nov 11, 2013, at 21:27, Chris Angelico wrote: > > > On Tue, Nov 12, 2013 at 4:03 PM, Gregory Salvan wrote: > >> I suggested generating value object by freezing object states and to have > >> object identities defined on their values instead of their memory > >> allocation. > > > > I can see some potential in this if there were a way to say "This will > > never change, don't refcount it or GC-check it"; that might improve > > performance across a fork (or across threads), but it'd take a lot of > > language support. Effectively, you would be forfeiting the usual GC > > memory saving "this isn't needed, get rid of it" and fixing it in > > memory someplace. The question would be: Is the saving of not writing > > to that memory (updating refcounts, or marking for a mark/sweep GC, or > > whatever the equivalent is for each Python implementation) worth the > > complexity of checking every object to see if it's a frozen one? 
> > Is there any implementation (like one of the PyPy sub projects) that uses refcounting, with interlocked increments if two interpreter threads are live but plain adds otherwise? In such an implementation, I think the cost of checking a second flag to avoid the interlocked increment would, at least on many platforms (including x86, x86_64, and arm9), be comparatively very cheap, and if used widely could provide big benefits. PyParallel uses some neat tricks to skip almost all memory management in worker threads. Currently Windows only though, and one reader's "neat trick" may be another's "awful hack" :) Cheers, Nick. > > I believe CPython and standard PyPy just use plain adds under the GIL, and Jython and IronPython leave all the gc up to the underlying VM, so it would probably be a lot harder to get enough benefit there without a lot more effort. > > Also, as I said in my previous message, I don't think this "permanent value" idea is in any way dependent on any of the other stuff suggested, and in fact would work better without them. (For example, being able to have multiple separate copies of the permanent object that act as if they're identical, and can't be distinguished at the Python level.) > > > > > ChrisA > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Nov 12 11:42:57 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 12 Nov 2013 20:42:57 +1000 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: <78581161-763A-4E0F-8352-6A4CE438D162@yahoo.com> References: <78581161-763A-4E0F-8352-6A4CE438D162@yahoo.com> Message-ID: On 12 Nov 2013 20:15, "Andrew Barnert" wrote: > > On Nov 11, 2013, at 19:45, Xuancong Wang wrote: > > > Another suggestion is that 'enumerate' is also frequently used, hopefully we can shorten the command as well. > > One huge advantage of everything being regular functions is that it's ridiculously easy to experiment with this. Want to see what it's like to use "en" or "ix" or whatever instead of enumerate? Just do "en = enumerate", and you can start using it. See how it affects your typing speed, and the readability of your code. (Obviously it will make your code less readable to the general Python community, but ignore that; the interesting question is whether you--or, better, a small group you work in--find it readable once you get used to it). > > Meanwhile, I personally vastly prefer print as a function to a statement. I can pass print to a function instead of having to write an out-of-line wrapper with def. I can do quick joining without spaces, and no-newline-ing without having to mess with magic commas. But then, like most of the others who prefer print as a function, I don't actually use it nearly as much as the people who are complaining, so maybe that doesn't mean much. A few months ago I came up with a working "call statement" implementation that would allow the parens to be omitted from all simple calls, not just print: http://bugs.python.org/issue18788 That shows such an approach is technically feasible, but it also makes it clear there are major readability issues if the LHS is allowed to be an arbitrary expression. 
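For contrast with the paren-less direction, the function-style advantages quoted above take only a couple of lines to demonstrate. An illustrative sketch; the helper emit and the sample values are made up:

def emit(values, write=print):
    # Any callable with print's shape can be plugged in: print itself,
    # a logger method, or a test double that records what was written.
    for v in values:
        write(v)

emit(["a", "b"])                       # passes print around as a value
print("a", "b", "c", sep="")           # quick joining without spaces: abc
print("no trailing newline", end="")   # no magic trailing comma needed
print()
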
I'm still vaguely curious what a full PEP for 3.5 (with a suitably constrained LHS) might look like, but I'm not interested enough to write it myself. Cheers, Nick. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From cf.natali at gmail.com Tue Nov 12 11:37:29 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Tue, 12 Nov 2013 11:37:29 +0100 Subject: [Python-ideas] Extending language syntax In-Reply-To: <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> Message-ID: 2013/11/12 Andrew Barnert : > Is there any implementation (like one of the PyPy sub projects) that uses refcounting, with interlocked increments if two interpreter threads are live but plain adds otherwise? In such an implementation, I think the cost of checking a second flag to avoid the interlocked increment would, at least on many platforms (including x86, x86_64, and arm9), be comparatively very cheap, and if used widely could provide big benefits. How would you do this in a thread-safe way without atomic operations, or at least memory barriers? cf From steve at pearwood.info Tue Nov 12 12:42:08 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 12 Nov 2013 22:42:08 +1100 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <78581161-763A-4E0F-8352-6A4CE438D162@yahoo.com> Message-ID: <20131112114207.GE2085@ando> On Tue, Nov 12, 2013 at 08:42:57PM +1000, Nick Coghlan wrote: > A few months ago I came up with a working "call statement" implementation > that would allow the parens to be omitted from all simple calls, not just > print: http://bugs.python.org/issue18788 > > That shows such an approach is technically feasible, but it also makes it > clear there are major readability issues if the LHS is allowed to be an > arbitrary expression. IPython has had this feature for a while: In [1]: len [] ------> len([]) Out[1]: 0 As IPython is attempting to be an interactive shell rather than just a REPL, that makes a certain amount of sense, but it does lead to some unfortunate situations: In [2]: len [] + len [] ------> len([] + len []) ------------------------------------------------------------ File "", line 1 len([] + len []) ^ SyntaxError: invalid syntax -- Steven From ethan at stoneleaf.us Tue Nov 12 11:47:44 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 12 Nov 2013 02:47:44 -0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: Message-ID: <52820750.80102@stoneleaf.us> On 11/11/2013 09:52 PM, Xuancong Wang wrote: > You probably ought to weigh the ease of typing against readability. Which is more important? By how much? Readability, a thousand times. > As people are used to python2, I don't think changing print to print() will improve code readability by how much, > especially now VIM already highlights the print command in python. Probably you don't use print in python quite often, > so you may not sense the difference in typing brackets. If you ever used C/C++/perl frequently, you should sense the > inconvenience caused by brackets in loop structures. If you don't like `print()`, do a `p = print` and then all you have is `p()` -- of course, you just lost a bunch a readability. 
But seriously, have often does any real program use print? > Separately, have you measured typing speed rigorously? > The typing effort varies from person to person, from beginners to experts, maybe you are too expert to sense the > difference which is sensitive to beginners. There's no rule-of-thumb and it's not just a speed issue. It's also related > to energy spent by your fingers. So obviously, pressing 2 keys, you need to spend at least twice the amount of energy. > And even more if you need to hold down 1 key. > > Anyway, I am just introducing an alternative form to the print() function. As long as this does not reduce the > performance of the interpreter significantly, it should not harm. And it should also improve the downward compatibility. `print` is now a function, it's now going back to a keyword, and the interpreter loop isn't changing to support such a tiny use-case. -- ~Ethan~ From ncoghlan at gmail.com Tue Nov 12 13:10:31 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 12 Nov 2013 22:10:31 +1000 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: <52820750.80102@stoneleaf.us> References: <52820750.80102@stoneleaf.us> Message-ID: On 12 November 2013 20:47, Ethan Furman wrote: > If you don't like `print()`, do a `p = print` and then all you have is `p()` > -- of course, you just lost a bunch a readability. > > But seriously, have often does any real program use print? It's far more common in utility scripts (such as those written by system administrators) than it is in applications. The print change between Python 2 and 3 is one that doesn't really affect application developers all that much in practice (other than when trying things out in the REPL, and apparently not even then if using IPython), but can be more of an issue with those writing scripts where the standard streams are the primary IO mechanism. We tend not to hear from the latter group as much as we do from application developers, though. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From apieum at gmail.com Tue Nov 12 12:16:57 2013 From: apieum at gmail.com (Gregory Salvan) Date: Tue, 12 Nov 2013 12:16:57 +0100 Subject: [Python-ideas] Extending language syntax In-Reply-To: <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> Message-ID: Sorry Chris Angelico and Andrew Barnert, I have not enough knowledge of python implementations to correctly answer your questions. I just wanted to share the idea in order to see if It was interesting before I dig in that way. I thought token can act like a macro, replacing at compile time things like: 'abc' should_equal 'abc' by should_equal.__get__('abc', 'abc', None) Then I saw assert_equal as a "keyword" like "if" and it seemed obvious to have "assert if is if" raising assertion error. Finally authorising should_equal.__get__ to behave differently due to its state seemed dangerous. As I need immutable objects with an identity given by their values, I deduce it was value objects. This might be confusing. Thanks for your enlightments. 2013/11/12 Andrew Barnert > On Nov 11, 2013, at 21:27, Chris Angelico wrote: > > > On Tue, Nov 12, 2013 at 4:03 PM, Gregory Salvan > wrote: > >> I suggested generating value object by freezing object states and to > have > >> object identities defined on their values instead of their memory > >> allocation. 
> > > > I can see some potential in this if there were a way to say "This will > > never change, don't refcount it or GC-check it"; that might improve > > performance across a fork (or across threads), but it'd take a lot of > > language support. Effectively, you would be forfeiting the usual GC > > memory saving "this isn't needed, get rid of it" and fixing it in > > memory someplace. The question would be: Is the saving of not writing > > to that memory (updating refcounts, or marking for a mark/sweep GC, or > > whatever the equivalent is for each Python implementation) worth the > > complexity of checking every object to see if it's a frozen one? > > Is there any implementation (like one of the PyPy sub projects) that uses > refcounting, with interlocked increments if two interpreter threads are > live but plain adds otherwise? In such an implementation, I think the cost > of checking a second flag to avoid the interlocked increment would, at > least on many platforms (including x86, x86_64, and arm9), be comparatively > very cheap, and if used widely could provide big benefits. > > I believe CPython and standard PyPy just use plain adds under the GIL, and > Jython and IronPython leave all the gc up to the underlying VM, so it would > probably be a lot harder to get enough benefit there without a lot more > effort. > > Also, as I said in my previous message, I don't think this "permanent > value" idea is in any way dependent on any of the other stuff suggested, > and in fact would work better without them. (For example, being able to > have multiple separate copies of the permanent object that act as if > they're identical, and can't be distinguished at the Python level.) > > > > > ChrisA > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Tue Nov 12 14:17:41 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 12 Nov 2013 05:17:41 -0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: <52820750.80102@stoneleaf.us> References: <52820750.80102@stoneleaf.us> Message-ID: <52822A75.4080506@stoneleaf.us> On 11/12/2013 02:47 AM, Ethan Furman wrote: > > it's now going back to a keyword s/now/not/ -- ~Ethan~ From rosuav at gmail.com Tue Nov 12 15:52:31 2013 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 13 Nov 2013 01:52:31 +1100 Subject: [Python-ideas] Extending language syntax In-Reply-To: <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> Message-ID: On Tue, Nov 12, 2013 at 9:00 PM, Andrew Barnert wrote: > On Nov 11, 2013, at 21:27, Chris Angelico wrote: > >> On Tue, Nov 12, 2013 at 4:03 PM, Gregory Salvan wrote: >>> I suggested generating value object by freezing object states and to have >>> object identities defined on their values instead of their memory >>> allocation. >> >> I can see some potential in this if there were a way to say "This will >> never change, don't refcount it or GC-check it"; that might improve >> performance across a fork (or across threads), but it'd take a lot of >> language support. 
Effectively, you would be forfeiting the usual GC >> memory saving "this isn't needed, get rid of it" and fixing it in >> memory someplace. The question would be: Is the saving of not writing >> to that memory (updating refcounts, or marking for a mark/sweep GC, or >> whatever the equivalent is for each Python implementation) worth the >> complexity of checking every object to see if it's a frozen one? > > Is there any implementation (like one of the PyPy sub projects) that uses refcounting, with interlocked increments if two interpreter threads are live but plain adds otherwise? In such an implementation, I think the cost of checking a second flag to avoid the interlocked increment would, at least on many platforms (including x86, x86_64, and arm9), be comparatively very cheap, and if used widely could provide big benefits. > > I believe CPython and standard PyPy just use plain adds under the GIL, and Jython and IronPython leave all the gc up to the underlying VM, so it would probably be a lot harder to get enough benefit there without a lot more effort. The only Python I actually work with is CPython, so I can't say what does or doesn't exist. But plausibly, it's possible to have a one-bit flag that says "this is permanent, don't GC it" and then it won't get its refcount updated (in CPython), or won't get marked (in a mark/sweep GC), or whatever. Then, if you have an entire page of frozen objects, and you fork() using the standard Linux (and other) semantics of copy-on-write, you would never need to write to that page. The reason this might require that the objects be immutable is this: When an object is marked as frozen, everything it references can also automatically be marked frozen. (By definition, they'll always be in use too.) That would only work, though, if the set of objects thus referenced doesn't change. PI = complex(3.14159) sys.freeze(PI) # should freeze the float value 3.14159 PI.real = 3.141592653589793 # awkward I can imagine that this might potentially offer some *huge* benefits in a system that does a lot of forking (maybe a web server?), but all I have to go on is utter and total speculation. And the cost of checking "Is this frozen? No, update its refcount" everywhere means that there's a penalty even if fork() is never used. So actually, this might work out best as a "special-build" Python, and maybe only as a toy/experiment. ChrisA From steve at pearwood.info Tue Nov 12 16:56:34 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 13 Nov 2013 02:56:34 +1100 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <52820750.80102@stoneleaf.us> Message-ID: <20131112155634.GF2085@ando> On Tue, Nov 12, 2013 at 10:10:31PM +1000, Nick Coghlan wrote: > On 12 November 2013 20:47, Ethan Furman wrote: > > If you don't like `print()`, do a `p = print` and then all you have is `p()` > > -- of course, you just lost a bunch a readability. > > > > But seriously, have often does any real program use print? > > It's far more common in utility scripts (such as those written by > system administrators) than it is in applications. The print change > between Python 2 and 3 is one that doesn't really affect application > developers all that much in practice (other than when trying things > out in the REPL, and apparently not even then if using IPython), but > can be more of an issue with those writing scripts where the standard > streams are the primary IO mechanism. 
> > We tend not to hear from the latter group as much as we do from > application developers, though. I've written a few short utility scripts in my time, and I rarely call print directly except for the most basic scripts. Normally I'll have facility to control output, perhaps a verbosity level, or at least a verbose flag, say: def pr(message): if gVerbose: print message So I think even in scripting, calling print directly is less common than it might otherwise seem. -- Steven From elazarg at gmail.com Tue Nov 12 18:08:40 2013 From: elazarg at gmail.com (=?UTF-8?B?15DXnNei15bXqA==?=) Date: Tue, 12 Nov 2013 19:08:40 +0200 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: <20131112155634.GF2085@ando> References: <52820750.80102@stoneleaf.us> <20131112155634.GF2085@ando> Message-ID: 2013/11/12 Steven D'Aprano : > I've written a few short utility scripts in my time, and I rarely call > print directly except for the most basic scripts. Normally I'll have > facility to control output, perhaps a verbosity level, or at least a > verbose flag, say: > > def pr(message): > if gVerbose: > print message > And making print syntactically irregular prevents you from doing things like import builtins builtins.print = print_if_verbose Elazar From abarnert at yahoo.com Tue Nov 12 18:56:20 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 12 Nov 2013 09:56:20 -0800 Subject: [Python-ideas] Extending language syntax In-Reply-To: References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> Message-ID: <60B09086-1DAD-40DA-B772-0ACE2F3A0F06@yahoo.com> On Nov 12, 2013, at 2:37, Charles-Fran?ois Natali wrote: > 2013/11/12 Andrew Barnert : >> Is there any implementation (like one of the PyPy sub projects) that uses refcounting, with interlocked increments if two interpreter threads are live but plain adds otherwise? In such an implementation, I think the cost of checking a second flag to avoid the interlocked increment would, at least on many platforms (including x86, x86_64, and arm9), be comparatively very cheap, and if used widely could provide big benefits. > > How would you do this in a thread-safe way without atomic operations, > or at least memory barriers? The whole point of a "permanent" flag would be that it's only set at object creation time and never modified, and the object never gets cleaned up. That means you can just check the flag, without an atomic operation, and if it's set you can skip the (slow/cache-killing) atomic increment or decrement. From haoyi.sg at gmail.com Tue Nov 12 19:11:20 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Tue, 12 Nov 2013 10:11:20 -0800 Subject: [Python-ideas] Extending language syntax In-Reply-To: <60B09086-1DAD-40DA-B772-0ACE2F3A0F06@yahoo.com> References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> <60B09086-1DAD-40DA-B772-0ACE2F3A0F06@yahoo.com> Message-ID: > to have object identities defined on their values instead of their memory allocation but the whole point of identity is to be their memory allocation! There's already equality if you want to compare on values. I'm not sure what you really want, and I suspect you're also somewhat uncertain. Do you want multiline lambdas, by-name variables, custom blocks, interned objects, infix operators? Other things? It's a lot of distinct feature requests to ask for and it would be good to get them cleared up in everyone's minds. 
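A tiny sketch of the identity-versus-equality distinction being drawn here: `is` compares object identity (whether two names refer to the same object in memory), `==` compares values, and interning is what makes equal values share a single identity. In CPython, for example:

    import sys

    a = float("3.14")
    b = float("3.14")
    a == b    # True: equal values
    a is b    # False: two distinct objects in memory

    c = sys.intern("spam" * 1000)
    d = sys.intern("spam" * 1000)
    c is d    # True: interning maps equal strings onto one shared object

Making that interned behaviour the default for whole classes of "value objects" seems to be what is being asked for here.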
If you want interning for arbitrary expressions, MacroPy lets you do that already in your own code. It interns on a per-declaration basis rather than on a per-value basis, because the task of evaluating an arbitrary expression at macro expansion time is icky. You can pull some other neat tricks with it (e.g. classes whose equality is by default defined by value), but are limited to Python's grammar and parser, so no infix-method-operators and such, but you can trigger macro expansion easily with should_equal['abc', ''abc'] and do whatever "compile"-time substitution you want. On Tue, Nov 12, 2013 at 9:56 AM, Andrew Barnert wrote: > On Nov 12, 2013, at 2:37, Charles-Fran?ois Natali > wrote: > > > 2013/11/12 Andrew Barnert : > >> Is there any implementation (like one of the PyPy sub projects) that > uses refcounting, with interlocked increments if two interpreter threads > are live but plain adds otherwise? In such an implementation, I think the > cost of checking a second flag to avoid the interlocked increment would, at > least on many platforms (including x86, x86_64, and arm9), be comparatively > very cheap, and if used widely could provide big benefits. > > > > How would you do this in a thread-safe way without atomic operations, > > or at least memory barriers? > > The whole point of a "permanent" flag would be that it's only set at > object creation time and never modified, and the object never gets cleaned > up. > > That means you can just check the flag, without an atomic operation, and > if it's set you can skip the (slow/cache-killing) atomic increment or > decrement. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Tue Nov 12 19:19:10 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 12 Nov 2013 10:19:10 -0800 Subject: [Python-ideas] Extending language syntax In-Reply-To: References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> Message-ID: On Nov 12, 2013, at 6:52, Chris Angelico wrote: > > The reason this might require that the objects be immutable is this: > When an object is marked as frozen, everything it references can also > automatically be marked frozen. (By definition, they'll always be in > use too.) That would only work, though, if the set of objects thus > referenced doesn't change. > > PI = complex(3.14159) > sys.freeze(PI) # should freeze the float value 3.14159 > PI.real = 3.141592653589793 > # awkward Yes, that's a reasonable extension to the permanent-marking idea. But I'm not sure immutability is as necessary as you think. For the fork issue, assuming PI is a C object or a slots object, it couldn't be on the never-written page if you went this way, but it would still benefit from not being refcounted. If you had a permanent complex(3.14159) value, but 3.14159 itself weren't permanent, it would just be a normal float with one refcount that rarely gets copied anywhere. If it does get copied, it gets (atomically) incref'd like normal, but so what? If you, as the app developer, know that for whatever reason this will be much more common in your app than in the usual case, you can always create the permanent float first, then create the permanent complex out of that value. 
If you do that, and then later mutate PI to point to a different float, the new float can be permanent or normal and it all works. If you're forking, then you'd want PI to be immutable (again assuming its a C or slots object); if you're threading and just want the refcount skipping, you'd want a way to get just that. From cf.natali at gmail.com Tue Nov 12 19:25:21 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Tue, 12 Nov 2013 19:25:21 +0100 Subject: [Python-ideas] Extending language syntax In-Reply-To: <60B09086-1DAD-40DA-B772-0ACE2F3A0F06@yahoo.com> References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> <60B09086-1DAD-40DA-B772-0ACE2F3A0F06@yahoo.com> Message-ID: 2013/11/12 Andrew Barnert : > The whole point of a "permanent" flag would be that it's only set at object creation time and never modified, and the object never gets cleaned up. Of course. I probably wasn't clear, but I was actually referring to this part: > Is there any implementation (like one of the PyPy sub projects) that uses refcounting, with interlocked increments if two interpreter threads are live but plain adds otherwise? I'm not sure how one would do this without using locking/memory barriers (hence a large performance overhead). Regarding this question, as noted by Nick, you might want to have a look at pyparallel: apparently, it uses a form of region-based memory allocation, per-thread (using mprotect to catch references to main-thread owned objects). But last time I checked with Trent, reference counting/garbage collection was completely disabled in worker theads, which means that if you allocate and free many objects, you'll run out of memory (having an infinite amount of memory certainly makes garbage collection easier :-) ). Cheers, cf From barry at python.org Tue Nov 12 20:22:04 2013 From: barry at python.org (Barry Warsaw) Date: Tue, 12 Nov 2013 14:22:04 -0500 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 References: Message-ID: <20131112142204.37125a14@anarchist> On Nov 12, 2013, at 11:45 AM, Xuancong Wang wrote: >I notice that one major change in python 3 is that it makes 'print' as a >standard function, and it will require typing (). I do understand that it >makes python language more consistent because most of the python >functionalities are implemented as function calls. > >As you know, reading from and writing to IO is a high frequency operation. >By entropy coding theorem (e.g. Huffman coding), an efficient language >should assign shorter language code to more frequent tasks. Typing a '(' >requires holding SHIFT and pressing 9, the input effort is much higher than >that in Python 2. Also, specifying IO has changed from >>* to file=*, which >also becomes more inconvenient. You can blame me for the original print>> syntax, which was the best of the worst suggestions for extending the print statement in Python 2. I also had a hard time making the mental and physical (muscle memory) switch to print() function in Python 3. Having used print() now for several years in both Python 3 and Python 2[*], I can say without hesitation that I'm really glad this change was made. print as a function is just so much better in so many ways. I particularly like the ease with which it can be mocked for testing. -Barry [*] from __future__ import print_function -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From python at mrabarnett.plus.com Tue Nov 12 20:25:55 2013 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 12 Nov 2013 19:25:55 +0000 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: <20131112071404.GA18971@ando> References: <20131112071404.GA18971@ando> Message-ID: <528280C3.5050005@mrabarnett.plus.com> On 12/11/2013 07:14, Steven D'Aprano wrote: > On Tue, Nov 12, 2013 at 04:31:56PM +1100, Chris Angelico wrote: > >> It's funny how many words we have that all quite validly describe the >> concept of producing console output. C calls it "print" (printf), or >> "write"; Python calls it "print"; REXX opts for "say"; C++ goes for >> "cout" (console output)... yet all of them are very clear, in their >> own way, and nobody expects Python to create hard copies or REXX to >> use the speaker :) > > I don't know about that. Expecting print to generate hardcopy output is > something which beginners to programming often need to unlearn. If print > doesn't print, which command do you use to actually print? > > As for REXX, I expected "say" to use the speaker. That's what the "say" > command does in Applescript, for example. > I remember a review of a micro (Sharp?) many years ago. The reviewer tried of its version of Basic, using the PRINT command, and was surprised when the printer sprang into life. It turned out that the command for printing to the screen was DISPLAY. From barry at python.org Tue Nov 12 20:26:06 2013 From: barry at python.org (Barry Warsaw) Date: Tue, 12 Nov 2013 14:26:06 -0500 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 References: <52820750.80102@stoneleaf.us> Message-ID: <20131112142606.16006db6@anarchist> On Nov 12, 2013, at 10:10 PM, Nick Coghlan wrote: >It's far more common in utility scripts (such as those written by >system administrators) than it is in applications. The print change >between Python 2 and 3 is one that doesn't really affect application >developers all that much in practice (other than when trying things >out in the REPL, and apparently not even then if using IPython), but >can be more of an issue with those writing scripts where the standard >streams are the primary IO mechanism. > >We tend not to hear from the latter group as much as we do from >application developers, though. Oh yeah, another beautiful thing about the print function: >>> from functools import partial >>> import sys >>> perr = partial(print, file=sys.stderr) >>> perr('you hit a bug') Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From abarnert at yahoo.com Tue Nov 12 21:34:13 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 12 Nov 2013 12:34:13 -0800 Subject: [Python-ideas] Extending language syntax In-Reply-To: References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> <60B09086-1DAD-40DA-B772-0ACE2F3A0F06@yahoo.com> Message-ID: <8333363D-180A-4CDF-AB49-347678FD6E64@yahoo.com> On Nov 12, 2013, at 10:25, Charles-Fran?ois Natali wrote: > Of course. I probably wasn't clear, but I was actually referring to this part: > >> Is there any implementation (like one of the PyPy sub projects) that uses refcounting, with interlocked increments if two interpreter threads are live but plain adds otherwise? 
> > I'm not sure how one would do this without using locking/memory > barriers (hence a large performance overhead). A global "multithreading" flag. When you start a thread, it sets the flag in the parent thread, before starting the new thread. Both are guaranteed to see the True value. If there are any other threads, the value was already True. You probably don't ever need it to go back to False--not many programs are multithreaded for a while and then single-threaded again. But if you need this, note that it's never a problem to see a spurious True, it just means you do an atomic read when you didn't need to. But also, you only need to check whether you're the last thread while returning from join, and there's no way anyone could have created another thread between the OS-level join and the end of Thread.join. From cf.natali at gmail.com Tue Nov 12 23:42:22 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Tue, 12 Nov 2013 23:42:22 +0100 Subject: [Python-ideas] Extending language syntax In-Reply-To: <8333363D-180A-4CDF-AB49-347678FD6E64@yahoo.com> References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> <60B09086-1DAD-40DA-B772-0ACE2F3A0F06@yahoo.com> <8333363D-180A-4CDF-AB49-347678FD6E64@yahoo.com> Message-ID: 2013/11/12 Andrew Barnert : > On Nov 12, 2013, at 10:25, Charles-Fran?ois Natali wrote: > >> Of course. I probably wasn't clear, but I was actually referring to this part: >> >>> Is there any implementation (like one of the PyPy sub projects) that uses refcounting, with interlocked increments if two interpreter threads are live but plain adds otherwise? >> >> I'm not sure how one would do this without using locking/memory >> barriers (hence a large performance overhead). > > A global "multithreading" flag. When you start a thread, it sets the flag in the parent thread, before starting the new thread. Both are guaranteed to see the True value. If there are any other threads, the value was already True. That would probably work. The only remaining question is whether it'll actually yield a performance gain: as soon as you have more than one thread in the interpreter (even if it's idle), you'll end up doing atomic incref/decref all over the place, which will just kill performance (and will likely degrade with the number of threads because of increased contention to e.g. lock the memory bus). Atomic refcount just doesn't scale... cf From rosuav at gmail.com Tue Nov 12 23:52:36 2013 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 13 Nov 2013 09:52:36 +1100 Subject: [Python-ideas] Extending language syntax In-Reply-To: References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> <60B09086-1DAD-40DA-B772-0ACE2F3A0F06@yahoo.com> <8333363D-180A-4CDF-AB49-347678FD6E64@yahoo.com> Message-ID: On Wed, Nov 13, 2013 at 9:42 AM, Charles-Fran?ois Natali wrote: > That would probably work. The only remaining question is whether it'll > actually yield a performance gain: as soon as you have more than one > thread in the interpreter (even if it's idle), you'll end up doing > atomic incref/decref all over the place, which will just kill > performance (and will likely degrade with the number of threads > because of increased contention to e.g. lock the memory bus). > Atomic refcount just doesn't scale... There was a proposal put together in Pike involving multiple arenas, some of which were thread-local and some global. 
If you have one arena for each thread, which handles refcounted objects without thread-safety, and another pool of frozen objects that don't need to be refcounted at all (at the cost of not ever removing them from memory), then the only objects that need atomic incref/decref are the ones that are actually (potentially) shared. There just needs to be some mechanism for moving an object from the thread-local arena to the shared one, which would be a relatively uncommon operation. ChrisA From abarnert at yahoo.com Tue Nov 12 23:50:47 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 12 Nov 2013 14:50:47 -0800 Subject: [Python-ideas] Extending language syntax In-Reply-To: References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> <60B09086-1DAD-40DA-B772-0ACE2F3A0F06@yahoo.com> <8333363D-180A-4CDF-AB49-347678FD6E64@yahoo.com> Message-ID: On Nov 12, 2013, at 14:42, Charles-Fran?ois Natali wrote: > 2013/11/12 Andrew Barnert : >> On Nov 12, 2013, at 10:25, Charles-Fran?ois Natali wrote: >> >>> Of course. I probably wasn't clear, but I was actually referring to this part: >>> >>>> Is there any implementation (like one of the PyPy sub projects) that uses refcounting, with interlocked increments if two interpreter threads are live but plain adds otherwise? >>> >>> I'm not sure how one would do this without using locking/memory >>> barriers (hence a large performance overhead). >> >> A global "multithreading" flag. When you start a thread, it sets the flag in the parent thread, before starting the new thread. Both are guaranteed to see the True value. If there are any other threads, the value was already True. > > That would probably work. The only remaining question is whether it'll > actually yield a performance gain: as soon as you have more than one > thread in the interpreter (even if it's idle), you'll end up doing > atomic incref/decref all over the place, which will just kill > performance (and will likely degrade with the number of threads > because of increased contention to e.g. lock the memory bus). > Atomic refcount just doesn't scale... The point is that if you're _already_ doing atomic refcounts, or the equivalent (like CPython, which does refcounts under the GIL so they're implicitly atomic, which is even less parallel), you can avoid many of those refcounts. From stephen at xemacs.org Wed Nov 13 04:12:51 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 13 Nov 2013 12:12:51 +0900 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> Message-ID: <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> anatoly techtonik writes: > the User eXperience - don't you think that fast-typing "print var" is more > convenient than "print(var)"? Sure, but I don't type either: I typically type "prv". I might need to type another letter or CLOVERLEAF+/ multiple times depending on context. This DTRTs in both Python 2 and Python 3 buffers. This is true in interpreter buffers as well as in programs being saved to files. And similar statements are true in all my coding, whatever language. Why complain about Python syntax when upgrading development tools could gives the same improvements without complicating Python? 
From xuancong84 at gmail.com Wed Nov 13 04:48:44 2013 From: xuancong84 at gmail.com (Xuancong Wang) Date: Wed, 13 Nov 2013 11:48:44 +0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <78581161-763A-4E0F-8352-6A4CE438D162@yahoo.com> Message-ID: >A few months ago I came up with a working "call statement" implementation that would allow the parens to be omitted from all simple calls, not just print: http://bugs.python.org/issue18788 That sounds a great idea which solves a more general problem. But what are the potential ambiguity issues? Also, how do you remap ">>fp1" to "file=fp1" as an argument to print()? Cheers, Xuancong On Tue, Nov 12, 2013 at 6:42 PM, Nick Coghlan wrote: > > On 12 Nov 2013 20:15, "Andrew Barnert" wrote: > > > > On Nov 11, 2013, at 19:45, Xuancong Wang wrote: > > > > > Another suggestion is that 'enumerate' is also frequently used, > hopefully we can shorten the command as well. > > > > One huge advantage of everything being regular functions is that it's > ridiculously easy to experiment with this. Want to see what it's like to > use "en" or "ix" or whatever instead of enumerate? Just do "en = > enumerate", and you can start using it. See how it affects your typing > speed, and the readability of your code. (Obviously it will make your code > less readable to the general Python community, but ignore that; the > interesting question is whether you--or, better, a small group you work > in--find it readable once you get used to it). > > > > Meanwhile, I personally vastly prefer print as a function to a > statement. I can pass print to a function instead of having to write an > out-of-line wrapper with def. I can do quick joining without spaces, and > no-newline-ing without having to mess with magic commas. But then, like > most of the others who prefer print as a function, I don't actually use it > nearly as much as the people who are complaining, so maybe that doesn't > mean much. > > A few months ago I came up with a working "call statement" implementation > that would allow the parens to be omitted from all simple calls, not just > print: http://bugs.python.org/issue18788 > > That shows such an approach is technically feasible, but it also makes it > clear there are major readability issues if the LHS is allowed to be an > arbitrary expression. > > I'm still vaguely curious what a full PEP for 3.5 (with a suitably > constrained LHS) might look like, but I'm not interested enough to write it > myself. > > Cheers, > Nick. > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From apieum at gmail.com Wed Nov 13 07:50:24 2013 From: apieum at gmail.com (Gregory Salvan) Date: Wed, 13 Nov 2013 07:50:24 +0100 Subject: [Python-ideas] Extending language syntax In-Reply-To: References: <8AB7B777-676A-40E6-BCCF-0ECB124A9502@yahoo.com> <343E9B09-0C97-4D64-A9EE-477AFD7A195C@yahoo.com> <60B09086-1DAD-40DA-B772-0ACE2F3A0F06@yahoo.com> Message-ID: I've open a gdoc to share our visions and be able to collaborate on this subject. It is open, everybody can contribute, I'll be happy if it permits us to find a good solution. 
https://docs.google.com/document/d/15IPMNzUnK9nd_j7B6wdo7gAn2US52qa8MjGcfQeaRtk/edit?usp=sharing 2013/11/12 Haoyi Li > > to have object identities defined on their values instead of their > memory allocation > > but the whole point of identity is to be their memory allocation! There's > already equality if you want to compare on values. > > I'm not sure what you really want, and I suspect you're also somewhat > uncertain. Do you want multiline lambdas, by-name variables, custom blocks, > interned objects, infix operators? Other things? It's a lot of distinct > feature requests to ask for and it would be good to get them cleared up in > everyone's minds. > > If you want interning for arbitrary expressions, MacroPy lets you do that > already in your own code. > It interns on a per-declaration basis rather than on a per-value basis, > because the task of evaluating an arbitrary expression at macro expansion > time is icky. You can pull some other neat tricks with it (e.g. classes > whose equality is by default defined by value), > but are limited to Python's grammar and parser, so no > infix-method-operators and such, but you can trigger macro expansion easily > with should_equal['abc', ''abc'] and do whatever "compile"-time > substitution you want. > > > > > > > > On Tue, Nov 12, 2013 at 9:56 AM, Andrew Barnert wrote: > >> On Nov 12, 2013, at 2:37, Charles-Fran?ois Natali >> wrote: >> >> > 2013/11/12 Andrew Barnert : >> >> Is there any implementation (like one of the PyPy sub projects) that >> uses refcounting, with interlocked increments if two interpreter threads >> are live but plain adds otherwise? In such an implementation, I think the >> cost of checking a second flag to avoid the interlocked increment would, at >> least on many platforms (including x86, x86_64, and arm9), be comparatively >> very cheap, and if used widely could provide big benefits. >> > >> > How would you do this in a thread-safe way without atomic operations, >> > or at least memory barriers? >> >> The whole point of a "permanent" flag would be that it's only set at >> object creation time and never modified, and the object never gets cleaned >> up. >> >> That means you can just check the flag, without an atomic operation, and >> if it's set you can skip the (slow/cache-killing) atomic increment or >> decrement. >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From antony.lee at berkeley.edu Wed Nov 13 08:41:08 2013 From: antony.lee at berkeley.edu (Antony Lee) Date: Tue, 12 Nov 2013 23:41:08 -0800 Subject: [Python-ideas] Issues with inspect.Parameter In-Reply-To: References: Message-ID: I have a (fairly trivial...) patch to fix both of these issues. Should I post it here or submit it to bugs.python.org? Antony 2013/11/10 Antony Lee > The docstring of inspect.Parameter indicates the "default" and > "annotation" attributes are not set if the parameter does not have, > respectively, a default value and an annotation, and that the "kind" > attribute is a string. 
> But in fact, the "default" and "annotation" attributes are set to > "inspect._empty (== Parameter.empty)" in that case, and the "kind" > attribute has type "_ParameterKind" (essentially a hand-written equivalent > of IntEnum). I suggest to correct the docstring accordingly, and to > replace the implementation of _ParameterKind by a proper IntEnum (if full > backwards compatibility is required), or even just by Enum (which makes a > bit more sense, as the fact that _ParameterKind is a subclass of int > doesn't seem to be documented anywhere). > Antony > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Wed Nov 13 09:03:59 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 13 Nov 2013 00:03:59 -0800 Subject: [Python-ideas] Issues with inspect.Parameter In-Reply-To: References: Message-ID: <5283326F.8010006@stoneleaf.us> On 11/12/2013 11:41 PM, Antony Lee wrote: > > I have a (fairly trivial...) patch to fix both of these issues. Should I post it here or submit it to bugs.python.org Submit it to the bug tracker, then it won't get lost. -- ~Ethan~ From jeff at jeffreyjenkins.ca Wed Nov 13 16:31:14 2013 From: jeff at jeffreyjenkins.ca (Jeff Jenkins) Date: Wed, 13 Nov 2013 10:31:14 -0500 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <52820750.80102@stoneleaf.us> Message-ID: On Tue, Nov 12, 2013 at 7:10 AM, Nick Coghlan wrote: > On 12 November 2013 20:47, Ethan Furman wrote: > > If you don't like `print()`, do a `p = print` and then all you have is > `p()` > > -- of course, you just lost a bunch a readability. > > > > But seriously, have often does any real program use print? > > It's far more common in utility scripts (such as those written by > system administrators) than it is in applications. The print change > between Python 2 and 3 is one that doesn't really affect application > developers all that much in practice (other than when trying things > out in the REPL, and apparently not even then if using IPython), but > can be more of an issue with those writing scripts where the standard > streams are the primary IO mechanism. > > We tend not to hear from the latter group as much as we do from > application developers, though. > > Having written a lot of applications and both small and large scripts I've almost universally regretted using print instead of a logger. It's (slightly) more setup, but the second you want to make any changes to your output (timestamps, suppressing certain messages) it more than makes up for it. > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From techtonik at gmail.com Wed Nov 13 18:02:47 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 13 Nov 2013 20:02:47 +0300 Subject: [Python-ideas] Stdlib YAML evolution (Was: PEP 426, YAML in the stdlib and implementation discovery) In-Reply-To: <1370217235.82524.YahooMailNeo@web184703.mail.ne1.yahoo.com> References: <1370217235.82524.YahooMailNeo@web184703.mail.ne1.yahoo.com> Message-ID: On Mon, Jun 3, 2013 at 2:53 AM, Andrew Barnert wrote: > From: anatoly techtonik > Sent: Sunday, June 2, 2013 11:23 AM > > >>FWIW, I am +1 on for the ability to read YAML based configs Python >>without dependencies, but waiting for several years is hard. > > > With all due respect, I don't think you've read even a one-sentence description of YAML, so your entire post is nonsense. I'll try to clarify my post, so that it will be clear for you. Please, ask if something is unclear. You're right. I am not reading specifications prior to using things. What do I personally need from YAML? These are examples of files I use daily: http://tmuxp.readthedocs.org/en/latest/examples.html http://code.google.com/p/rietveld/source/browse/app.yaml https://github.com/agschwender/pilbox/blob/master/provisioning/playbook.yml http://pastebin.com/RG7g260k (OpenXcom save format) > The first sentence of the abstract says, "YAML? is a?data serialization language designed around the common native data types of agile programming languages." So, your idea that we shouldn't use it for serialization, and shouldn't map it to native Python data types, is ridiculous. I don't care really about the abstract. I am a complaining user - not a smart guy, who wrote the spec. So my thinking is the following: 1. Neither of examples above is a persistence data format of serialized native computer language data types. These are just nested mappings and lists. Strictly two dimensional tree data structure, even for openXcom one. It is YAML, or as I said - subset of YAML, and that's why I deliberately called this format "yamlish". 2. Regardless of any desire to use this proposal as an opportunity to see the full YAML 1.2 spec implemented in Python stdlib, I am going to resist. I need work with *safe data format*, which is "human friendly". And I put *safe format* over *serialization format*. > You specifically suggest mapping YAML to XML so we can treat it as a structured document. From the "Relation to XML" section: "YAML is primarily a data serialization language. XML was? designed to support structured documentation." Where? Oh, do you mean this one: "The ideal output for the first version should be generic tree structure with defined names for YAML elements. The tree that can be represented as XML where these names are tags. " It is not about "structured document", it is about "structured data format". "tree that can be represented as XML" is not "XML tree". XML here is just an example of structured nested format that everybody is aware of. I want to say that this "tree structure" should be plain, and 1:1 mapping to XML is necessary and sufficient requirement. > You suggest that we shouldn't build all of YAML, just some bare-minimum subset that's good enough to get started. JSON is already _more_ than a bare-minimum subset of YAML, so we're already done. I didn't know that JSON is not compatible with YAML. Still I am not sure I understand how your argument of "JSON in not YAML" makes it "done" with minimal implementation of YAML. 
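To make the "just nested mappings and lists" point from above concrete, here is a rough sketch of the kind of plain tree a "yamlish" reader would be expected to produce. The sample config is invented, and PyYAML's existing safe_load is used purely to show the shape of the result (builtin dicts, lists and scalars, with no tags, aliases or object construction) - it is not the proposed implementation:

    import yaml  # PyYAML, used here only for illustration

    text = """
    session_name: demo
    windows:
      - window_name: editor
        panes:
          - shell_command: vim
          - shell_command: htop
    """

    data = yaml.safe_load(text)
    # data == {'session_name': 'demo',
    #          'windows': [{'window_name': 'editor',
    #                       'panes': [{'shell_command': 'vim'},
    #                                 {'shell_command': 'htop'}]}]}

Anything beyond reading and writing back that kind of structure would be out of scope for "yamlish".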
Module name - "yamlish" - defines its purpose as something that my poor language skills can verbalize as "provide support for parsing and writing files in formats, that are subsets of YAML used to store generic user editable, not Python specific declarative data, such as configurations, save files, settings etc.". Because I am not a CS major, I can't describe exactly how to define common things between examples I provided, how these examples are different from usual programming language objects serialized into YAML. I feel that these examples are "yamlish" and I am pretty much appreciate if somebody can come up with proper *definition* of characteristics of the simple data formats (which are still YAML) that give this feeling. Such definition will greatly help to keep it moving in the right direction. > But you'd also like some data-driven way to extend this. YAML has already designed exactly that. Once you have the core schema, you can add new types, and the syntax for those types is data-driven (although the semantics are really only defined in hand-wavy English and probably require code to implement, but I'm not sure how you expect your proposal to be any different, unless you're proposing something like XML Schema). So, either the necessary subset of YAML you want is the entire spec, or you want to do an equal amount of work building something just as complex but not actually YAML. No, it is not data-driven support for extension in "yamlish" format. It is data-driven process of writing parser for "yamlish" - you get one example, parse it, get output, write test, get another, parse it, get output, run previous test. "yamlish" format is only for common, human understandable data files. Perhaps expanding on the idea of "yamlish" format with development process and with details of my "own data transformation theory" was not a good idea, but it was the only chance to find a motivation to write down the stuff. =) Sorry for the overload, and let me clarify things a little. I proposed process for extending support of "yamlish" parser to parse more backward-compatible "yamlish" data formats. There is no mechanism to support conflicting formats, or formats that change the output for existing stuff. That's it. There is no additional API for full YAML, so no complexity involved with maintenance and support of extra features or full YAML speccy. `datatrans` framework I was speaking about is possible implementation of the lib to transform 2D structures between different formats. You know, data transformation process is all the same at some level. On the level above I even can say that everything we do in CS is just data transformation. It is not related to "yamlish" format definition. The only thing that is important that "datatrans" enables many input and many outputs of formats that can be represented in 2D annotated (or generic) tree. It is not related to "yamlish". > The idea of building a useful subset of YAML isn't a bad one. But the way to do that is to go through the features of YAML that JSON doesn't have, and decide which ones you want. For example, YAML with the core schema, but no aliases, no plain strings, and no explicit tags is basically JSON with indented block structure, raw strings, and useful synonyms for key constants (so you can write True instead of true). You could even carefully import a few useful definitions from the type library as long as they're unambiguous (e.g., timestamp). 
That gives you most of the advantages of YAML that don't bring any safety risks, and its output would be interpretable as full YAML, and it might be a little easier to implement than the full spec. But that has very little to do with your proposal. In particular, leaving out the data-driven features of YAML is what makes it safe and simple. Now I feel that we basically thinking about the same things - simplicity and safety. I didn't read the spec, so I don't know what things are in core YAML schema, so here you know much better than I what needs to be filtered out. My thought was using examples to see what should be filtered out, because iterating over spec will bring many more "useful" features that people with forward thinking might want, but which may be harmful for keeping this small and simple. I really like YAML brevity compared to JSON and other structured data formats (tmuxp example page is a good one). Support for indented data format is also natural for indented language. But it is hard to make format right and not to spoil it with overengineering. About safety. I believe that this "data-driven features of YAML" is the point of confusion. I recall that YAML spec provided some declarative mechanism for extensions. It is not it. My data-driven approach is just "don't design anything upfront, use existing widely used data examples as a spec of data that needs to be parsed". And yes - I don't need this YAML extensibility feature, which I too believe makes YAML unsafe. I need YAML as a format of indented data in a text file. Nothing more. YAML without "extra processing" that leads to potential hacks and execution of unwanted code. I just want to make sure that data format is safe. Currently, Python stdlib lacks a safe serialization format - docs are bleeding red of warnings without specifying any alternatives. I like to call it "yamlish", because if it is named YAML, people will demand dynamism, OOPy "constructor/destructor" tricks, and sooner or later the module users will be pwnd, like it happened with other serialization modules before. Therefore I don't want "serialization as a feature", but I don't mind against "serialization as a side effect" if it is compatible with good intuitive API AND improves the speed without sacrificing _clarity_ and safety. _clarity_ here is the understanding that "there is no way that 'yamlsih' format can be unsafe" at all times. > Meanwhile, I think what you actually want is XSLT processors to convert YAML to and from XML. Fortunately, the YAML community is already working on that at http://www.yaml.org/xml. Then you don't need any new Python code at all; just convert your YAML to XML and use whichever XML library (in the stdlib or not), and you're done. XSLT, that declarative turing complete language. I fed up it. Complexity and performance ruin the beautiful theory. I think that turing-completeness is a trap - solving its gestalts gives a good feeling when you learn it, but it has nothing to do with the real world problems. XSLT processors hog memory AND slow at the same time. XSLT debug is impossible, because process is obscure. I guess that it is also easily exploitable to DoS. XSLT? Not anymore, thanks. XML has only one advantage over all other formats - auto-discoverable validation schemas. That's why it is still so popular. FWIW. Right now Python doesn't have any safe native data for structured data - only linked objects and references. 
Some time ago I tried to introduce solution to handling structured data by proposing 2D (two dimensional) terminology with a generic tree as base type. But the post became too complicated, lacking pictures, and I was unable to support the communication. I don't want this idea to find a rest in mailing list archives, so if you know how to write such minimal (and safe) parser (and fast) in Python (and maintainable), please tell me. If additional parser language is inevitable, maybe somebody knows of a comparison site similar that http://todomvc.com/ does for MV* frameworks. From techtonik at gmail.com Wed Nov 13 18:15:51 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 13 Nov 2013 20:15:51 +0300 Subject: [Python-ideas] Stdlib YAML evolution (Was: PEP 426, YAML in the stdlib and implementation discovery) In-Reply-To: <87ip1wau0i.fsf@uwakimon.sk.tsukuba.ac.jp> References: <1702023F-0AAB-4B6A-8FEB-D026693F9431@gnosis.cx> <87ip1wau0i.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Mon, Jun 3, 2013 at 3:59 AM, Stephen J. Turnbull wrote: > David Mertz writes: > > > I would definitely like to have a YAML library--even one with > > restricted function--in the standard library. > > Different use cases (and users) will stick at different restrictions. > This would be endlessly debatable. I think the only restriction that > really makes sense is the load vs. load_unsafe restriction (and that > should be a user decision; the "unsafe" features should be available > to users who want them). Short version of previous letter. "yamlish" is only for simple nested human editable data, such as config files. Format is based on widely popular "organic" examples found on internet and provided in previous letter: http://tmuxp.readthedocs.org/en/latest/examples.html http://code.google.com/p/rietveld/source/browse/app.yaml https://github.com/agschwender/pilbox/blob/master/provisioning/playbook.yml From techtonik at gmail.com Wed Nov 13 18:21:20 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 13 Nov 2013 20:21:20 +0300 Subject: [Python-ideas] Stdlib YAML evolution (Was: PEP 426, YAML in the stdlib and implementation discovery) In-Reply-To: References: <1702023F-0AAB-4B6A-8FEB-D026693F9431@gnosis.cx> <87ip1wau0i.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Mon, Jun 3, 2013 at 5:56 PM, Philipp A. wrote: >> it is PyYAML based, which is not an option for now as I see it. > > > can you please elaborate on why this is the case? did Kirill Simonov say > ?no?? We don't need full YAML spec implementation for package metadata format in distutils (and for other configs too). Therefore.. Full YAML implementation by PyYAML may be good in general, but for this simple case it is huge, unsafe, slow, C-based, which means the parsing logic is not translatable (i.e. with PythonJS) and can not be optimized with future platform-dependent GPU and CPU cache level 2 tweakers of PyPy JIT. -- anatoly t. From p.f.moore at gmail.com Wed Nov 13 19:47:03 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 13 Nov 2013 18:47:03 +0000 Subject: [Python-ideas] Stdlib YAML evolution (Was: PEP 426, YAML in the stdlib and implementation discovery) In-Reply-To: References: <1702023F-0AAB-4B6A-8FEB-D026693F9431@gnosis.cx> <87ip1wau0i.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 13 November 2013 17:15, anatoly techtonik wrote: > On Mon, Jun 3, 2013 at 3:59 AM, Stephen J. 
Turnbull wrote: >> David Mertz writes: >> >> > I would definitely like to have a YAML library--even one with >> > restricted function--in the standard library. >> >> Different use cases (and users) will stick at different restrictions. >> This would be endlessly debatable. I think the only restriction that >> really makes sense is the load vs. load_unsafe restriction (and that >> should be a user decision; the "unsafe" features should be available >> to users who want them). > > Short version of previous letter. "yamlish" is only for simple nested > human editable data, such as config files. Format is based on widely > popular "organic" examples found on internet and provided in previous > letter: 1. Inventing a new data format (your "yamlish" format) is probably a bad idea. There are enough already. 2. Putting support for a newly designed format directly into the stdlib is *definitely* a bad idea. Write a module, put it on PyPI, If it's useful, people will use it. They will help you to iron out the design of the new format - it may evolve into "full" YAML or into JSON, in which case you've learned something about why those formats made the compromises they did, or it will evolve into a popular new format, at which point it might be worth proposing that the module is ready to be included in the stdlib. Or it will not be sufficiently popular, in which case you have at least solved your personal problem. If you are expecting someone else to do this, I think the general message from this thread is that nobody else is interested enough to take this on, so it isn't going to happen, sorry. Paul From techtonik at gmail.com Wed Nov 13 19:58:10 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 13 Nov 2013 21:58:10 +0300 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Wed, Nov 13, 2013 at 6:12 AM, Stephen J. Turnbull wrote: > anatoly techtonik writes: > > > the User eXperience - don't you think that fast-typing "print var" is more > > convenient than "print(var)"? > > Sure, but I don't type either: I typically type "prv". I presume this is some kind of IDE you're running. I can't install and use IDEs in every shell / os / embed system / translated language where I have to debug my Python code. > Why complain about Python syntax when upgrading development tools > could gives the same improvements without complicating Python? This wisdom is absent from Python install instruction, so until it is there, we may discuss alternatives. So, what's wrong with supporting both: 1. print xxx - as a statement (which maps to function call) 2. print(xxx) - as an expression (which is a function call) ? From elazarg at gmail.com Wed Nov 13 20:22:02 2013 From: elazarg at gmail.com (=?UTF-8?B?15DXnNei15bXqA==?=) Date: Wed, 13 Nov 2013 21:22:02 +0200 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: 2013/11/13 anatoly techtonik : > So, what's wrong with supporting both: > 1. print xxx - as a statement (which maps to function call) > 2. print(xxx) - as an expression (which is a function call) > Python2: >>> print (1,2) (1, 2) Python3: >>> print (1,2) 1 2 What is your suggestion? Repeat Ruby's failure to make function call whitespace-insensitive? 
>>> print(1,2) 1 2 >>> print (1,2) (1,2) That's horrible. Besides, how do you want code like this one to behave? >>> print = log >>> print xxx Or this: >>> import builtins >>> builtins.print = log >>> print xxx I think it simply not worth it. Elazar From techtonik at gmail.com Wed Nov 13 20:34:34 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 13 Nov 2013 22:34:34 +0300 Subject: [Python-ideas] Stdlib YAML evolution (Was: PEP 426, YAML in the stdlib and implementation discovery) In-Reply-To: References: <1702023F-0AAB-4B6A-8FEB-D026693F9431@gnosis.cx> <87ip1wau0i.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Wed, Nov 13, 2013 at 9:47 PM, Paul Moore wrote: > On 13 November 2013 17:15, anatoly techtonik wrote: >> On Mon, Jun 3, 2013 at 3:59 AM, Stephen J. Turnbull wrote: >>> David Mertz writes: >>> >>> > I would definitely like to have a YAML library--even one with >>> > restricted function--in the standard library. >>> >>> Different use cases (and users) will stick at different restrictions. >>> This would be endlessly debatable. I think the only restriction that >>> really makes sense is the load vs. load_unsafe restriction (and that >>> should be a user decision; the "unsafe" features should be available >>> to users who want them). >> >> Short version of previous letter. "yamlish" is only for simple nested >> human editable data, such as config files. Format is based on widely >> popular "organic" examples found on internet and provided in previous >> letter: > > 1. Inventing a new data format (your "yamlish" format) is probably a > bad idea. There are enough already. > 2. Putting support for a newly designed format directly into the > stdlib is *definitely* a bad idea. It is not a new format. It is YAML subset, limited but fully readable and parseable YAML. If you read it with YAML then save back immediately, you will get the same result. > Write a module, put it on PyPI, If it's useful, people will use it. > They will help you to iron out the design of the new format - it may > evolve into "full" YAML or into JSON, in which case you've learned > something about why those formats made the compromises they did, or it > will evolve into a popular new format, at which point it might be > worth proposing that the module is ready to be included in the stdlib. > Or it will not be sufficiently popular, in which case you have at > least solved your personal problem. Can you more thoroughly criticize the idea, instead of sending me somewhere where I will definitely fail. I've written a few parsers in my life, all manual - I don't have much experience with flex/yacc kind of things, that's why I asked if anybody known a good framework for such stuff. That's why I asked if there is a comparison site similar like http://todomvc.com/ for MV* frameworks, but for Python parser frameworks. Yes/No/Don't Care/Ignore. I don't believe that that among python-ideas subscribers there are no people with experience in different parser frameworks. It is also a good Google Code-In, GSoC project. > If you are expecting someone else to do this, I think the general > message from this thread is that nobody else is interested enough to > take this on, so it isn't going to happen, sorry. The python ideas list is useless without people looking for an exercise and that's worthy implementing regardless of who gave the idea. Good idea without implementation is the same zero as bad idea with. Communicating and defending ideas alone is time-consuming, hard and thankless process already. 
Even if you come up with implementation for something like hexdump, it will likely be rewritten without any credits if you won't accept the hecking CLA, which nobody cares to explain even on python-legal. From antony.lee at berkeley.edu Wed Nov 13 20:35:39 2013 From: antony.lee at berkeley.edu (Antony Lee) Date: Wed, 13 Nov 2013 11:35:39 -0800 Subject: [Python-ideas] Issues with inspect.Parameter In-Reply-To: References: Message-ID: Here we go: http://bugs.python.org/issue19573 2013/11/12 Antony Lee > I have a (fairly trivial...) patch to fix both of these issues. Should I > post it here or submit it to bugs.python.org? > Antony > > > 2013/11/10 Antony Lee > >> The docstring of inspect.Parameter indicates the "default" and >> "annotation" attributes are not set if the parameter does not have, >> respectively, a default value and an annotation, and that the "kind" >> attribute is a string. >> But in fact, the "default" and "annotation" attributes are set to >> "inspect._empty (== Parameter.empty)" in that case, and the "kind" >> attribute has type "_ParameterKind" (essentially a hand-written equivalent >> of IntEnum). I suggest to correct the docstring accordingly, and to >> replace the implementation of _ParameterKind by a proper IntEnum (if full >> backwards compatibility is required), or even just by Enum (which makes a >> bit more sense, as the fact that _ParameterKind is a subclass of int >> doesn't seem to be documented anywhere). >> Antony >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Wed Nov 13 20:55:27 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 13 Nov 2013 22:55:27 +0300 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Wed, Nov 13, 2013 at 10:22 PM, ????? wrote: > 2013/11/13 anatoly techtonik : >> So, what's wrong with supporting both: >> 1. print xxx - as a statement (which maps to function call) >> 2. print(xxx) - as an expression (which is a function call) >> > > Python2: >>>> print (1,2) > (1, 2) > > Python3: >>>> print (1,2) > 1 2 That's horrible. Ok, I agree - this is the reason why it might be impossible to make both statement and a function. But between two ambiguities I chose that the function syntax takes precedence, because "print xxx" is a limited helper for fast-typing, with well-known limitations. More arguments - I doubt people often print tuples without formatting, and even if they do this, it is the code that they usually read immediately and can spot the mistake, so it doesn't hurt. It is not beautiful to have such ambiguity, and may not be recommended, and still "print xxx" is a nice feature to have. > What is your suggestion? Repeat Ruby's failure to make function call > whitespace-insensitive? I don't know Ruby. Where I can read more about this Ruby's fail? >>>> print(1,2) > 1 2 >>>> print (1,2) > (1,2) > > That's horrible. They should both map to print as expression, meaning print as a function, i.e. to the first one. > Besides, how do you want code like this one to behave? > >>>> print = log >>>> print xxx "print xxx" translates to simplified print() call on the language level. > Or this: > >>>> import builtins >>>> builtins.print = log >>>> print xxx The same. When you get AST, it is already print call, or the statement can be translated to expression just after AST is received. 
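For reference, a small check of what the parser produces today (this is only an illustration of the claim above; the hypothetical "print xxx" form itself is not parseable in Python 3, so only the call form is shown):

    import ast

    tree = ast.parse("print(x)")
    call = tree.body[0].value
    # In Python 3, print() is an ordinary function call in the AST: a Call
    # node whose func is the Name 'print'. There is no dedicated print node,
    # so any statement-style shorthand would have to be rewritten into
    # exactly this shape before or while the AST is built.
    assert isinstance(call, ast.Call)
    assert isinstance(call.func, ast.Name) and call.func.id == "print"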
From breamoreboy at yahoo.co.uk Wed Nov 13 21:12:33 2013 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Wed, 13 Nov 2013 20:12:33 +0000 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 13/11/2013 18:58, anatoly techtonik wrote: > On Wed, Nov 13, 2013 at 6:12 AM, Stephen J. Turnbull wrote: >> anatoly techtonik writes: >> >> > the User eXperience - don't you think that fast-typing "print var" is more >> > convenient than "print(var)"? >> >> Sure, but I don't type either: I typically type "prv". > > I presume this is some kind of IDE you're running. I can't install and use > IDEs in every shell / os / embed system / translated language where I have > to debug my Python code. > >> Why complain about Python syntax when upgrading development tools >> could gives the same improvements without complicating Python? > > This wisdom is absent from Python install instruction, so until it is there, > we may discuss alternatives. > > > So, what's wrong with supporting both: > 1. print xxx - as a statement (which maps to function call) > 2. print(xxx) - as an expression (which is a function call) > > ? > Nothing. All that's needed is for somebody to write and champion a PEP, write the code, unit tests and documentation and away we go. Are you volunteering? -- Python is the second best programming language in the world. But the best has yet to be invented. Christian Tismer Mark Lawrence From ericsnowcurrently at gmail.com Wed Nov 13 21:45:43 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 13 Nov 2013 13:45:43 -0700 Subject: [Python-ideas] Stdlib YAML evolution (Was: PEP 426, YAML in the stdlib and implementation discovery) In-Reply-To: References: <1702023F-0AAB-4B6A-8FEB-D026693F9431@gnosis.cx> <87ip1wau0i.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Wed, Nov 13, 2013 at 12:34 PM, anatoly techtonik wrote: > I've written a few parsers in my life, all manual - I don't have much > experience with flex/yacc kind of things, that's why I asked if anybody > known a good framework for such stuff. The wiki has a decent listing of parsing libraries, though I don't know how up to date it is. https://wiki.python.org/moin/LanguageParsing FWIW, I like the idea of a library for a safe subset of YAML. Paul is right that it should live on the cheeseshop a while. You might see if anyone on python-list has interest in collaborating with you on such a project. -eric p.s. I appreciate your follow-up email. Your original proposal was long and hard to follow. The idea I like here was lost in there, as evidenced by some of the responses you got. It was much clearer today. Also, those links you gave are nice concrete examples. I'd recommend sticking to that formula of a brief, focused proposal supported by examples. The examples help to mitigate the communication breakdowns. From elazarg at gmail.com Wed Nov 13 21:44:35 2013 From: elazarg at gmail.com (=?UTF-8?B?15DXnNei15bXqA==?=) Date: Wed, 13 Nov 2013 22:44:35 +0200 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: 2013/11/13 anatoly techtonik : > I don't know Ruby. Where I can read more about this Ruby's fail? 
The Ruby Programming Language, 2.1.6.1: "spaces and method invocations": """ Ruby's grammar allows the parentheses around method invocations to be omitted in certain circumstances. This allows Ruby methods to be used as if they were statements, which is an important part of Ruby's elegance. Unfortunately, however, it opens up a pernicious whitespace dependency. Consider the following two lines, which differ only by a single space: f(3+2)+1 f (3+2)+1 The first line passes the value 5 to the function f and then adds 1 to the result. Since the second line has a space after the function name, Ruby assumes that the parentheses around the method call have been omitted. The parentheses that appear after the space are used to group a subexpression, but the entire expression (3+2)+1 is used as the method argument. If warnings are enabled (with -w), Ruby issues a warning whenever it sees ambiguous code like this. The solution to this whitespace dependency is straightforward: * Never put a space between a method name and the opening parenthesis. * If the first argument to a method begins with an open parenthesis, always use parentheses in the method invocation. For example, write f((3+2)+1). * Always run the Ruby interpreter with the -w option so it will warn you if you forget either of the rules above! """ I think avoiding this problem is better than dodging it and having a special warning, even if print does not return a meaningful value in the first place. > >>>>> print(1,2) >> 1 2 >>>>> print (1,2) >> (1,2) >> >> That's horrible. > > They should both map to print as expression, meaning print as a > function, i.e. to the first one. So you want Python to behave like this: >>> xxx = (1,2) >>> print xxx.count(1) 1 >>> print (1,2).count(1) # or (1).real, since 1.real is an error Traceback (most recent call last): File "<stdin>", line 1, in <module> AttributeError: 'NoneType' object has no attribute 'count' That's pretty inconsistent and will surprise beginners. My own feeling is that this feature is the kind of thing that can make a person leave the language, or never begin using it in the first place; things like this are part of the reason I don't use Ruby. And all this to save keystrokes? If it was part of a bigger feature, like ML's curried functions syntax, it would have been great - things like: perr = print sys.stderr perr "Bad command or file name" But keystrokes just can't be the reason for introducing such inconsistencies. Elazar
From breamoreboy at yahoo.co.uk Wed Nov 13 22:15:41 2013 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Wed, 13 Nov 2013 21:15:41 +0000 Subject: [Python-ideas] Stdlib YAML evolution (Was: PEP 426, YAML in the stdlib and implementation discovery) In-Reply-To: References: <1702023F-0AAB-4B6A-8FEB-D026693F9431@gnosis.cx> <87ip1wau0i.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 13/11/2013 20:45, Eric Snow wrote: > On Wed, Nov 13, 2013 at 12:34 PM, anatoly techtonik wrote: >> I've written a few parsers in my life, all manual - I don't have much >> experience with flex/yacc kind of things, that's why I asked if anybody >> known a good framework for such stuff. > > The wiki has a decent listing of parsing libraries, though I don't > know how up to date it is. > > https://wiki.python.org/moin/LanguageParsing > There's also Ned Batchelder's site http://nedbatchelder.com/text/python-parsers.html last updated 29th December 2012. -- Python is the second best programming language in the world. But the best has yet to be invented.
Christian Tismer Mark Lawrence From steve at pearwood.info Thu Nov 14 02:33:25 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 14 Nov 2013 12:33:25 +1100 Subject: [Python-ideas] Stdlib YAML evolution (Was: PEP 426, YAML in the stdlib and implementation discovery) In-Reply-To: References: <1702023F-0AAB-4B6A-8FEB-D026693F9431@gnosis.cx> <87ip1wau0i.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20131114013325.GJ2085@ando> On Wed, Nov 13, 2013 at 10:34:34PM +0300, anatoly techtonik wrote: > I've written a few parsers in my life, all manual - I don't have much > experience with flex/yacc kind of things, that's why I asked if anybody > known a good framework for such stuff. Try PyParsing. -- Steven From ethan at stoneleaf.us Thu Nov 14 02:35:34 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 13 Nov 2013 17:35:34 -0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <528428E6.6090709@stoneleaf.us> On 11/13/2013 12:44 PM, ????? wrote: > > The Ruby Programming Language, 2.1.6.1: "spaces and method invocations": > """ > Ruby's grammar allows the parentheses around method invocations to be > omitted in certain circumstances. > This allows Ruby methods to be used as if they were statements, which > is an important part of Ruby's elegance. Unfortunately, however, it > opens up a pernicious whitespace dependency. Thanks for the example. Personally, I wouldn't use the word elegance along with the phrase "opens up a pernicious ..." -- ~Ethan~ From abarnert at yahoo.com Thu Nov 14 04:13:19 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 13 Nov 2013 19:13:19 -0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Nov 13, 2013, at 12:44, ????? wrote: > If it was part of a bigger feature, like > ML's curried functions syntax, it would have been great - things like: > > perr = print sys.stderr > perr "Bad command or file name" I know this is getting way off topic, but the real problem with doing curried/auto-partial functions in Python isn't the parens, it's the variable arguments. An auto-partial function has to accumulate parameters if it doesn't get enough, execute when it does. (Currying gives you that for free, because it means you only get one argument at a time, but you can do auto-partials without currying.) With print, how do you know when it has "enough" arguments? You can write a decorator that only works on functions with fixed parameter counts pretty easily: @autopartial def spam(word, n): for _ in range(n): print(word) eggs = spam('eggs') eggs(3) But to handle a vararg function, you'd need a separate syntax for partializing vs. calling. Using a different operator like [] or % or << seems attractive at first, but it can't handle keywords. You could add a method, so spam._(n=5) returns partial(spam, n=5), but ._ is hideous, and anything meaningful like bind or partial is no longer a shortcut. You could use a special argument value, and ... looks perfect, especially as args[-1]: spam('eggs', ...). Until you consider keywords args, which come after args[-1]. So the best you can do is args[0]: spam(..., 'eggs', n=3). That isn't terrible, but I'm not sure it's nice enough to be worth the cost of people not understanding what it's doing. 
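As a rough sketch of the fixed-arity decorator described above (an illustration only, assuming plain positional parameters and no *args/**kwargs; autopartial is not an existing library function):

    import functools
    import inspect

    def autopartial(func, _bound=()):
        arity = len(inspect.signature(func).parameters)

        @functools.wraps(func)
        def wrapper(*args):
            collected = _bound + args
            if len(collected) >= arity:
                return func(*collected)            # enough arguments: call through
            return autopartial(func, collected)    # too few: keep accumulating
        return wrapper

    @autopartial
    def spam(word, n):
        for _ in range(n):
            print(word)

    eggs = spam('eggs')   # not enough arguments yet, so this is a partial
    eggs(3)               # now complete: prints 'eggs' three times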
From haoyi.sg at gmail.com Thu Nov 14 05:17:33 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Wed, 13 Nov 2013 20:17:33 -0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: > But to handle a vararg function, you'd need a separate syntax for partializing vs. calling. I personally like the _ notation used by scala; with macrosyou could easily write something like: f[spam(_, 5)] f[spam(_, n=_)] Which desugars into lambda x: spam(x, 5) lambda x, y: spam(x, n=y) If you were willing to special-case further, you could simply have spam(_, 5) spam(_, n=_) be representative of the partial-application. Granted _ is already used for other things (e.g. i18n) but that's a solvable problem (make i18n use __, let partial application use $). On Wed, Nov 13, 2013 at 7:13 PM, Andrew Barnert wrote: > On Nov 13, 2013, at 12:44, ????? wrote: > > > If it was part of a bigger feature, like > > ML's curried functions syntax, it would have been great - things like: > > > > perr = print sys.stderr > > perr "Bad command or file name" > > I know this is getting way off topic, but the real problem with doing > curried/auto-partial functions in Python isn't the parens, it's the > variable arguments. An auto-partial function has to accumulate parameters > if it doesn't get enough, execute when it does. (Currying gives you that > for free, because it means you only get one argument at a time, but you can > do auto-partials without currying.) > > With print, how do you know when it has "enough" arguments? > > You can write a decorator that only works on functions with fixed > parameter counts pretty easily: > > @autopartial > def spam(word, n): > for _ in range(n): > print(word) > > eggs = spam('eggs') > eggs(3) > > But to handle a vararg function, you'd need a separate syntax for > partializing vs. calling. > > Using a different operator like [] or % or << seems attractive at first, > but it can't handle keywords. > > You could add a method, so spam._(n=5) returns partial(spam, n=5), but ._ > is hideous, and anything meaningful like bind or partial is no longer a > shortcut. > > You could use a special argument value, and ... looks perfect, especially > as args[-1]: spam('eggs', ...). Until you consider keywords args, which > come after args[-1]. So the best you can do is args[0]: spam(..., 'eggs', > n=3). That isn't terrible, but I'm not sure it's nice enough to be worth > the cost of people not understanding what it's doing. > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Thu Nov 14 06:28:52 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 13 Nov 2013 21:28:52 -0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: <656BD68A-59B5-431B-B68A-41FE4F1027B4@yahoo.com> References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> <656BD68A-59B5-431B-B68A-41FE4F1027B4@yahoo.com> Message-ID: Please disregard that incomplete message. Correct version is coming... 
Sent from a random iPhone On Nov 13, 2013, at 21:28, Andrew Barnert wrote: > On Nov 13, 2013, at 20:17, Haoyi Li wrote: > >> > But to handle a vararg function, you'd need a separate syntax for partializing vs. calling. >> >> I personally like the _ notation used by scala; with macros you could easily write something like: >> >> f[spam(_, 5)] >> f[spam(_, n=_)] >> >> Which desugars into >> >> lambda x: spam(x, 5) >> lambda x, y: spam(x, n=y) > > Actually, you can get about 80% of the way without macros. I've got an expression template library (that I never finished) that lets you write things like this: > > _2 ** (1 - _1) > > Where _1 and _2 are clever objects that turn this into an equivalent of: > > lambda x, y: y ** (1 - x) > > (Although it's actually a chain of calls, one for each operator.) > > The big missing part is handling function calls. It's easy when the function is one of your magic lambda args: _1(0) becomes lambda f: f(0). But when the function is a normal function, there's no __rcall__ to override, so you have to write it like this: > > _(f)(_1) > > And you need similar tricks whenever you have an expression where neither argument is magic but you want the value delayed. For example, to get lambda: a+b you need to write _(a) + b. > > And there's also the fact that not every operator in Python is overloadable. > > Anyway, I like the explicitly-numbered arguments better than the single _, because it allows you to use the same argument multiple times in the expression. (Plus, I already used plain _ as the wrapper to turn a normal value into a magic delayed value.) But I'll bet you could design it so _ works like in Scala, but _1 through _9 work my way (much as format strings can take both {} and {0}). > > As for why they're 1-based instead of 0-based, I don't remember; I suspect the only explanation is that I'm an idiot. > > Anyway, the big problem with the _ delay function is that sometimes you want a value closure, and sometimes a name, and the only way I could think of to handle both is to spell the latter as _('x'), which is ugly, and seems to encourage dynamic strings as arguments (which work, but it's as bad an idea as using globals()['x'] or eval('x') in normal code), and worst of all it looks exactly like gettext, which is a great way to confuse people. > > I later realized that with a bit of frame hacking, I could write a separate function _n(x). Tearing everything apart to make that work is where I ran out of steam and ended up with an incomplete library. > >> If you were willing to special-case further, you could simply have >> >> spam(_, 5) >> spam(_, n=_) >> >> be representative of the partial-application. Granted _ is already used for other things (e.g. i18n) but that's a solvable problem (make i18n use __, let partial application use $). >> >> >> >> >> On Wed, Nov 13, 2013 at 7:13 PM, Andrew Barnert wrote: >>> On Nov 13, 2013, at 12:44, ????? wrote: >>> >>> > If it was part of a bigger feature, like >>> > ML's curried functions syntax, it would have been great - things like: >>> > >>> > perr = print sys.stderr >>> > perr "Bad command or file name" >>> >>> I know this is getting way off topic, but the real problem with doing curried/auto-partial functions in Python isn't the parens, it's the variable arguments. An auto-partial function has to accumulate parameters if it doesn't get enough, execute when it does. (Currying gives you that for free, because it means you only get one argument at a time, but you can do auto-partials without currying.) 
>>> >>> With print, how do you know when it has "enough" arguments? >>> >>> You can write a decorator that only works on functions with fixed parameter counts pretty easily: >>> >>> @autopartial >>> def spam(word, n): >>> for _ in range(n): >>> print(word) >>> >>> eggs = spam('eggs') >>> eggs(3) >>> >>> But to handle a vararg function, you'd need a separate syntax for partializing vs. calling. >>> >>> Using a different operator like [] or % or << seems attractive at first, but it can't handle keywords. >>> >>> You could add a method, so spam._(n=5) returns partial(spam, n=5), but ._ is hideous, and anything meaningful like bind or partial is no longer a shortcut. >>> >>> You could use a special argument value, and ... looks perfect, especially as args[-1]: spam('eggs', ...). Until you consider keywords args, which come after args[-1]. So the best you can do is args[0]: spam(..., 'eggs', n=3). That isn't terrible, but I'm not sure it's nice enough to be worth the cost of people not understanding what it's doing. >>> >>> >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> https://mail.python.org/mailman/listinfo/python-ideas >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Thu Nov 14 06:28:22 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 13 Nov 2013 21:28:22 -0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <656BD68A-59B5-431B-B68A-41FE4F1027B4@yahoo.com> On Nov 13, 2013, at 20:17, Haoyi Li wrote: > > But to handle a vararg function, you'd need a separate syntax for partializing vs. calling. > > I personally like the _ notation used by scala; with macros you could easily write something like: > > f[spam(_, 5)] > f[spam(_, n=_)] > > Which desugars into > > lambda x: spam(x, 5) > lambda x, y: spam(x, n=y) Actually, you can get about 80% of the way without macros. I've got an expression template library (that I never finished) that lets you write things like this: _2 ** (1 - _1) Where _1 and _2 are clever objects that turn this into an equivalent of: lambda x, y: y ** (1 - x) (Although it's actually a chain of calls, one for each operator.) The big missing part is handling function calls. It's easy when the function is one of your magic lambda args: _1(0) becomes lambda f: f(0). But when the function is a normal function, there's no __rcall__ to override, so you have to write it like this: _(f)(_1) And you need similar tricks whenever you have an expression where neither argument is magic but you want the value delayed. For example, to get lambda: a+b you need to write _(a) + b. And there's also the fact that not every operator in Python is overloadable. Anyway, I like the explicitly-numbered arguments better than the single _, because it allows you to use the same argument multiple times in the expression. (Plus, I already used plain _ as the wrapper to turn a normal value into a magic delayed value.) But I'll bet you could design it so _ works like in Scala, but _1 through _9 work my way (much as format strings can take both {} and {0}). As for why they're 1-based instead of 0-based, I don't remember; I suspect the only explanation is that I'm an idiot. 
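For concreteness, a stripped-down sketch of placeholder objects of this kind (written here purely as an illustration, not the library being described): each overloaded operator builds a new callable of the original positional arguments.

    class Arg:
        # wraps a function from the eventual argument tuple to a value
        def __init__(self, fn):
            self.fn = fn
        def __call__(self, *args):
            return self.fn(args)
        def _combine(self, other, op):
            get = other.fn if isinstance(other, Arg) else (lambda args, v=other: v)
            return Arg(lambda args: op(self.fn(args), get(args)))
        def __add__(self, other):
            return self._combine(other, lambda a, b: a + b)
        def __pow__(self, other):
            return self._combine(other, lambda a, b: a ** b)
        def __rsub__(self, other):
            return Arg(lambda args: other - self.fn(args))

    _1 = Arg(lambda args: args[0])
    _2 = Arg(lambda args: args[1])

    expr = _2 ** (1 - _1)        # builds a callable; nothing is evaluated yet
    assert expr(0.5, 3) == 3 ** (1 - 0.5)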
Anyway, the big problem with the _ delay function is that sometimes you want a value closure, and sometimes a name, and the only way I could think of to handle both is to spell the latter as _('x'), which is ugly, and seems to encourage dynamic strings as arguments (which work, but it's as bad an idea as using globals()['x'] or eval('x') in normal code), and worst of all it looks exactly like gettext, which is a great way to confuse people. I later realized that with a bit of frame hacking, I could write a separate function _n(x). Tearing everything apart to make that work is where I ran out of steam and ended up with an incomplete library. > If you were willing to special-case further, you could simply have > > spam(_, 5) > spam(_, n=_) > > be representative of the partial-application. Granted _ is already used for other things (e.g. i18n) but that's a solvable problem (make i18n use __, let partial application use $). > > > > > On Wed, Nov 13, 2013 at 7:13 PM, Andrew Barnert wrote: >> On Nov 13, 2013, at 12:44, ????? wrote: >> >> > If it was part of a bigger feature, like >> > ML's curried functions syntax, it would have been great - things like: >> > >> > perr = print sys.stderr >> > perr "Bad command or file name" >> >> I know this is getting way off topic, but the real problem with doing curried/auto-partial functions in Python isn't the parens, it's the variable arguments. An auto-partial function has to accumulate parameters if it doesn't get enough, execute when it does. (Currying gives you that for free, because it means you only get one argument at a time, but you can do auto-partials without currying.) >> >> With print, how do you know when it has "enough" arguments? >> >> You can write a decorator that only works on functions with fixed parameter counts pretty easily: >> >> @autopartial >> def spam(word, n): >> for _ in range(n): >> print(word) >> >> eggs = spam('eggs') >> eggs(3) >> >> But to handle a vararg function, you'd need a separate syntax for partializing vs. calling. >> >> Using a different operator like [] or % or << seems attractive at first, but it can't handle keywords. >> >> You could add a method, so spam._(n=5) returns partial(spam, n=5), but ._ is hideous, and anything meaningful like bind or partial is no longer a shortcut. >> >> You could use a special argument value, and ... looks perfect, especially as args[-1]: spam('eggs', ...). Until you consider keywords args, which come after args[-1]. So the best you can do is args[0]: spam(..., 'eggs', n=3). That isn't terrible, but I'm not sure it's nice enough to be worth the cost of people not understanding what it's doing. >> >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Thu Nov 14 10:09:17 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 14 Nov 2013 10:09:17 +0100 Subject: [Python-ideas] where statement in Pyret References: <527F5804.9010605@ziade.org> Message-ID: <20131114100917.6ba49590@fsol> On Mon, 11 Nov 2013 07:57:21 +1000 Nick Coghlan wrote: > > > unittest-like structuration is really what works best for most testing > situations, IMO. Alternative testing schemes for "easier" or "more > intuitive" testing have generally failed as general-purpose tools. 
> > Agreed, but the spelling of test assertions as methods on test cases isn't > an essential part of that structure. That's true, OTOH I disagree that assertions as methods is any kind of hindrance to easy testing. Actually, that paradigm makes it trivial to define your own assertions in a way that makes them look like the built-in ones. Regards Antoine. From solipsis at pitrou.net Thu Nov 14 10:12:02 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Thu, 14 Nov 2013 10:12:02 +0100 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 References: Message-ID: <20131114101202.201b7cc5@fsol> On Tue, 12 Nov 2013 11:45:30 +0800 Xuancong Wang wrote: > Hi python developers, > > I notice that one major change in python 3 is that it makes 'print' as a > standard function, and it will require typing (). I do understand that it > makes python language more consistent because most of the python > functionalities are implemented as function calls. > > As you know, reading from and writing to IO is a high frequency operation. > By entropy coding theorem (e.g. Huffman coding), an efficient language > should assign shorter language code to more frequent tasks. Typing a '(' > requires holding SHIFT and pressing 9, the input effort is much higher than > that in Python 2. Also, specifying IO has changed from >>* to file=*, which > also becomes more inconvenient. As a reminder, having to hold SHIFT to enter a parenthesis depends on your keyboard layout. Not everyone uses an American keyboard. Regards Antoine. From abarnert at yahoo.com Thu Nov 14 11:45:12 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 14 Nov 2013 02:45:12 -0800 (PST) Subject: [Python-ideas] Quick Lambdas and Partials (Re: A suggestion for Python 3 vs Python 2) In-Reply-To: <1384409368.98498.YahooMailNeo@web184706.mail.ne1.yahoo.com> References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> <656BD68A-59B5-431B-B68A-41FE4F1027B4@yahoo.com> <1384409368.98498.YahooMailNeo@web184706.mail.ne1.yahoo.com> Message-ID: <1384425912.12726.YahooMailNeo@web184701.mail.ne1.yahoo.com> I think the discussions on expression-template quick lambdas and auto-partialing functions are both way off topic for the thread about changing Python syntax to allow function calls without parens (or just print, or whatever).?And I don't think they really belong on python-ideas at all (unless someone has a suggestion for language or stdlib support for either, which seems unlikely). Since I started both ideas, the derailing is entirely my fault. Anyway, if anyone's interested, I slapped together a quick implementation of each idea.?https://github.com/abarnert/quicklambda is incomplete, but playable-with;?https://github.com/abarnert/quickpartial is not working at all (in fact, I haven't even pushed a commit yet). >> On Nov 13, 2013, at 20:17, Haoyi Li wrote: >>> I personally like the _ notation used by scala; with macros you could >>> easily write something like: >>> >>> f[spam(_, 5)]? >>> f[spam(_, n=_)] ? 
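For readers who want the flavour of the auto-partial half without digging through those repositories, here is an independent sketch (an assumption-laden illustration, not the quickpartial API, which is described above as unfinished): argument positions marked with a placeholder are filled, left to right, by the arguments of the later call.

    import functools

    _ = object()   # the placeholder sentinel

    def holepartial(func, *args, **kwargs):
        @functools.wraps(func)
        def call(*later, **later_kw):
            pending = list(later)
            # fill each hole with the next later argument; append the rest
            filled = [pending.pop(0) if a is _ else a for a in args]
            merged = dict(kwargs)
            merged.update(later_kw)
            return func(*(filled + pending), **merged)
        return call

    def spam(a, b, c):
        return (a, b, c)

    f = holepartial(spam, _, 5)        # roughly the spirit of spam(_, 5)
    assert f(1, 9) == (1, 5, 9)
    assert holepartial(spam, _, 5, _)(1, 9) == (1, 5, 9)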
From graffatcolmingov at gmail.com Thu Nov 14 13:37:19 2013 From: graffatcolmingov at gmail.com (Ian Cordasco) Date: Thu, 14 Nov 2013 06:37:19 -0600 Subject: [Python-ideas] Quick Lambdas and Partials (Re: A suggestion for Python 3 vs Python 2) In-Reply-To: <1384425912.12726.YahooMailNeo@web184701.mail.ne1.yahoo.com> References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> <656BD68A-59B5-431B-B68A-41FE4F1027B4@yahoo.com> <1384409368.98498.YahooMailNeo@web184706.mail.ne1.yahoo.com> <1384425912.12726.YahooMailNeo@web184701.mail.ne1.yahoo.com> Message-ID: On a related note, I've been intrigued by Haskell's idea of currying all functions, and have an implementation that works for pure functions but not for callable classes or methods over on https://github.com/sigmavirus24/curryer. On Thu, Nov 14, 2013 at 4:45 AM, Andrew Barnert wrote: > I think the discussions on expression-template quick lambdas and auto-partialing functions are both way off topic for the thread about changing Python syntax to allow function calls without parens (or just print, or whatever). And I don't think they really belong on python-ideas at all (unless someone has a suggestion for language or stdlib support for either, which seems unlikely). > > Since I started both ideas, the derailing is entirely my fault. > > Anyway, if anyone's interested, I slapped together a quick implementation of each idea. https://github.com/abarnert/quicklambda is incomplete, but playable-with; https://github.com/abarnert/quickpartial is not working at all (in fact, I haven't even pushed a commit yet). > >>> On Nov 13, 2013, at 20:17, Haoyi Li wrote: > > >>>> I personally like the _ notation used by scala; with macros you could >>>> easily write something like: >>>> >>>> f[spam(_, 5)] >>>> f[spam(_, n=_)] > > ? > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas From steve at pearwood.info Thu Nov 14 16:05:21 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 15 Nov 2013 02:05:21 +1100 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <20131114150520.GK2085@ando> On Wed, Nov 13, 2013 at 07:13:19PM -0800, Andrew Barnert wrote: > I know this is getting way off topic, but the real problem with doing > curried/auto-partial functions in Python isn't the parens, it's the > variable arguments. An auto-partial function has to accumulate > parameters if it doesn't get enough, execute when it does. (Currying > gives you that for free, because it means you only get one argument at > a time, but you can do auto-partials without currying.) I disagree. The real problem is the ambiguity. Given: y = divmod(x) did I intend for y to be a partial function, or did I mean to type divmod(x, 3) but mess up? Given how few coders come from a functional programming background, my money is that it's an error. In languages with static typing, it's easy for the compiler to resolve this: if y is declared as a function object, then I meant for the partial application. If y is declared as an int, then it's an error. But you can't do this in Python. [...] > But to handle a vararg function, you'd need a separate syntax for > partializing vs. calling. We have that. 
It's called functools.partial :-) Aside: am I the only one who wishes there was a functools.rpartial, that binds from the right instead of the left? > Using a different operator like [] or % or << seems attractive at > first, but it can't handle keywords. > > You could add a method, so spam._(n=5) returns partial(spam, n=5), but > ._ is hideous, and anything meaningful like bind or partial is no > longer a shortcut. And why is this a problem? You read code more often than you type it, and quite frankly, creating partial applications of functions shouldn't be so common in real-world code that it needs to be so concise. > You could use a special argument value, and ... looks perfect, > especially as args[-1]: spam('eggs', ...). You and I have very different ideas about perfection. -- Steven From masklinn at masklinn.net Thu Nov 14 16:30:45 2013 From: masklinn at masklinn.net (Masklinn) Date: Thu, 14 Nov 2013 16:30:45 +0100 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: <20131114150520.GK2085@ando> References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> <20131114150520.GK2085@ando> Message-ID: <27903D3C-21DC-4211-8F80-82C319E61FAD@masklinn.net> On 2013-11-14, at 16:05 , Steven D'Aprano wrote: > On Wed, Nov 13, 2013 at 07:13:19PM -0800, Andrew Barnert wrote: > >> I know this is getting way off topic, but the real problem with doing >> curried/auto-partial functions in Python isn't the parens, it's the >> variable arguments. An auto-partial function has to accumulate >> parameters if it doesn't get enough, execute when it does. (Currying >> gives you that for free, because it means you only get one argument at >> a time, but you can do auto-partials without currying.) > > I disagree. The real problem is the ambiguity. Given: > > y = divmod(x) > > did I intend for y to be a partial function, or did I mean to type > divmod(x, 3) but mess up? Given how few coders come from a functional > programming background, my money is that it's an error. > > In languages with static typing, it's easy for the compiler to resolve > this: if y is declared as a function object, then I meant for the > partial application. In most of those languages, y likely isn?t explicitly typed and its type will be inferred depending on how it?s used. > If y is declared as an int, then it's an error. But > you can?t do this in Python. But it?ll likely blow up a few lines below when you try to perform arithmetic operations on a function (not that this is a good thing, but it?s no different from a function returning an unexpected None) > [...] >> But to handle a vararg function, you'd need a separate syntax for >> partializing vs. calling. > > We have that. It's called functools.partial :-) > > Aside: am I the only one who wishes there was a functools.rpartial, that > binds from the right instead of the left? I?ve wanted this as well. Or the ability to somehow ?fill holes?. 
Scala- or Clojure-type parse transforms are pretty neat for that (write the call with holes/placeholders, the call is deferred as a function taking arguments to fill the placeholders) Examples of Scala have already been provided, in Clojure it?s an explicit reader form: #(foo 1 2 %) is equivalent (and expanded) to (fn [arg] (foo 1 2 arg)) From elazarg at gmail.com Thu Nov 14 16:47:00 2013 From: elazarg at gmail.com (=?UTF-8?B?15DXnNei15bXqA==?=) Date: Thu, 14 Nov 2013 17:47:00 +0200 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: <27903D3C-21DC-4211-8F80-82C319E61FAD@masklinn.net> References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> <20131114150520.GK2085@ando> <27903D3C-21DC-4211-8F80-82C319E61FAD@masklinn.net> Message-ID: 2013/11/14 Masklinn : > > On 2013-11-14, at 16:05 , Steven D'Aprano wrote: >> Aside: am I the only one who wishes there was a functools.rpartial, that >> binds from the right instead of the left? > > I?ve wanted this as well. Or the ability to somehow ?fill holes?. > Scala- or Clojure-type parse transforms are pretty neat for that > (write the call with holes/placeholders, the call is deferred as a function > taking arguments to fill the placeholders) > > Examples of Scala have already been provided, in Clojure it?s an explicit > reader form: > > #(foo 1 2 %) > > is equivalent (and expanded) to > > (fn [arg] (foo 1 2 arg)) Ellipsis seems suitable for the latter: sub5 = partial(sub, ..., 5) (At the expense of giving up the abitily to pass ellipsis to partial functions). Elazar From barry at python.org Thu Nov 14 17:30:13 2013 From: barry at python.org (Barry Warsaw) Date: Thu, 14 Nov 2013 11:30:13 -0500 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 References: <20131114101202.201b7cc5@fsol> Message-ID: <20131114113013.5dde23f1@anarchist> On Nov 14, 2013, at 10:12 AM, Antoine Pitrou wrote: >As a reminder, having to hold SHIFT to enter a parenthesis depends on >your keyboard layout. http://www.artima.com/weblogs/viewpost.jsp?thread=173477 pinkie-ly y'rs, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From abarnert at yahoo.com Thu Nov 14 18:45:57 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 14 Nov 2013 09:45:57 -0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: <20131114150520.GK2085@ando> References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> <20131114150520.GK2085@ando> Message-ID: <4B83BFB0-0295-450C-96C7-7997CA315621@yahoo.com> On Nov 14, 2013, at 7:05, Steven D'Aprano wrote: > On Wed, Nov 13, 2013 at 07:13:19PM -0800, Andrew Barnert wrote: > >> I know this is getting way off topic, but the real problem with doing >> curried/auto-partial functions in Python isn't the parens, it's the >> variable arguments. An auto-partial function has to accumulate >> parameters if it doesn't get enough, execute when it does. (Currying >> gives you that for free, because it means you only get one argument at >> a time, but you can do auto-partials without currying.) > > I disagree. The real problem is the ambiguity. Given: > > y = divmod(x) > > did I intend for y to be a partial function, or did I mean to type > divmod(x, 3) but mess up? Given how few coders come from a functional > programming background, my money is that it's an error. 
This may be a reason not to do it, but it's not something that makes it hard or impossible to do it. Variable arguments _are_ something that makes it hard or impossible to do it. > In languages with static typing, it's easy for the compiler to resolve > this: if y is declared as a function object, then I meant for the > partial application. If y is declared as an int, then it's an error. But > you can't do this in Python. That's not how it works in any of the languages we're talking about. They have a type inference system, usually based on Hindley-Milner, not a C-style nominal type system. Of course sometimes declared (or elsewhere-inferred) variable types will resolve an ambiguity during unification, but that's not the usual case; declared or inferred function types do so far more often. divmod is int*int->int*int (or, in a curried language, int->int->int*int), and therefore this expression is only typable if it's a partial, and y's type is inferred from the type the expression has as a partial. (And if you've declared y to have an explicit type that isn't compatible with the partial, you get a unification error.) If you allow varargs functions, where the type can be int->int or int*int->int, HM doesn't work. And, not coincidentally, none of these languages allow varargs functions. > [...] >> But to handle a vararg function, you'd need a separate syntax for >> partializing vs. calling. > > We have that. It's called functools.partial :-) Yes, but the (sub-)thread started off with (summarizing) "I wish we had something more brief/simple/Scala-like for partials than functools.partial". And functools.partial is not an answer to that. > Aside: am I the only one who wishes there was a functools.rpartial, that > binds from the right instead of the left? I've often wanted that. I've even built it a few times. It has the same problem with keyword arguments and keyword-only params that auto-partialing does, but it doesn't feel like as much of a limitation--you just wouldn't ever use it in those cases (especially since you can often just bind the argument by keyword in those cases). I've also occasionally wanted a partial where you specify the indices of the arguments as keywords (which gives you rpartial just by using -1), so I could more easily bind the second of three (or even of variable) arguments. But that comes up less often. >> Using a different operator like [] or % or << seems attractive at >> first, but it can't handle keywords. >> You could add a method, so spam._(n=5) returns partial(spam, n=5), but >> ._ is hideous, and anything meaningful like bind or partial is no >> longer a shortcut. > > And why is this a problem? I assume you're not asking about why ._ being hideous is a problem, but why .partial being not a shortcut is a problem, right? Offering foo.partial(n=5) as an alternative for partial(foo, n=5) doesn't provide any benefit--it's not easier to read, more obvious to write, shorter, or better in any other way. It just gives you two obvious ways to do it instead of one. Which is bad. > You read code more often than you type it, > and quite frankly, creating partial applications of functions shouldn't > be so common in real-world code that it needs to be so concise. It's common in many application areas. For example, I have a row of radio buttons. Instead of defining 5 separate callback functions, I write one callback function that takes a number from 0 to 4, and then I bind each button to partial(callback, n). 
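Concretely, that pattern looks something like the following minimal tkinter sketch (the callback body and labels are made up for illustration; note that a real Button command is called with no arguments, so only the index needs to be bound):

    import tkinter as tk
    from functools import partial

    def callback(n):
        print("button", n, "pressed")

    root = tk.Tk()
    for n in range(5):
        # one shared callback, with the button index pre-bound
        tk.Button(root, text=str(n), command=partial(callback, n)).pack(side=tk.LEFT)
    root.mainloop()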
Many people today instead bind each button to a lambda, I suspect because the tkinter tutorials all use lambda for this purpose instead of partial. I've also seen people write an explicit make_btn_callback function so they can bind each button to make_btn_callback(n) Anyway, compare these three and tell me you see zero benefit in the last one: Button(frame, str(i), lambda event: click(i, event)) Button(frame, str(i), partial(click, i)) Button(frame, str(i), click(i, ...)) From abarnert at yahoo.com Thu Nov 14 18:58:19 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 14 Nov 2013 09:58:19 -0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> <20131114150520.GK2085@ando> <27903D3C-21DC-4211-8F80-82C319E61FAD@masklinn.net> Message-ID: <9A03C089-41C3-49C2-9199-65589E23DCFC@yahoo.com> On Nov 14, 2013, at 7:47, ????? wrote: > 2013/11/14 Masklinn : >> >> On 2013-11-14, at 16:05 , Steven D'Aprano wrote: >>> Aside: am I the only one who wishes there was a functools.rpartial, that >>> binds from the right instead of the left? >> >> I?ve wanted this as well. Or the ability to somehow ?fill holes?. >> Scala- or Clojure-type parse transforms are pretty neat for that >> (write the call with holes/placeholders, the call is deferred as a function >> taking arguments to fill the placeholders) >> >> Examples of Scala have already been provided, in Clojure it?s an explicit >> reader form: >> >> #(foo 1 2 %) This example doesn't actually show rpartial--but it's enough to make it obvious to anyone how you'd do that in Clojure, so that's ok. >> >> is equivalent (and expanded) to >> >> (fn [arg] (foo 1 2 arg)) > > Ellipsis seems suitable for the latter: > > sub5 = partial(sub, ..., 5) > > (At the expense of giving up the abitily to pass ellipsis to partial functions). I don't know why, but to me that strongly implies that I'm binding argument -1 (after 0 or more arguments, or maybe 1 or more), rather than argument 2 (after exactly 1). But that's not necessarily a bad thing. But if other people expect it to bind 2, they'll be surprised when they try it on a 3-argument (or variable-argument) function. (And yes, I realize that mixing 1-based arg counting with python negative indices is potentially confusing. I don't think it's confusing in this particular case, but in, say, documentation for a stdlib function it could be.) From haoyi.sg at gmail.com Thu Nov 14 19:02:00 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Thu, 14 Nov 2013 10:02:00 -0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: <4B83BFB0-0295-450C-96C7-7997CA315621@yahoo.com> References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> <20131114150520.GK2085@ando> <4B83BFB0-0295-450C-96C7-7997CA315621@yahoo.com> Message-ID: > creating partial applications of functions shouldn't be so common in real-world code that it needs to be so concise. I don't quite agree with this; I think a lot of people use some sort of partial application every day and just don't know it. - Of the times you use lambda, how often does it redirects all arguments to a single function, with the rest hard coded? - How often do you instantiate a class (with some args), call a method on that class (with some args) and then throw it away, never to be seen again? 
Both of these are basically crufty approximations for partial application, and I see them pretty often, especially the second one. On Thu, Nov 14, 2013 at 9:45 AM, Andrew Barnert wrote: > On Nov 14, 2013, at 7:05, Steven D'Aprano wrote: > > > On Wed, Nov 13, 2013 at 07:13:19PM -0800, Andrew Barnert wrote: > > > >> I know this is getting way off topic, but the real problem with doing > >> curried/auto-partial functions in Python isn't the parens, it's the > >> variable arguments. An auto-partial function has to accumulate > >> parameters if it doesn't get enough, execute when it does. (Currying > >> gives you that for free, because it means you only get one argument at > >> a time, but you can do auto-partials without currying.) > > > > I disagree. The real problem is the ambiguity. Given: > > > > y = divmod(x) > > > > did I intend for y to be a partial function, or did I mean to type > > divmod(x, 3) but mess up? Given how few coders come from a functional > > programming background, my money is that it's an error. > > This may be a reason not to do it, but it's not something that makes it > hard or impossible to do it. Variable arguments _are_ something that makes > it hard or impossible to do it. > > > In languages with static typing, it's easy for the compiler to resolve > > this: if y is declared as a function object, then I meant for the > > partial application. If y is declared as an int, then it's an error. But > > you can't do this in Python. > > That's not how it works in any of the languages we're talking about. They > have a type inference system, usually based on Hindley-Milner, not a > C-style nominal type system. > > Of course sometimes declared (or elsewhere-inferred) variable types will > resolve an ambiguity during unification, but that's not the usual case; > declared or inferred function types do so far more often. divmod is > int*int->int*int (or, in a curried language, int->int->int*int), and > therefore this expression is only typable if it's a partial, and y's type > is inferred from the type the expression has as a partial. (And if you've > declared y to have an explicit type that isn't compatible with the partial, > you get a unification error.) > > If you allow varargs functions, where the type can be int->int or > int*int->int, HM doesn't work. And, not coincidentally, none of these > languages allow varargs functions. > > > [...] > >> But to handle a vararg function, you'd need a separate syntax for > >> partializing vs. calling. > > > > We have that. It's called functools.partial :-) > > Yes, but the (sub-)thread started off with (summarizing) "I wish we had > something more brief/simple/Scala-like for partials than > functools.partial". And functools.partial is not an answer to that. > > > Aside: am I the only one who wishes there was a functools.rpartial, that > > binds from the right instead of the left? > > I've often wanted that. I've even built it a few times. It has the same > problem with keyword arguments and keyword-only params that auto-partialing > does, but it doesn't feel like as much of a limitation--you just wouldn't > ever use it in those cases (especially since you can often just bind the > argument by keyword in those cases). > > I've also occasionally wanted a partial where you specify the indices of > the arguments as keywords (which gives you rpartial just by using -1), so I > could more easily bind the second of three (or even of variable) arguments. > But that comes up less often. 
> > >> Using a different operator like [] or % or << seems attractive at > >> first, but it can't handle keywords. > >> You could add a method, so spam._(n=5) returns partial(spam, n=5), but > >> ._ is hideous, and anything meaningful like bind or partial is no > >> longer a shortcut. > > > > And why is this a problem? > > I assume you're not asking about why ._ being hideous is a problem, but > why .partial being not a shortcut is a problem, right? > > Offering foo.partial(n=5) as an alternative for partial(foo, n=5) doesn't > provide any benefit--it's not easier to read, more obvious to write, > shorter, or better in any other way. It just gives you two obvious ways to > do it instead of one. Which is bad. > > > You read code more often than you type it, > > and quite frankly, creating partial applications of functions shouldn't > > be so common in real-world code that it needs to be so concise. > > It's common in many application areas. For example, I have a row of radio > buttons. Instead of defining 5 separate callback functions, I write one > callback function that takes a number from 0 to 4, and then I bind each > button to partial(callback, n). > > Many people today instead bind each button to a lambda, I suspect because > the tkinter tutorials all use lambda for this purpose instead of partial. > > I've also seen people write an explicit make_btn_callback function so they > can bind each button to make_btn_callback(n) > > Anyway, compare these three and tell me you see zero benefit in the last > one: > > Button(frame, str(i), lambda event: click(i, event)) > Button(frame, str(i), partial(click, i)) > Button(frame, str(i), click(i, ...)) > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruce at leapyear.org Thu Nov 14 19:26:16 2013 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 14 Nov 2013 10:26:16 -0800 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: <9A03C089-41C3-49C2-9199-65589E23DCFC@yahoo.com> References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> <20131114150520.GK2085@ando> <27903D3C-21DC-4211-8F80-82C319E61FAD@masklinn.net> <9A03C089-41C3-49C2-9199-65589E23DCFC@yahoo.com> Message-ID: On Thu, Nov 14, 2013 at 9:58 AM, Andrew Barnert wrote: > > sub5 = partial(sub, ..., 5) > > > > (At the expense of giving up the abitily to pass ellipsis to partial > functions). > > I don't know why, but to me that strongly implies that I'm binding > argument -1 (after 0 or more arguments, or maybe 1 or more), rather than > argument 2 (after exactly 1). > > But that's not necessarily a bad thing. But if other people expect it to > bind 2, they'll be surprised when they try it on a 3-argument (or > variable-argument) function. > > (And yes, I realize that mixing 1-based arg counting with python negative > indices is potentially confusing. I don't think it's confusing in this > particular case, but in, say, documentation for a stdlib function it could > be.)0 > How about: from functools import partial, __ sub5 = partial(sub, __, 5) xyz = partial(x, __, y, __, z) (Not quote sure what number of _ would work.) Then you could use ... to do the -1 argument binding: xyz = partial(x, __, y, ..., z) Incidentally, since partial(x) doesn't do anything useful (why does it not raise an exception?) 
would the following ever be reasonable to support? sub5 = partial(sub)(__, 5) The advantage is that the signature of the partial function stands alone making it easier to read. That is, def new_partial(func, *args, **kwargs): if not args and not kwargs: return partial(partial, func) return partial(func, *args, **kwargs) Probably the status quo wins. --- Bruce I'm hiring: http://www.cadencemd.com/info/jobs Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... URL: From elazarg at gmail.com Thu Nov 14 21:00:07 2013 From: elazarg at gmail.com (=?UTF-8?B?15DXnNei15bXqA==?=) Date: Thu, 14 Nov 2013 22:00:07 +0200 Subject: [Python-ideas] A suggestion for Python 3 vs Python 2 In-Reply-To: References: <5281CE33.3070203@canterbury.ac.nz> <20131112073057.GB18971@ando> <877gcdhu4s.fsf@uwakimon.sk.tsukuba.ac.jp> <20131114150520.GK2085@ando> <27903D3C-21DC-4211-8F80-82C319E61FAD@masklinn.net> <9A03C089-41C3-49C2-9199-65589E23DCFC@yahoo.com> Message-ID: 2013/11/14 Bruce Leban : > Incidentally, since partial(x) doesn't do anything useful (why does it not > raise an exception?) would the following ever be reasonable to support? > > sub5 = partial(sub)(__, 5) > > > The advantage is that the signature of the partial function stands alone > making it easier to read. That is, > > def new_partial(func, *args, **kwargs): > if not args and not kwargs: > return partial(partial, func) > return partial(func, *args, **kwargs) > You get this automatically for partial-as-method: partial_sub = sub.partial sub5 = partial_sub(_, 5) It will not happen in accident (or at least it is very unlikely) so there'll be no need for an exception; when it does happen in accident, the behavior is easily understood. You also get cleaner separation of the parameters' roles - after all, x.foo *is* partial(). Elazar From apieum at gmail.com Fri Nov 15 08:55:29 2013 From: apieum at gmail.com (Gregory Salvan) Date: Fri, 15 Nov 2013 08:55:29 +0100 Subject: [Python-ideas] remove coupling between unittest and assertions Message-ID: Hi, I would seperate unittest from its assertions. I suggest to create the module "assertions" in stdlib. - it will not change unittest api - it improves decoupling and SOC so other testing library will be able to share a common base code avoiding duplication - it would be great to write guard clauses with more expressive error message like that for example: from assertions import assert_in def my_function(*args, **kwargs): assert_in(''option", kwargs) ... This last point merits more reflexion since "assert" is removed when python compiles in optimize mode. A such mecanism can be added further in assertions lib. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Fri Nov 15 12:35:21 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 15 Nov 2013 21:35:21 +1000 Subject: [Python-ideas] remove coupling between unittest and assertions In-Reply-To: References: Message-ID: On 15 November 2013 17:55, Gregory Salvan wrote: > Hi, > I would seperate unittest from its assertions. > I suggest to create the module "assertions" in stdlib. 
> > - it will not change unittest api > - it improves decoupling and SOC so other testing library will be able to > share a common base code avoiding duplication > - it would be great to write guard clauses with more expressive error > message like that for example: > from assertions import assert_in > > def my_function(*args, **kwargs): > assert_in(''option", kwargs) > ... > > This last point merits more reflexion since "assert" is removed when python > compiles in optimize mode. A such mecanism can be added further in > assertions lib. We've (very briefly) discussed the idea of decoupling the APIs in http://bugs.python.org/issue18054 (the specific proposal mentioned there was to standardise the "matcher" concept used in testtools). I'd certainly like to see something along those lines in Python 3.5 - having good assertions independent of the assert statement and the unittest object model would be helpful (although it may still be a submodule of unittest). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From random832 at fastmail.us Fri Nov 15 19:16:25 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Fri, 15 Nov 2013 13:16:25 -0500 Subject: [Python-ideas] Unicode stdin/stdout (was: Re: python 3.3 repr) In-Reply-To: <5286054F.6000707@chamonix.reportlab.co.uk> References: <5286054F.6000707@chamonix.reportlab.co.uk> Message-ID: <1384539385.17784.47988289.02AE655C@webmail.messagingengine.com> Of course, the real solution to this issue is to replace sys.stdout on windows with an object that can handle Unicode directly with the WriteConsoleW function - the problem there is that it will break code that expects to be able to use sys.stdout.buffer for binary I/O. I also wasn't able to get the analogous stdin replacement class to work with input() in my attempts. From drekin at gmail.com Sat Nov 16 13:17:00 2013 From: drekin at gmail.com (drekin at gmail.com) Date: Sat, 16 Nov 2013 04:17:00 -0800 (PST) Subject: [Python-ideas] Unicode stdin/stdout (was: Re: python 3.3repr) In-Reply-To: <1384539385.17784.47988289.02AE655C@webmail.messagingengine.com> Message-ID: <5287623c.c7b60e0a.62c9.ffffad0a@mx.google.com> Hello. > Of course, the real solution to this issue is to replace sys.stdout on windows with an object that can handle Unicode directly with the WriteConsoleW function - the problem there is that it will break code that expects to be able to use sys.stdout.buffer for binary I/O. I also wasn't able to get the analogous stdin replacement class to work with input() in my attempts. You can look on the result of ReadConsoleW function call as on sequence of bytes encoding a string in utf-16-le. So the call can be made in custom raw io object and standard hierarchy text io ???> buffered io ???> raw io can be formed. See http://bugs.python.org/file31756/streams.py . It works also for stdin and input(). Only problem is that Python interactive REPL doesn't use sys.stdin object for input. See http://bugs.python.org/issue1602 and http://bugs.python.org/issue17620 for details. Regards, Drekin From storchaka at gmail.com Sat Nov 16 21:58:32 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 16 Nov 2013 22:58:32 +0200 Subject: [Python-ideas] Gzip and zip extra field In-Reply-To: References: Message-ID: 29.05.13 16:25, Serhiy Storchaka ???????(??): > Gzip files can contains an extra field [1] and some applications use > this for extending gzip format. 
The current GzipFile implementation > ignores this field on input and doesn't allow to create a new file with > an extra field. > > ZIP file entries also can contains an extra field [2]. Currently it just > saved as bytes in the `extra` attribute of ZipInfo. > > I propose to save an extra field for gzip file and provide structural > access to subfields. > > f = gzip.GzipFile('somefile.gz', 'rb') > f.extra_bytes # A raw extra field as bytes > # iterating over all subfields > for xid, data in f.extra_map.items(): > ... > # get Apollo file type information > f.extra_map[b'AP'] # (or f.extra_map['AP']?) > # creating gzip file with extra field > f = gzip.GzipFile('somefile.gz', 'wb', extra=extrabytes) > f = gzip.GzipFile('somefile.gz', 'wb', extra=[(b'AP', apollodata)]) > f = gzip.GzipFile('somefile.gz', 'wb', extra={b'AP': apollodata}) > # change Apollo file type information > f.extra_map[b'AP'] = ... > > Issue #17681 [3] has preliminary patches. There is some open doubt about > interface. Is not it over-engineered? > > Currently GzipFile supports seamless reading a sequence of separately > compressed gzip files. Every such chunk can have own extra field (this > is used in dictzip for example). It would be desirable to be able to > read only until the end of current chunk in order not to miss an extra > field. > > [1] http://www.gzip.org/format.txt > [2] http://www.pkware.com/documents/casestudies/APPNOTE.TXT > [3] http://bugs.python.org/issue17681 Is anyone interested in this feature? It needs bikeshedding. From ericsnowcurrently at gmail.com Sat Nov 16 23:31:20 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 16 Nov 2013 15:31:20 -0700 Subject: [Python-ideas] Gzip and zip extra field In-Reply-To: References: Message-ID: On Sat, Nov 16, 2013 at 1:58 PM, Serhiy Storchaka wrote: > It needs bikeshedding. No need to say that explicitly. You did post to python-ideas after all. -eric From abarnert at yahoo.com Sun Nov 17 01:19:37 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 16 Nov 2013 16:19:37 -0800 Subject: [Python-ideas] Gzip and zip extra field In-Reply-To: References: Message-ID: Are any of the gzip standard extra fields in common usage today? I tried lookup up the URLs listed in the definitions; one is a 404 page and another is just an image with links to someone's Facebook and similar personal pages. As for the zip extra fields, at least some of them seem like they're only useful if the zip module actually interprets them. For example, given a zip64, if zipfile can read/extract a 5GB file directly, you have no need to look at the Zip64 extra info directly; if it can't do so, you won't get any useful benefit out of looking at the extra info. Other fields might be useful for building a more powerful wrapper module around zipfile--e.g., you could substitute the native NTFS or POSIX timestamps in the extra info for the possibly-less-accurate normal zip timestamps. Sent from a random iPhone On Nov 16, 2013, at 12:58, Serhiy Storchaka wrote: > 29.05.13 16:25, Serhiy Storchaka ???????(??): >> Gzip files can contains an extra field [1] and some applications use >> this for extending gzip format. The current GzipFile implementation >> ignores this field on input and doesn't allow to create a new file with >> an extra field. >> >> ZIP file entries also can contains an extra field [2]. Currently it just >> saved as bytes in the `extra` attribute of ZipInfo. >> >> I propose to save an extra field for gzip file and provide structural >> access to subfields. 
>> >> f = gzip.GzipFile('somefile.gz', 'rb') >> f.extra_bytes # A raw extra field as bytes >> # iterating over all subfields >> for xid, data in f.extra_map.items(): >> ... >> # get Apollo file type information >> f.extra_map[b'AP'] # (or f.extra_map['AP']?) >> # creating gzip file with extra field >> f = gzip.GzipFile('somefile.gz', 'wb', extra=extrabytes) >> f = gzip.GzipFile('somefile.gz', 'wb', extra=[(b'AP', apollodata)]) >> f = gzip.GzipFile('somefile.gz', 'wb', extra={b'AP': apollodata}) >> # change Apollo file type information >> f.extra_map[b'AP'] = ... >> >> Issue #17681 [3] has preliminary patches. There is some open doubt about >> interface. Is not it over-engineered? >> >> Currently GzipFile supports seamless reading a sequence of separately >> compressed gzip files. Every such chunk can have own extra field (this >> is used in dictzip for example). It would be desirable to be able to >> read only until the end of current chunk in order not to miss an extra >> field. >> >> [1] http://www.gzip.org/format.txt >> [2] http://www.pkware.com/documents/casestudies/APPNOTE.TXT >> [3] http://bugs.python.org/issue17681 > > Is anyone interested in this feature? It needs bikeshedding. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas From guido at python.org Sun Nov 17 02:39:13 2013 From: guido at python.org (Guido van Rossum) Date: Sat, 16 Nov 2013 17:39:13 -0800 Subject: [Python-ideas] [Python-Dev] PyParallel: alternate async I/O and GIL removal In-Reply-To: <20131116181328.GB3152@snakebite.org> References: <20131116181328.GB3152@snakebite.org> Message-ID: Trent, I watched your video and read your slides. (Does the word "motormouth" mean anything to you? :-) Clearly your work isn't ready for python-dev -- it is just too speculative. I've moved python-dev to BCC and added python-ideas. It possibly doesn't even belong on python-ideas -- if you are serious about wanting to change Linux or other *NIX variants, you'll have to go find a venue where people who do forward-looking kernel work hang out. Finally, I'm not sure why you are so confrontational about the way Twisted and Tulip do things. We are doing things the only way they *can* be done without overhauling the entire CPython implementation (which you have proven will take several major release cycles, probably until 4.0). It's fine that you are looking further forward than most of us. I don't think it makes sense that you are blaming the rest of us for writing libraries that can be used today. On Sat, Nov 16, 2013 at 10:13 AM, Trent Nelson wrote: > Hi folks, > > Video of the presentation I gave last weekend at PyData NYC > regarding PyParallel just went live: https://vimeo.com/79539317 > > Slides are here: > https://speakerdeck.com/trent/pyparallel-how-we-removed-the-gil-and-exploited-all-cores-1 > > The work was driven by the async I/O discussions around this time > last year on python-ideas. That resulted in me sending this: > > > http://markmail.org/thread/kh3qgjbydvxt3exw#query:+page:1+mid:arua62vllzugjy2v+state:results > > ....where I attempted to argue that there was a better way of > doing async I/O on Windows than the status quo of single-threaded, > non-blocking I/O with an event multiplex syscall. 
> > I wasn't successful in convincing anyone at the time; I had no code > to back it up and I didn't articulate my plans for GIL removal at > the time either (figuring the initial suggestion would be met with > enough scepticism as is). > > So, in the video above, I spend a lot of time detailing how IOCP > works on Windows, how it presents us with a better environment than > UNIX for doing asynchronous I/O, and how it paired nicely with the > other work I did on coming up with a way for multiple threads to > execute simultaneously across all cores without introducing any > speed penalties. > > I'm particularly interested to hear if the video/slides helped > UNIX-centric people gain a better understanding of how Windows does > IOCP and why it would be preferable when doing async I/O. > > The reverse is also true: if you still think single-threaded, non- > blocking synchronous I/O via kqueue/epoll is better than the > approach afforded by IOCP, I'm interested in hearing why. > > As crazy as it sounds, my long term goal would be to try and > influence Linux and BSD kernels to implement thread-agnostic I/O > support such that an IOCP-like mechanism could be exposed; Solaris > and AIX already do this via event ports and AIX's verbatim copy of > Windows' IOCP API. > > (There is some promising work already being done on Linux; see > recent MegaPipe paper for an example.) > > Regards, > > Trent. > _______________________________________________ > Python-Dev mailing list > Python-Dev at python.org > https://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > https://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From trent at snakebite.org Sun Nov 17 03:24:56 2013 From: trent at snakebite.org (Trent Nelson) Date: Sat, 16 Nov 2013 21:24:56 -0500 Subject: [Python-ideas] [Python-Dev] PyParallel: alternate async I/O and GIL removal In-Reply-To: References: <20131116181328.GB3152@snakebite.org> Message-ID: <20131117022455.GA8127@snakebite.org> On Sat, Nov 16, 2013 at 05:39:13PM -0800, Guido van Rossum wrote: > Trent, I watched your video and read your slides. (Does the word > "motormouth" mean anything to you? :-) Side-effect of both a) not having time to rehearse, and b) trying to compress 153 slides into 45 minutes :-) > Finally, I'm not sure why you are so confrontational about the way Twisted > and Tulip do things. We are doing things the only way they *can* be done > without overhauling the entire CPython implementation (which you have > proven will take several major release cycles, probably until 4.0). It's > fine that you are looking further forward than most of us. I don't think > it makes sense that you are blaming the rest of us for writing libraries > that can be used today. I watched the video today; there's a point where I say something along the lines of "that's not how you should do IOCP; they're doing it wrong". That definitely came out wrong -- when limited to a single-threaded execution model, which today's Python is, then calling GetQueuedCompletionStatus() in a single-threaded event loop is really the only option you have. (I think I also say "that's just as bad as select()"; I didn't mean that either -- it's definitely better than select() when you're limited to the single-threaded execution model. 
What I was trying to convey was that doing it like that wasn't really how IOCP was designed to be used -- which is why I dig into the intrinsic link between IOCP, async I/O and threading for so many slides.) And in hindsight, perhaps I need to put more emphasis on the fact that it *is* very experimental work with a long-term view, versus Tulip/asyncio, which was intended for *now*. So although Tulip and PyParallel spawned from the same discussions and are attempting to attack the same problem -- it's really not fair for me to discredit Tulip/Twisted in favor of PyParallel because they're on completely different playing fields with vastly different implementation time frames (I'm thinking 5+ years before this work lands in a mainstream Python release -- if it ever does. And if not, hey, it can live on as another interpreter, just like Stackless et al). > Clearly your work isn't ready for python-dev -- it is just too > speculative. I've moved python-dev to BCC and added python-ideas. > > It possibly doesn't even belong on python-ideas -- if you are serious > about wanting to change Linux or other *NIX variants, you'll have to go > find a venue where people who do forward-looking kernel work hang out. Yeah this e-mail was more of a final follow up to e-mails I sent to python-ideas last year re: the whole "alternate async approach" thread. (I would have replied to that thread directly, had I kept it in my inbox.) Trent. From guido at python.org Sun Nov 17 03:56:11 2013 From: guido at python.org (Guido van Rossum) Date: Sat, 16 Nov 2013 18:56:11 -0800 Subject: [Python-ideas] [Python-Dev] PyParallel: alternate async I/O and GIL removal In-Reply-To: <20131117022455.GA8127@snakebite.org> References: <20131116181328.GB3152@snakebite.org> <20131117022455.GA8127@snakebite.org> Message-ID: On Sat, Nov 16, 2013 at 6:24 PM, Trent Nelson wrote: > On Sat, Nov 16, 2013 at 05:39:13PM -0800, Guido van Rossum wrote: > [snip] > Finally, I'm not sure why you are so confrontational about the way > Twisted > > and Tulip do things. We are doing things the only way they *can* be > done > > without overhauling the entire CPython implementation (which you have > > proven will take several major release cycles, probably until 4.0). > It's > > fine that you are looking further forward than most of us. I don't > think > > it makes sense that you are blaming the rest of us for writing > libraries > > that can be used today. > > I watched the video today; there's a point where I say something > along the lines of "that's not how you should do IOCP; they're > doing it wrong". That definitely came out wrong -- when limited > to a single-threaded execution model, which today's Python is, then > calling GetQueuedCompletionStatus() in a single-threaded event loop > is really the only option you have. > > (I think I also say "that's just as bad as select()"; I didn't mean > that either -- it's definitely better than select() when you're > limited to the single-threaded execution model. What I was trying > to convey was that doing it like that wasn't really how IOCP was > designed to be used -- which is why I dig into the intrinsic link > between IOCP, async I/O and threading for so many slides.) > I wish you had spent more time on explaining how IOCP works and less on judging other approaches. 
Summarizing my understanding of what you're saying, it seems the "right" way to use IOCP on a multi-core machine is to have one thread per core (barring threads you need for unavoidably blocking stuff) and to let the kernel schedule callbacks on all those threads. As long as the callbacks don't block and events come in at a rate to keep all those cores busy this will be optimal. But this is almost tautological. It only works if the threads don't communicate with each other or with the main thread (all shared data must be read-only). But heh, if that's all, one process per core works just as well. :-) I don't really care how well CHARGEN (I had to look it up) scales. For HTTP, it's great for serving static contents from a cache or from the filesystem, but if that's all you serve, why use Python? Real web apps use intricate combinations of databases, memcache, in-memory cache, and template expansion. The biggest difference you can make there is probably getting rid of the ORM in favor of more direct SQL, and next on the list would be reimplementing template expansion in C. (And heck, you could release the GIL while you're doing that. :-) And in hindsight, perhaps I need to put more emphasis on the fact > that it *is* very experimental work with a long-term view, versus > Tulip/asyncio, which was intended for *now*. So although Tulip and > PyParallel spawned from the same discussions and are attempting to > attack the same problem -- it's really not fair for me to discredit > Tulip/Twisted in favor of PyParallel because they're on completely > different playing fields with vastly different implementation time > frames (I'm thinking 5+ years before this work lands in a mainstream > Python release -- if it ever does. And if not, hey, it can live on > as another interpreter, just like Stackless et al). > I would love it if you could write a list of things a callback *cannot* do when it is in parallel mode. I believe that list includes mutating any kind of global/shared state (any object created in the main thread is read-only in parallel mode -- it seems you had to work hard to make string interning work, which is semantically transparent but mutates hidden global state). In addition (or, more likely, as a consequence!) a callback cannot create anything that lasts beyond the callback's lifetime, except for the brief time between the callback's return and the completion of the I/O operation involving the return value. (Actually, I missed how you do this -- doesn't this mean you cannot release the callback's heap until much later?) So it seems that the price for extreme concurrency is the same as always -- you can only run purely functional code. Haskell fans won't mind, but for Python this seems to be putting the cart before the horse -- who wants to write Python with those constraints? [snip] -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From abarnert at yahoo.com Sun Nov 17 07:31:37 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 16 Nov 2013 22:31:37 -0800 (PST) Subject: [Python-ideas] [Python-Dev] PyParallel: alternate async I/O and GIL removal In-Reply-To: References: <20131116181328.GB3152@snakebite.org> <20131117022455.GA8127@snakebite.org> Message-ID: <1384669897.63110.YahooMailNeo@web184704.mail.ne1.yahoo.com> From: Guido van Rossum Sent: Saturday, November 16, 2013 6:56 PM >Summarizing my understanding of what you're saying, it seems??the "right" way to use IOCP on a multi-core machine is to have one thread per core (barring threads you need for unavoidably blocking stuff) and to let the kernel schedule callbacks on all those threads. As long as the callbacks don't block and events come in at a rate to keep all those cores busy this will be optimal. > >But this is almost tautological. It only works if the threads don't communicate with each other or with the main thread (all shared data must be read-only). But heh, if that's all, one process per core works just as well. :-) I got the same impression from the presentation. First, I completely agree with the fact that most Unix servers are silly on Windows even in the single-threaded case?simulating epoll on top of single-threaded completion-based GQCS just so you can simulate a completion-based design on top of your simulated ready-based epoll is wasteful and overly complex. But that's a much more minor issue than taking advantage of Windows' integration between threading and async I/O, and one that many server frameworks have already fixed, and that PyParallel isn't necessary for. I also agree that using IOCP for a multi-threaded proactor instead of a single-threaded reactor plus dispatcher is a huge win in the kinds of shared-memory threaded apps that you can't write in CPython. From my experience building a streaming video server and an IRC-esque interactive communications server, using a reactor plus dispatcher on Windows means?one core completely wasted, 40% less performance from the others, and much lower scalability; emulating a proactor on Unix on top of a reactor and dispatcher is around a 10% performance cost (plus a bit of extra code complexity).?So, a threaded proactor wins, unless you really don't care about Windows. But PyParallel doesn't look like it supports such applications any better than stock CPython. As soon as you need to send data from one client to other clients, you're not in a shared-nothing parallel context anymore.?Even less extreme cases than streaming video or chat, where all you need is, e.g., shared caching of?dynamically-generated data, I don't see how you'd do that in PyParallel. If you can build a simple multi-user chat server with PyParallel, and show it using all my cores, that would be a lot more compelling. From ncoghlan at gmail.com Sun Nov 17 07:35:23 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 17 Nov 2013 16:35:23 +1000 Subject: [Python-ideas] [Python-Dev] PyParallel: alternate async I/O and GIL removal In-Reply-To: References: <20131116181328.GB3152@snakebite.org> <20131117022455.GA8127@snakebite.org> Message-ID: On 17 November 2013 12:56, Guido van Rossum wrote: > On Sat, Nov 16, 2013 at 6:24 PM, Trent Nelson wrote: >> And in hindsight, perhaps I need to put more emphasis on the fact >> that it *is* very experimental work with a long-term view, versus >> Tulip/asyncio, which was intended for *now*. 
So although Tulip and >> PyParallel spawned from the same discussions and are attempting to >> attack the same problem -- it's really not fair for me to discredit >> Tulip/Twisted in favor of PyParallel because they're on completely >> different playing fields with vastly different implementation time >> frames (I'm thinking 5+ years before this work lands in a mainstream >> Python release -- if it ever does. And if not, hey, it can live on >> as another interpreter, just like Stackless et al). > > > I would love it if you could write a list of things a callback *cannot* do > when it is in parallel mode. I believe that list includes mutating any kind > of global/shared state (any object created in the main thread is read-only > in parallel mode -- it seems you had to work hard to make string interning > work, which is semantically transparent but mutates hidden global state). In > addition (or, more likely, as a consequence!) a callback cannot create > anything that lasts beyond the callback's lifetime, except for the brief > time between the callback's return and the completion of the I/O operation > involving the return value. (Actually, I missed how you do this -- doesn't > this mean you cannot release the callback's heap until much later?) > > So it seems that the price for extreme concurrency is the same as always -- > you can only run purely functional code. Haskell fans won't mind, but for > Python this seems to be putting the cart before the horse -- who wants to > write Python with those constraints? MapReduce fans already do :) I think there's some interesting potential in Trent's PyParallel work, but it needs something analogous to Rust's ability to transfer object ownership between threads (thus enabling message passing) to expand beyond the simple worker thread model which is really only interesting on Windows (where processes are expensive - on *nix, processes are generally cheap enough that PyParallel is unlikely to be worth the hassle). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From abarnert at yahoo.com Sun Nov 17 08:41:22 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 16 Nov 2013 23:41:22 -0800 Subject: [Python-ideas] [Python-Dev] PyParallel: alternate async I/O and GIL removal In-Reply-To: References: <20131116181328.GB3152@snakebite.org> <20131117022455.GA8127@snakebite.org> Message-ID: On Nov 16, 2013, at 22:35, Nick Coghlan wrote: > I think there's some interesting potential in Trent's PyParallel work, > but it needs something analogous to Rust's ability to transfer object > ownership between threads (thus enabling message passing) I wonder whether an explicit copy_to_main_thread (maybe both shallow and deep variants) function, maybe with a Queue subclass that called it automatically in the put method, would be sufficient for a decent class of applications to be built? > to expand > beyond the simple worker thread model which is really only interesting > on Windows (where processes are expensive - on *nix, processes are > generally cheap enough that PyParallel is unlikely to be worth the > hassle). Windows is fine at scheduling ncores separate processes; it's just slow at _starting_ each one. And most servers aren't constantly creating and reaping processes; they create ncores processes at startup or when they're first needed. So if it takes 0.9 seconds instead of 0.2 to restart your server, is that a big enough problem to rewrite the whole server (and the interpreter)? 
As I said in my previous message, the benefit of being able to skip refcounting might make it worth doing. But avoiding process creation overhead isn't much if a win. From cf.natali at gmail.com Sun Nov 17 10:25:11 2013 From: cf.natali at gmail.com (=?ISO-8859-1?Q?Charles=2DFran=E7ois_Natali?=) Date: Sun, 17 Nov 2013 10:25:11 +0100 Subject: [Python-ideas] [Python-Dev] PyParallel: alternate async I/O and GIL removal In-Reply-To: References: <20131116181328.GB3152@snakebite.org> <20131117022455.GA8127@snakebite.org> Message-ID: There's something which really bothers me (well, that has been bothering me since the beginning): """ Memory Deallocation within Parallel Contexts ?These parallel contexts aren?t intended to be long-running bits of code/algorithm ?Let?s not free() anything? ??.and just blow away the entire heap via HeapFree() with one call, once the context has finished ?Cons: oYou technically couldn?t do this: def work(): for x in xrange(0, 1000000000): ? o(Why would you!) ?So? there?s no point referencing counting objects allocated within parallel contexts! """ So basically, pyparallel solves the issue of garbage collection in a multi-threaded process by not doing garbage collection: yeah, sure, things get a lot simpler, but in real life, you do want to have loops such as above, I don't see how one could pretend otherwise. That's simply a show-stopper to me. In fact, I find the whole programming model completely puzzling. Depending on whether you're in the main thread or not: - you're only able to write to thread-local data (thread-local is the sense allocated by the current thread, not thread-specific): what happens if some parallel context calls "import foo"? - you won't be able to allocate/free many objects So, in fact, in your parallel contexts, you can do so little that it's IMO almost useless in practice. It's not Python - more like a cross between Haskell and Python - and moreover, it means that some code cannot be executed in parallel context. Which means that you basically cannot use any library, since you don't know what's doing under the hood (it might die with a MemoryError or an invalid write to main thread memory). Compare this to e.g. Go's goroutines and channels, and you'll see how one might solve those issues in a sensible way (including user-level thread multiplexing over kernel-level threads, and using epoll ;-). In short, I'm really skeptical, to say the least... cf From solipsis at pitrou.net Sun Nov 17 11:27:39 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 17 Nov 2013 11:27:39 +0100 Subject: [Python-ideas] [Python-Dev] PyParallel: alternate async I/O and GIL removal References: <20131116181328.GB3152@snakebite.org> <20131117022455.GA8127@snakebite.org> Message-ID: <20131117112739.42db5c76@fsol> On Sat, 16 Nov 2013 21:24:56 -0500 Trent Nelson wrote: > > And in hindsight, perhaps I need to put more emphasis on the fact > that it *is* very experimental work with a long-term view, versus > Tulip/asyncio, which was intended for *now*. So although Tulip and > PyParallel spawned from the same discussions and are attempting to > attack the same problem I don't think they are attempting to attack the same problem. asyncio and similar frameworks (Twisted, Tornado, etc.) try to solve the issue of I/O concurrency, while you are trying to solve the issue of CPU parallelism (i.e. want Python to actually exploit several CPUs simutaneously: asyncio doesn't really care about that, although it has a primitive to let you communicate with subprocesses). 
Yes, you can want to "optimize" static data serving by using several CPU cores at once, but that sounds quite pointless except perhaps for a few niche situations (and as Guido says, there are perfectly good off-the-shelf solutions for efficient static data serving). I think most people who'd like the GIL removed are not I/O-bound. Regards Antoine. From solipsis at pitrou.net Sun Nov 17 11:30:52 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 17 Nov 2013 11:30:52 +0100 Subject: [Python-ideas] [Python-Dev] PyParallel: alternate async I/O and GIL removal References: <20131116181328.GB3152@snakebite.org> <20131117022455.GA8127@snakebite.org> Message-ID: <20131117113052.0e82585f@fsol> On Sun, 17 Nov 2013 16:35:23 +1000 Nick Coghlan wrote: > > I think there's some interesting potential in Trent's PyParallel work, > but it needs something analogous to Rust's ability to transfer object > ownership between threads (thus enabling message passing) to expand > beyond the simple worker thread model which is really only interesting > on Windows (where processes are expensive - on *nix, processes are > generally cheap enough that PyParallel is unlikely to be worth the > hassle). This is a bit of an oversimplification. The cost of processes is not only the cost of spawning them. There is also the CPU cost of marshalling data between processes, and the memory cost of having duplicate structures and data in your various processes. (also, note that using a process pool generally amortizes the spawning cost quite well) Regards Antoine. From abarnert at yahoo.com Sun Nov 17 11:47:50 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sun, 17 Nov 2013 02:47:50 -0800 Subject: [Python-ideas] [Python-Dev] PyParallel: alternate async I/O and GIL removal In-Reply-To: <20131117113052.0e82585f@fsol> References: <20131116181328.GB3152@snakebite.org> <20131117022455.GA8127@snakebite.org> <20131117113052.0e82585f@fsol> Message-ID: <2A8E8993-6B29-4541-AF20-2E806164DE06@yahoo.com> On Nov 17, 2013, at 2:30, Antoine Pitrou wrote: > On Sun, 17 Nov 2013 16:35:23 +1000 > Nick Coghlan wrote: >> >> I think there's some interesting potential in Trent's PyParallel work, >> but it needs something analogous to Rust's ability to transfer object >> ownership between threads (thus enabling message passing) to expand >> beyond the simple worker thread model which is really only interesting >> on Windows (where processes are expensive - on *nix, processes are >> generally cheap enough that PyParallel is unlikely to be worth the >> hassle). > > This is a bit of an oversimplification. The cost of processes is not > only the cost of spawning them. There is also the CPU cost of > marshalling data between processes, and the memory cost of having > duplicate structures and data in your various processes. But PyParallel doesn't seem to provide _any_ way to pass data between threads. So, the fact that multiprocessing provides only a slow way to pass data between processes can't be considered a weakness. Any program that could be written in PyParallel can't see those costs. From storchaka at gmail.com Sun Nov 17 14:17:15 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 17 Nov 2013 15:17:15 +0200 Subject: [Python-ideas] Gzip and zip extra field In-Reply-To: References: Message-ID: 17.11.13 02:19, Andrew Barnert ???????(??): > Are any of the gzip standard extra fields in common usage today? 
I tried lookup up the URLs listed in the definitions; one is a 404 page and another is just an image with links to someone's Facebook and similar personal pages. dictzip and BGZF use gzip format with different random access extensions. Both are very popular in their domains. > As for the zip extra fields, at least some of them seem like they're only useful if the zip module actually interprets them. For example, given a zip64, if zipfile can read/extract a 5GB file directly, you have no need to look at the Zip64 extra info directly; if it can't do so, you won't get any useful benefit out of looking at the extra info. Other fields might be useful for building a more powerful wrapper module around zipfile--e.g., you could substitute the native NTFS or POSIX timestamps in the extra info for the possibly-less-accurate normal zip timestamps. In the first place high-level support of extra field will simplify ZIP64 support in zipfile (and will made it less buggy). Also it will help with support of UTF-8 filenames and extended file attributes. From arigo at tunes.org Sun Nov 17 16:52:22 2013 From: arigo at tunes.org (Armin Rigo) Date: Sun, 17 Nov 2013 16:52:22 +0100 Subject: [Python-ideas] Fwd: [Python-Dev] PyParallel: alternate async I/O and GIL removal In-Reply-To: References: <20131116181328.GB3152@snakebite.org> Message-ID: Hi Trent, On Sat, Nov 16, 2013 at 7:13 PM, Trent Nelson wrote: > Slides are here: https://speakerdeck.com/trent/pyparallel-how-we-removed-the-gil-and-exploited-all-cores-1 Please stop me if I'm wrong. This allows the Python programmer to run a bunch of new threads in parallel; each new thread has read-only access to all pre-existing objects; all objects created by this new thread must die at the end. Disregarding issues of performance, this seems to be exactly the same model as "multiprocessing": the new thread (or process) cannot have any direct impact on the objects seen by the parent thread (or process). In fact, multiprocessing's capabilities are a superset of PyParallel's: e.g. you can change existing objects. That will not be reflected in the parent process, but will be visible in the future in the same process. This seems like a very useful thing to do in some cases. The performance benefits of PyParallel are probably only relevant on Windows (because it has no fork()), but I agree it's interesting if you're on Windows. However, the main issue I have with the whole approach of PyParallel is that it seems to be giving a subset of "multiprocessing" in terms of programming model. I already hate multiprocessing for giving the programmer a large set of constrains to work around; I suppose I don't have to explain my opinion of PyParallel... But more importantly, why would we want to hack at the source code of CPython like you did in order to get a result that was already done? Please tell me where I'm wrong. A bient?t, Armin. From solipsis at pitrou.net Sun Nov 17 19:09:32 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 17 Nov 2013 19:09:32 +0100 Subject: [Python-ideas] [Python-Dev] PyParallel: alternate async I/O and GIL removal References: <20131116181328.GB3152@snakebite.org> <20131117022455.GA8127@snakebite.org> Message-ID: <20131117190932.296585ce@fsol> On Sat, 16 Nov 2013 21:24:56 -0500 Trent Nelson wrote: > On Sat, Nov 16, 2013 at 05:39:13PM -0800, Guido van Rossum wrote: > > Trent, I watched your video and read your slides. (Does the word > > "motormouth" mean anything to you? 
:-) > > Side-effect of both a) not having time to rehearse, and b) trying > to compress 153 slides into 45 minutes :-) I've just read the slides. You've done rather weird and audacious things. That was a very interesting read, thank you! Regards Antoine. From trent at snakebite.org Sun Nov 17 23:34:32 2013 From: trent at snakebite.org (Trent Nelson) Date: Sun, 17 Nov 2013 17:34:32 -0500 Subject: [Python-ideas] [Python-Dev] PyParallel: alternate async I/O and GIL removal In-Reply-To: References: <20131116181328.GB3152@snakebite.org> <20131117022455.GA8127@snakebite.org> Message-ID: <20131117223431.GA10151@snakebite.org> (I saw that there were a number of additional e-mails echo'ing Guido's sentiment/concerns re: shared nothing. I picked this thread to reply to and tried to provide as much info as possible in lieu of replying to everyone individually.) On Sat, Nov 16, 2013 at 06:56:11PM -0800, Guido van Rossum wrote: > I wish you had spent more time on explaining how IOCP works and less on > judging other approaches. Heh, it's funny, with previous presentations, I didn't labor the point anywhere near as much, and I found that when presenting to UNIX people, they were very defensive of the status quo. I probably over-compensated a little too much in the opposite direction this time; I don't think anyone is going to argue vehemently that the UNIX status quo is optimal on Windows; but a side-effect is that it unnecessarily slanders existing bodies of work (Twisted et al) that have undeniably improved the overall ecosystem over the past decade. > Summarizing my understanding of what you're saying, it seems the "right" > way to use IOCP on a multi-core machine is to have one thread per core > (barring threads you need for unavoidably blocking stuff) and to let the > kernel schedule callbacks on all those threads. As long as the callbacks > don't block and events come in at a rate to keep all those cores busy this > will be optimal. The only thing I'd add is that, when speaking in terms of socket servers and whatnot, it helps to visualize Python callbacks as "the bits of logic that need to run before invoking the next asynchronous call". Anything I/O related can be done via an asynchronous call; that's basically the exit point of the processing thread -- it dispatches the async WSARecv() (for example), then moves onto the next request in the I/O completion port's queue. When that WSARecv() returns, we get all the info we need from the completion context to figure out what we just did, and, based on the protocol we provided, what needs to be done next. So, we do a little more pure Python processing and then dispatch the next asynchronous call, which, in this case, might be a WSASend(); the thread then moves onto the next request in the queue. That's all handled by the PxSocket_IOLoop monstrosity: http://hg.python.org/sandbox/trent/file/0e70a0caa1c0/Python/pyparallel.c#l6246 I got the inspiration for that implementation from CEval_FrameEx; you basically have one big inline method where you can go from anything to anything without having to call additional C functions; thus, doing back-to-back sends, for example, won't exhaust your stack. That allows us to do the dynamic switch between sync and async depending on protocol preference, current client load, number of active I/O hogs, that sort of thing: http://hg.python.org/sandbox/trent/file/0e70a0caa1c0/Python/pyparallel.c#l6467 PxSocket_IOLoop currently only handles 1:1 TCP/IP connections, which limits its applicability. 
I want to expand that -- I should be able to connect any sort of end points together in any fashion -- similar to how ZeroMQ allows the bridge/fan-out/router type composition. An endpoint would be anything that allows me to initiate an async operation against it, e.g. file, device, socket, whatever. This is where Windows really shines, because you can literally do everything either synchronously or asynchronously. There should also be support for 1:m and m:n relationships between endpoints (i.e. an IRC chat server). So I see PxSocket_IOLoop turning into a more generic PxThread_Loop that can handle anything-to-anything -- basically, calling the Python code that needs to run before dispatching the next async call. The current implementation also does a lot of live introspection against the protocol object to figure out what to do next; i.e. first entry point for a newly-connected client is here: http://hg.python.org/sandbox/trent/file/0e70a0caa1c0/Python/pyparallel.c#l6326 At every entry point into the loop, and at every point *after* the relevant Python code has been run, we're relying on the protocol to tell us what to do next in a very hard-coded fashion. I think for PxThread_Loop to become truly dynamic, it should mirror CEval_FrameEx even closer; the protocol analysis should be done separately, the output of which is a stream of async-opcode bytes that direct the main dispatching logic: http://hg.python.org/sandbox/trent/file/0e70a0caa1c0/Python/pyparallel.c#l6305 dispatch: switch (next_opcode) { TARGET(maybe_shutdown_send_or_recv); TARGET(handle_error); TARGET(connection_made_callback); TARGET(data_received_callback); TARGET(send_complete_callback); TARGET(overlapped_recv_callback); TARGET(post_callback_that_supports_sending_retval); TARGET(post_callback_that_does_not_support_sending_retval); TARGET(close_); TARGET(try_send); default: break; } Then we'd have one big case statement just like with CEval_FrameEx that handles all possible async-opcodes, rather than the goto spaghetti in the current PxSocket_IOLoop. The async opcodes would be generic and platform-independent; i.e. file write, file read, single socket write, multi-socket write, etc. On Windows/Solaris/AIX, everything could be handled asynchronously, on other platforms, you would have to fake it using an event loop + multiplex method, identical to how twisted/tornado/tulip do it currently. > But this is almost tautological. It only works if the threads don't > communicate with each other or with the main thread (all shared data must > be read-only). But heh, if that's all, one process per core works just as > well. :-) Ok, so, heh, I lied in the presentation. The main thread won't be frozen per-se, and the parallel threads will have a way to share state. I've already done a huge amount of work on this, but it's very involved and that presentation was long enough as it is. Also, it's easier to understand why reference counting and GC isn't needed in parallel contexts if you just assume the main thread isn't running. In reality, one of the first things I had to figure out was how these parallel contexts could communicate state back to the main thread -- because without this ability, how the heck would you propagate an exception raised in a parallel thread back to the main thread? The exception will be backed my memory allocated in the parallel context -- that can't be free'd until the exception has been dealt with and no references to it remain. 
As that was one of the first problems I had to solve, it has one of the hackiest implementations :-) The main thread's async.run_once() implementation can detect which parallel threads raised exceptions (because they've done an interlocked push to the main thread's error list) and it will extend the lifetime of the context for an additional number of subsequent runs of run_once(). Once the TTL of the context drops to 0, it is finally released. The reason it's hacky is because there's no direct correlation between the exception object finally having no references to it and the point we destroy the context. If you persisted the exception object to a list in the main thread somewhere, you'd segfault down the track when trying to access that memory. So, on the second iteration, I came up with some new concepts; context persistence and object promotion. A main-thread list or dict could be async protected such that this would work: # main thread d1 = {} l1 = [] async.protect(d1) async.protect(l1) # this would also work d2 = async.dict() l2 = async.list() # fyi: async.rdtsc() returns a PyLong wrapped # version of the CPU TSC; handy for generating # non-interned objects allocated from a parallel # context def callback(name): d1[name] = async.rdtsc() l1.append(async.rdtsc()) async.submit_work(callback, 'foo') async.submit_work(callback, 'bar') async.submit_work(callback, 'moo') async.run() That actually works; the async.protect() call intercepts the object's tp_as_mapping and tp_as_sequence fields and redirects them to thread-safe versions that use read/write locks. It also toggles a persistence bit on both the parallel context and the parallel long object, such that reference counting *is* actually enabled on it once it's back in the main thread -- when the ref count drops to 0, we check to see if it's an object that's been persisted, and if so, we decref the original context -- when the context's refcount gets to 0, only *then* do we free it. (I also did some stuff where you could promote simple objects where it made sense -- i.e. there's no need to keep a 4k context around if the end result was a scalar that could be represented in 50-200 bytes; just memcpy it from the main thread ("promote it to a main thread object with reference counting") and free the context.) You can see some examples of the different type of stuff you can do here: http://hg.python.org/sandbox/trent/file/0e70a0caa1c0/Lib/async/test/test_primitives.py The problem though was that none of my unit tests assigned more than ten items to a list/dict, so I never encountered a resize :-) You can imagine what happens when a resize takes place within a parallel context -- the list/dict is realloc'd using the parallel context heap allocator -- that's not ideal, it's a main thread object, it shouldn't be reallocated with temporary parallel thread memory. I think that was the point where I went "oh, bollocks!" and switched over to tackling the async socket stuff. However, the async socket work forced me to implement all sorts of new concepts, including the heap snapshots and TLS heap overrides (for interned strings). 
Pair that with the page locking stuff and I have a much richer set of tools at my disposal to solve that problem -- I just need to completely overhaul everything memory related now that I know how it needs to be implemented :-) As for the dict/list assignment/resize, the key problem is figuring out whether a PyObject_Realloc call is taking place because we're resizing a main thread container object -- that's not an easy thing to figure out -- all you have is a pointer at the time you need to make the decision. That's where the memory refactoring work comes in -- I'm still working on the details, but the general idea is that you'll be able to do very efficient pointer address tests against known base address masks to figure out the origin of the object and how the current memory request needs to be satisfied. The other option I played around with was an interlocked list type that is exposed directly to Python: x = async.xlist() def callback(): x.push(async.rdtsc()) for _ in xrange(0, 10): async.submit_work(callback) async.run() # interlocked flush of all results into a list. results = x.flush() The key difference between an interlocked list and a normal list is that an interlocked list has its very own localized heap, just like parallel contexts have; pushing a scalar onto the list automatically "promotes it". That is, the object is memcpy'd directly using the xlist's heap, and we can keep that heap alive independently to the parallel contexts that pushed objects onto it. I was also planning on using this as a waitable queue, so you could compose pipelines of producers/consumers and that sort of thing. Then I ran out of time :-) > I don't really care how well CHARGEN (I had to look it up) scales. For > HTTP, it's great for serving static contents from a cache or from the > filesystem, but if that's all you serve, why use Python? Real web apps use > intricate combinations of databases, memcache, in-memory cache, and > template expansion. The biggest difference you can make there is probably > getting rid of the ORM in favor of more direct SQL, and next on the list > would be reimplementing template expansion in C. (And heck, you could > release the GIL while you're doing that. :-) Agree with the general sentiment "if that's all you're doing, why use Python?". The async HTTP server should allow other things to be built on top of it such that it's adding value over and above, say, an apache instance serving static files. > And in hindsight, perhaps I need to put more emphasis on the fact > that it *is* very experimental work with a long-term view, versus > Tulip/asyncio, which was intended for *now*. So although Tulip and > PyParallel spawned from the same discussions and are attempting to > attack the same problem -- it's really not fair for me to discredit > Tulip/Twisted in favor of PyParallel because they're on completely > different playing fields with vastly different implementation time > frames (I'm thinking 5+ years before this work lands in a mainstream > Python release -- if it ever does. And if not, hey, it can live on > as another interpreter, just like Stackless et al). > > I would love it if you could write a list of things a callback *cannot* do > when it is in parallel mode. I believe that list includes mutating any > kind of global/shared state (any object created in the main thread is > read-only in parallel mode -- it seems you had to work hard to make string > interning work, which is semantically transparent but mutates hidden > global state). 
In addition (or, more likely, as a consequence!) a callback > cannot create anything that lasts beyond the callback's lifetime, except > for the brief time between the callback's return and the completion of the > I/O operation involving the return value. (Actually, I missed how you do > this -- doesn't this mean you cannot release the callback's heap until > much later?) So, I think I already answered that above. The next presentation (PyCon Montreal) will be purely focused on this stuff -- I've been beating the alternate approach to async I/O for long enough ;-) > So it seems that the price for extreme concurrency is the same as always > -- you can only run purely functional code. Haskell fans won't mind, but > for Python this seems to be putting the cart before the horse -- who wants > to write Python with those constraints? Basically it's all still work in progress, but the PyParallel-for- parallel-compute use case is very important. And there's no way that can be done without having a way to return the results of parallel computation back into the next stage of your pipeline where more analysis is done. Getting hired by Continuum is actually great for this use case; we're in the big data, parallel task decomposition space, after all, not the writing-async-socket-server business ;-) I know Peter and Travis are both very supportive of PyParallel so its just a matter of trying to find time to work on it between consultancy engagements. Trent. From ncoghlan at gmail.com Mon Nov 18 00:19:38 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 18 Nov 2013 09:19:38 +1000 Subject: [Python-ideas] [Python-Dev] PyParallel: alternate async I/O and GIL removal In-Reply-To: <20131117223431.GA10151@snakebite.org> References: <20131116181328.GB3152@snakebite.org> <20131117022455.GA8127@snakebite.org> <20131117223431.GA10151@snakebite.org> Message-ID: On 18 Nov 2013 08:35, "Trent Nelson" wrote: > > So, on the second iteration, I came up with some new concepts; > context persistence and object promotion. A main-thread list or > dict could be async protected such that this would work: > > # main thread > d1 = {} > l1 = [] > async.protect(d1) > async.protect(l1) > # this would also work > d2 = async.dict() > l2 = async.list() > > # fyi: async.rdtsc() returns a PyLong wrapped > # version of the CPU TSC; handy for generating > # non-interned objects allocated from a parallel > # context > > def callback(name): > d1[name] = async.rdtsc() > l1.append(async.rdtsc()) > > async.submit_work(callback, 'foo') > async.submit_work(callback, 'bar') > async.submit_work(callback, 'moo') > async.run() > > That actually works; the async.protect() call intercepts the > object's tp_as_mapping and tp_as_sequence fields and redirects them > to thread-safe versions that use read/write locks. It also toggles > a persistence bit on both the parallel context and the parallel long > object, such that reference counting *is* actually enabled on it > once it's back in the main thread -- when the ref count drops to 0, > we check to see if it's an object that's been persisted, and if so, > we decref the original context -- when the context's refcount gets > to 0, only *then* do we free it. > > (I also did some stuff where you could promote simple objects > where it made sense -- i.e. there's no need to keep a 4k context > around if the end result was a scalar that could be represented > in 50-200 bytes; just memcpy it from the main thread ("promote > it to a main thread object with reference counting") and free > the context.) 
Sweet, this is basically the Rust memory model, which is the direction I'd hoped you would end up going with this (hence why I was asking if you had looked into the details of Rust at the PyCon language summit). For anyone that hasn't looked at Rust, all variables are thread local by default. There are then two mechanisms for sharing with other threads: ownership transfer and promotion to the shared heap. All of this is baked into the compiler, so things like trying to access an object after sending it to another thread trigger a compile error. PyParallel has the additional complication of remaining compatible with standard code that assumes shared memory by default when running in serial mode, but that appears to be a manageable problem. Cheers, Nick. -------------- next part -------------- An HTML attachment was scrubbed... URL: From robin at reportlab.com Mon Nov 18 13:33:49 2013 From: robin at reportlab.com (Robin Becker) Date: Mon, 18 Nov 2013 12:33:49 +0000 Subject: [Python-ideas] Unicode stdin/stdout In-Reply-To: <5289FE6C.7030007@chamonix.reportlab.co.uk> References: <5286054F.6000707@chamonix.reportlab.co.uk> <1384539385.17784.47988289.02AE655C@webmail.messagingengine.com> <5289FE6C.7030007@chamonix.reportlab.co.uk> Message-ID: <528A092D.4060408@chamonix.reportlab.co.uk> On 18/11/2013 11:47, Robin Becker wrote: ........... > #c:\python33\lib\site-packages\sitecustomize.py > import sys, codecs > sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach()) > sys.stderr = codecs.getwriter("utf-8")(sys.stderr.detach()) ........ it seems that the above needs extra stuff to make some distutils logging work etc etc; so now I'm using sitecustomize.py containing import sys, codecs sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach()) sys.stdout.encoding = 'utf8' sys.stderr = codecs.getwriter("utf-8")(sys.stderr.detach()) sys.stderr.encoding = 'utf8' -- Robin Becker From ncoghlan at gmail.com Mon Nov 18 13:59:16 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 18 Nov 2013 22:59:16 +1000 Subject: [Python-ideas] Unicode stdin/stdout In-Reply-To: <528A092D.4060408@chamonix.reportlab.co.uk> References: <5286054F.6000707@chamonix.reportlab.co.uk> <1384539385.17784.47988289.02AE655C@webmail.messagingengine.com> <5289FE6C.7030007@chamonix.reportlab.co.uk> <528A092D.4060408@chamonix.reportlab.co.uk> Message-ID: On 18 Nov 2013 22:36, "Robin Becker" wrote: > > On 18/11/2013 11:47, Robin Becker wrote: > ........... >> >> #c:\python33\lib\site-packages\sitecustomize.py >> import sys, codecs >> sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach()) >> sys.stderr = codecs.getwriter("utf-8")(sys.stderr.detach()) > > ........ > it seems that the above needs extra stuff to make some distutils logging work etc etc; so now I'm using sitecustomize.py containing > > import sys, codecs > sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach()) > sys.stdout.encoding = 'utf8' > sys.stderr = codecs.getwriter("utf-8")(sys.stderr.detach()) > sys.stderr.encoding = 'utf8' Note that calling detach() on the standard streams isn't officially supported, since it breaks the shadow streams saved in sys.__stderr__, etc. Cheers, Nick. > > -- > Robin Becker > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robin at reportlab.com Mon Nov 18 14:13:45 2013 From: robin at reportlab.com (Robin Becker) Date: Mon, 18 Nov 2013 13:13:45 +0000 Subject: [Python-ideas] Unicode stdin/stdout In-Reply-To: References: <5286054F.6000707@chamonix.reportlab.co.uk> <1384539385.17784.47988289.02AE655C@webmail.messagingengine.com> <5289FE6C.7030007@chamonix.reportlab.co.uk> <528A092D.4060408@chamonix.reportlab.co.uk> Message-ID: <528A1289.2060405@chamonix.reportlab.co.uk> On 18/11/2013 12:59, Nick Coghlan wrote: > On 18 Nov 2013 22:36, "Robin Becker" wrote: ........... > > Note that calling detach() on the standard streams isn't officially > supported, since it breaks the shadow streams saved in sys.__stderr__, etc. > > Cheers, > Nick. ......... it would be easier if we could just have a preferred encoding somewhere, or allow us to mess with the stream encoding encoding itself. I assume we're not allowed to do that for some good reason. -- Robin Becker From victor.stinner at gmail.com Mon Nov 18 16:25:32 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Mon, 18 Nov 2013 16:25:32 +0100 Subject: [Python-ideas] Unicode stdin/stdout In-Reply-To: <528A092D.4060408@chamonix.reportlab.co.uk> References: <5286054F.6000707@chamonix.reportlab.co.uk> <1384539385.17784.47988289.02AE655C@webmail.messagingengine.com> <5289FE6C.7030007@chamonix.reportlab.co.uk> <528A092D.4060408@chamonix.reportlab.co.uk> Message-ID: Why do you need to force the UTF-8 encoding? Your locale is not correctly configured? It's better to set PYTHONIOENCODING rather than replacing sys.stdout/stderr at runtime. There is an open issue to add a TextIOWrapper.set_encoding() method: http://bugs.python.org/issue15216 Victor From random832 at fastmail.us Mon Nov 18 23:30:06 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Mon, 18 Nov 2013 17:30:06 -0500 Subject: [Python-ideas] Unicode stdin/stdout In-Reply-To: <528A092D.4060408@chamonix.reportlab.co.uk> References: <5286054F.6000707@chamonix.reportlab.co.uk> <1384539385.17784.47988289.02AE655C@webmail.messagingengine.com> <5289FE6C.7030007@chamonix.reportlab.co.uk> <528A092D.4060408@chamonix.reportlab.co.uk> Message-ID: <1384813806.25855.49113337.110F19E2@webmail.messagingengine.com> On Mon, Nov 18, 2013, at 7:33, Robin Becker wrote: > UTF-8 stuff This doesn't really solve the issue I was referring to, which is that windows _console_ (i.e. not redirected file or pipe) I/O can only support unicode via wide character (UTF-16) I/O with a special function, not via using byte-based I/O with the normal write function. From abarnert at yahoo.com Tue Nov 19 06:27:41 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 18 Nov 2013 21:27:41 -0800 (PST) Subject: [Python-ideas] Unicode stdin/stdout In-Reply-To: <1384813806.25855.49113337.110F19E2@webmail.messagingengine.com> References: <5286054F.6000707@chamonix.reportlab.co.uk> <1384539385.17784.47988289.02AE655C@webmail.messagingengine.com> <5289FE6C.7030007@chamonix.reportlab.co.uk> <528A092D.4060408@chamonix.reportlab.co.uk> <1384813806.25855.49113337.110F19E2@webmail.messagingengine.com> Message-ID: <1384838861.8937.YahooMailNeo@web184706.mail.ne1.yahoo.com> From: "random832 at fastmail.us" > On Mon, Nov 18, 2013, at 7:33, Robin Becker wrote: >> UTF-8 stuff > > This doesn't really solve the issue I was referring to, which is that > windows _console_ (i.e. 
not redirected file or pipe) I/O can only
> support unicode via wide character (UTF-16) I/O with a special function,
> not via using byte-based I/O with the normal write function.

The problem is that Windows 16-bit I/O doesn't fit into the usual io module
hierarchy. Not because it uses an encoding of UTF-16 (although anyone
familiar with ReadConsoleW/WriteConsoleW from other languages may be a bit
confused that Python's lowest-level wrappers around them deal in byte counts
instead of WCHAR counts), but because you have to use HANDLEs instead of fds.
So, there are going to be some compromises and some complexity.

One possibility is to use as much of the io hierarchy as possible, but not
try to make it flexible enough to be reusable for arbitrary HANDLEs: Add
WindowsFileIO and WindowsConsoleIO classes that implement RawIOBase with a
native HANDLE and ReadFile/WriteFile and ReadConsoleW/WriteConsoleW
respectively. Both work in terms of bytes (which means WindowsConsoleIO.read
has to //2 its argument, and write has to *2 the result). You also need a
create_windows_io function that wraps a HANDLE by calling GetConsoleMode and
constructing a WindowsConsoleIO or WindowsFileIO as appropriate, then creates
a BufferedReader/Writer around that, then constructs a TextIOWrapper with
UTF-16 or the default encoding around that. At startup, you just do that for
the three GetStdHandle handles, and that's your stdin, stdout, and stderr.

Besides not being reusable enough for people who want to wrap HANDLEs from
other libraries or attach to new consoles from Python, it's not clear what
fileno() should return. You could fake it and return the MSVCRT fds that
correspond to the same files as the HANDLEs, but it's possible to end up with
one redirected and not the other (e.g., if you detach the console), and I'm
not sure what happens if you mix and match the two. A more "correct" solution
would be to call _open_osfhandle on the HANDLE (and then keep track of the
fact that os.close closes the HANDLE, or leave it up to the user to deal with
bad handle errors?), but I'm not sure that's any better in practice. Also,
should a console HANDLE use _O_WTEXT for its fd (in which case the user has
to know that he has a _O_WTEXT handle even though there's no way to see that
from Python), or not (in which case he's mixing 8-bit and 16-bit I/O on the
same file)? It might be reasonable to just not expose fileno(); most code
that wants the fileno() for stdin is just going to do something Unix-y that's
not going to work anyway (select it, tcsetattr it, pass it over a socket to
another file, ...).

A different approach would be to reuse as _little_ of io as possible, instead
of as much: Windows stdin/stdout/stderr could each be custom TextIOBase
implementations that work straight on HANDLEs and don't even support buffer
(or detach), much less fileno. That exposes even less functionality to users,
of course. It also means we need a parallel implementation of all the
buffering logic. (On the other hand, it also leaves the door open to expose
some Windows functionality, like async ReadFileEx/WriteFileEx, in a way that
would be very hard through the normal layers?)

It shouldn't be too hard to write most of these via an extension module or
ctypes to experiment with it. As long as you're careful not to mix
winsys.stdout and sys.stdout (the module could even set sys.stdin,
sys.stdout, sys.stderr=stdin, stdout, stderr at import time, or just del
them, for a bit of protection), it should work.
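
For what it's worth, here is a rough, untested ctypes sketch of that first
approach, just to make it concrete. WindowsConsoleIO, create_windows_io and
the halving/doubling of byte counts are the hypothetical pieces described
above, not an existing API, and all error handling is omitted:

import ctypes
import io

kernel32 = ctypes.windll.kernel32                   # Windows only
kernel32.GetStdHandle.restype = ctypes.c_void_p     # HANDLE is pointer-sized

STD_INPUT_HANDLE, STD_OUTPUT_HANDLE, STD_ERROR_HANDLE = -10, -11, -12

class WindowsConsoleIO(io.RawIOBase):
    """Raw I/O over a console HANDLE via ReadConsoleW/WriteConsoleW.

    Byte counts are halved/doubled at this boundary so the rest of the io
    stack keeps thinking in bytes while the console deals in UTF-16 code
    units.
    """
    def __init__(self, handle, readable):
        self._handle = handle
        self._readable = readable

    def readable(self):
        return self._readable

    def writable(self):
        return not self._readable

    def readinto(self, b):
        nchars = len(b) // 2                        # bytes -> WCHARs
        buf = ctypes.create_unicode_buffer(nchars)
        nread = ctypes.c_ulong()
        kernel32.ReadConsoleW(self._handle, buf, nchars,
                              ctypes.byref(nread), None)
        data = buf[:nread.value].encode('utf-16-le')
        b[:len(data)] = data
        return len(data)

    def write(self, b):
        text = bytes(b).decode('utf-16-le')
        nwritten = ctypes.c_ulong()
        kernel32.WriteConsoleW(self._handle, text, len(text),
                               ctypes.byref(nwritten), None)
        return nwritten.value * 2                   # WCHARs -> bytes

def create_windows_io(std_handle, fd, readable):
    """Wrap a standard HANDLE in raw/buffered/text layers."""
    handle = kernel32.GetStdHandle(std_handle)
    mode = ctypes.c_ulong()
    if kernel32.GetConsoleMode(handle, ctypes.byref(mode)):
        raw, encoding = WindowsConsoleIO(handle, readable), 'utf-16-le'
    else:
        # Redirected to a file or pipe: plain byte I/O on the CRT fd.
        raw, encoding = io.FileIO(fd, 'r' if readable else 'w',
                                  closefd=False), None
    buffered = io.BufferedReader(raw) if readable else io.BufferedWriter(raw)
    return io.TextIOWrapper(buffered, encoding=encoding, line_buffering=True)

# e.g. sys.stdout = create_windows_io(STD_OUTPUT_HANDLE, 1, readable=False)

The interesting part is really just readinto() and write(); the buffering and
the str/bytes conversion come from the stock BufferedReader/Writer and
TextIOWrapper layers sitting on top.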
It might be worth implementing a few different designs to play with, and putting them through their paces with some modules and scripts that do different things with stdio (including running the scripts with cmd.exe redirected I/O and with subprocess PIPEs) to see which ones have problems or limitations that are hard to foresee in advance. If you have a design that you think sounds good, and are willing to experiment the hell out of it, and don't know how to get started but would be willing to debug and finish a mostly-written/almost-working implementation, I could slap something together with ctypes to get you started. From drekin at gmail.com Tue Nov 19 10:46:30 2013 From: drekin at gmail.com (drekin at gmail.com) Date: Tue, 19 Nov 2013 01:46:30 -0800 (PST) Subject: [Python-ideas] Unicode stdin/stdout (was: Re: python 3.3repr) In-Reply-To: <1384838861.8937.YahooMailNeo@web184706.mail.ne1.yahoo.com> Message-ID: <528b3376.6f26b40a.5adf.ffff90f3@mx.google.com> I really wanted to be able to write Unicode in Windows console, so I have written the following code: http://bugs.python.org/file31756/streams.py (based on other samples of code I found), in connection with http://bugs.python.org/issue1602 . It addresses the design of using as much as possible from io hierarchy. It definitely doesn't cover many details but I actually use it in my interactive console. Regards, Drekin From random832 at fastmail.us Tue Nov 19 18:08:23 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Tue, 19 Nov 2013 12:08:23 -0500 Subject: [Python-ideas] Unicode stdin/stdout (was: Re: python 3.3repr) In-Reply-To: <528b3376.6f26b40a.5adf.ffff90f3@mx.google.com> References: <528b3376.6f26b40a.5adf.ffff90f3@mx.google.com> Message-ID: <1384880903.20250.49468621.5BD34917@webmail.messagingengine.com> On Tue, Nov 19, 2013, at 4:46, drekin at gmail.com wrote: > I really wanted to be able to write Unicode in Windows console, so I have > written the following code: http://bugs.python.org/file31756/streams.py > (based on other samples of code I found), in connection with > http://bugs.python.org/issue1602 . It addresses the design of using as > much as possible from io hierarchy. It definitely doesn't cover many > details but I actually use it in my interactive console. For this whole tactic of "using as much as possible from the io hierarchy" and acting like a raw stream that reads UTF-16 bytes, I'm worried that at some point it's going to run into something that tries to read a single byte - to which your code will return 0. From drekin at gmail.com Tue Nov 19 18:31:46 2013 From: drekin at gmail.com (Draic Kin) Date: Tue, 19 Nov 2013 18:31:46 +0100 Subject: [Python-ideas] Unicode stdin/stdout (was: Re: python 3.3repr) Message-ID: (Sorry for posting twice, forgot to add the list to Cc.) On Tue, Nov 19, 2013 at 6:08 PM, wrote: > On Tue, Nov 19, 2013, at 4:46, drekin at gmail.com wrote: > > I really wanted to be able to write Unicode in Windows console, so I have > > written the following code: http://bugs.python.org/file31756/streams.py > > (based on other samples of code I found), in connection with > > http://bugs.python.org/issue1602 . It addresses the design of using as > > much as possible from io hierarchy. It definitely doesn't cover many > > details but I actually use it in my interactive console. 
> > For this whole tactic of "using as much as possible from the io > hierarchy" and acting like a raw stream that reads UTF-16 bytes, I'm > worried that at some point it's going to run into something that tries > to read a single byte - to which your code will return 0. > Well, if you use the textio object than there are no problems and you couldn't use buffered or raw io at all with the other approach. If you try to read one byte from raw io, it just returns b''. More serious problems occur with buffered io if you somehow get a buffer of odd length. In that case flush call ends in infinity loop. Some of these could be solved by adding some checking code to buffered io which will raise an exception if you try to do something bad. -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Tue Nov 19 18:51:21 2013 From: greg at krypto.org (Gregory P. Smith) Date: Tue, 19 Nov 2013 09:51:21 -0800 Subject: [Python-ideas] making a module callable Message-ID: While I don't ordinarily endorse this use case it'd be nice not to require hacks involving sys.modules monkeying such as: https://github.com/has207/flexmock/pull/89 specifically: https://github.com/has207/flexmock/commit/bd47fa8189c7dff349de257c0e061b9fcea2330d to make a module callable. This obviously touches on the larger ideas of what is a module vs a class and why are they different given that they're "just" namespaces. (sorry. not offering ideas myself just hoping others have them) -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Tue Nov 19 18:55:30 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Tue, 19 Nov 2013 20:55:30 +0300 Subject: [Python-ideas] A nice __repr__ for the ast.* classes? In-Reply-To: References: Message-ID: On Wed, May 1, 2013 at 6:51 AM, Nick Coghlan wrote: > > In this particular case, I'm also inclined to favour the approach of > using ast.dump as the repr for AST nodes. Call it +0. > > Adding support for depth limiting and peer node limiting to ast.dump > (with missing nodes replaced with "...") would also be neat. https://bitbucket.org/techtonik/astdump/src/ed73edbdccb5fd9d4255d7ed64f45b6922447ecb/astdump.py?at=default#cl-35 Something like this? -- anatoly t. From flying-sheep at web.de Tue Nov 19 19:09:13 2013 From: flying-sheep at web.de (Philipp A.) Date: Tue, 19 Nov 2013 19:09:13 +0100 Subject: [Python-ideas] making a module callable In-Reply-To: References: Message-ID: 2013/11/19 Gregory P. Smith While I don't ordinarily endorse this use case it'd be nice not to require > hacks involving sys.modules monkeying such ashttps://github.com/has207/flexmock/pull/89to make a module callable. > > This obviously touches on the larger ideas of what is a module vs a class > and why are they different given that they're "just" namespaces. > hmm, interesting thought. there are some modules who just have one single main use (pprint) and could profit from that. imho it would simplify the situation. currently, everything is callable that has a __call__ property which is itself callable: class Foo(): __call__(self): pass def bar(): pass foo = Foo() foo() foo.bar() def baz(): pass foo.baz = baz foo.baz() except modules. which feels is imho like an artificial limitation. ? phil -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fuzzyman at gmail.com Tue Nov 19 19:19:38 2013 From: fuzzyman at gmail.com (Michael Foord) Date: Tue, 19 Nov 2013 18:19:38 +0000 Subject: [Python-ideas] making a module callable In-Reply-To: References: Message-ID: On 19 November 2013 18:09, Philipp A. wrote: > 2013/11/19 Gregory P. Smith > > While I don't ordinarily endorse this use case it'd be nice not to require >> hacks involving sys.modules monkeying such ashttps://github.com/has207/flexmock/pull/89to make a module callable. >> >> This obviously touches on the larger ideas of what is a module vs a class >> and why are they different given that they're "just" namespaces. >> > hmm, interesting thought. there are some modules who just have one > single main use (pprint) and could profit from that. > > imho it would simplify the situation. currently, everything is callable > that has a __call__ property which is itself callable: > Not quite true: Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 01:25:11) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin Type "help", "copyright", "credits" or "license" for more information. >>> class Foo: pass ... >>> f = Foo() >>> f.__call__ = lambda *args: 3 >>> >>> f() Traceback (most recent call last): File "", line 1, in TypeError: 'Foo' object is not callable This is why module objects are not callable even if they have a __call__. They are *instances* of ModuleType and the __call__ method is looked up on their type, not the instance itself. So modules not being callable even when they a __call__ is not an anomaly, even if it is not convenient sometimes. Michael > class Foo(): > __call__(self): > pass > def bar(): > pass > > foo = Foo() > foo() > foo.bar() > def baz(): > pass > > foo.baz = baz > foo.baz() > > except modules. which feels is imho like an artificial limitation. > > ? phil > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From drekin at gmail.com Tue Nov 19 19:23:00 2013 From: drekin at gmail.com (Draic Kin) Date: Tue, 19 Nov 2013 19:23:00 +0100 Subject: [Python-ideas] Unicode stdin/stdout Message-ID: On Tue, Nov 19, 2013 at 7:02 PM, wrote: > On Tue, Nov 19, 2013, at 12:28, Draic Kin wrote: > > Well, if you use the textio object than there are no problems and you > > couldn't use buffered or raw io at all with the other approach. If you > > try > > to read one byte from raw io, it just returns b''. > > I get that, but my understanding is that this implies EOF. > Does it? If so then rawio.read(1) should raise en exception since other options are to lose one byte or return more bytes than demanded which seem to be unacceptable. -------------- next part -------------- An HTML attachment was scrubbed... URL: From flying-sheep at web.de Tue Nov 19 22:01:14 2013 From: flying-sheep at web.de (Philipp A.) Date: Tue, 19 Nov 2013 22:01:14 +0100 Subject: [Python-ideas] making a module callable In-Reply-To: References: Message-ID: 2013/11/19 Michael Foord On 19 November 2013 18:09, Philipp A. wrote: >> >> imho it would simplify the situation. 
currently, everything is callable >> that has a __call__ property which is itself callable: >> > This is why module objects are not callable even if they have a __call__. > They are *instances* of ModuleType and the __call__ method is looked up on > their type, not the instance itself. So modules not being callable even > when they a __call__ is not an anomaly, even if it is not convenient > sometimes. > you?re right, apologies. so the hack consists of switching a module?s class during runtime? there?s also another hack, calldules, making that automatic (funnily via implicits effects when doing import calldules). note that it isn?t serious! just a programming exercise. -------------- next part -------------- An HTML attachment was scrubbed... URL: From haoyi.sg at gmail.com Tue Nov 19 22:39:19 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Tue, 19 Nov 2013 13:39:19 -0800 Subject: [Python-ideas] making a module callable In-Reply-To: References: Message-ID: > there are some modules who just have one single main use (pprint) and could profit from that. A milion times this! pprint.pprint() time.time() random.random() copy.copy() md5.md5() timeit.timeit() glob.glob() cStringIO.cStringIO() StringIO.StringIO() On Tue, Nov 19, 2013 at 1:01 PM, Philipp A. wrote: > 2013/11/19 Michael Foord > > On 19 November 2013 18:09, Philipp A. wrote: >> >>> imho it would simplify the situation. currently, everything is callable >>> that has a __call__ property which is itself callable: >>> >> This is why module objects are not callable even if they have a __call__. >> They are *instances* of ModuleType and the __call__ method is looked up on >> their type, not the instance itself. So modules not being callable even >> when they a __call__ is not an anomaly, even if it is not convenient >> sometimes. >> > you?re right, apologies. so the hack consists of switching a module?s > class during runtime? > > there?s also another hack, calldules, > making that automatic (funnily via implicits effects when doing import > calldules). note that it isn?t serious! just a programming exercise. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From grosser.meister.morti at gmx.net Tue Nov 19 23:05:48 2013 From: grosser.meister.morti at gmx.net (=?windows-1252?Q?Mathias_Panzenb=F6ck?=) Date: Tue, 19 Nov 2013 23:05:48 +0100 Subject: [Python-ideas] making a module callable In-Reply-To: References: Message-ID: <528BE0BC.40304@gmx.net> Maybe the solution would be to make it possible to "return" something else than a module object, similar to how node.js does it? In node.js: callme.js: module.exports = function (a) { console.log("a:",a); }; > require("./callme.js")("b") a: b undefined > Possible way to do it in Python *already*: callme.py: import sys sys.modules[__name__] = lambda a: print("a:",a) >>> import callme >>> callme('b') a: b >>> Maybe not very nice, but is there a reason why not to do this (except for it's ugliness)? On 11/19/2013 10:39 PM, Haoyi Li wrote: > > there are some modules who just have one single main use (pprint) and could profit from that. > > A milion times this! > > pprint.pprint() > time.time() > random.random() > copy.copy() > md5.md5() > timeit.timeit() > glob.glob() > cStringIO.cStringIO() > StringIO.StringIO() > > > > On Tue, Nov 19, 2013 at 1:01 PM, Philipp A. 
> wrote: > > 2013/11/19 Michael Foord > > > On 19 November 2013 18:09, Philipp A. > wrote: > > imho it would simplify the situation. currently, everything is callable that has a |__call__| property which > is itself callable: > > This is why module objects are not callable even if they have a __call__. They are *instances* of ModuleType and > the __call__ method is looked up on their type, not the instance itself. So modules not being callable even when > they a __call__ is not an anomaly, even if it is not convenient sometimes. > > you?re right, apologies. so the hack consists of switching a module?s class during runtime? > > there?s also another hack, calldules , making that automatic (funnily via > implicits effects when doing |import calldules|). note that it isn?t serious! just a programming exercise. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > From g.brandl at gmx.net Tue Nov 19 23:07:14 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 19 Nov 2013 23:07:14 +0100 Subject: [Python-ideas] making a module callable In-Reply-To: References: Message-ID: Am 19.11.2013 22:39, schrieb Haoyi Li: >> there are some modules who just have one single main use (pprint) and could > profit from that. > > A milion times this! > > pprint.pprint() > time.time() > random.random() > copy.copy() > md5.md5() > timeit.timeit() > glob.glob() > cStringIO.cStringIO() > StringIO.StringIO() No, sorry. Modules should be nothing more than modules: collections of APIs, not APIs themselves. If you want to write copy(), use import-from. Georg From ncoghlan at gmail.com Tue Nov 19 23:29:52 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 20 Nov 2013 08:29:52 +1000 Subject: [Python-ideas] making a module callable In-Reply-To: <528BE0BC.40304@gmx.net> References: <528BE0BC.40304@gmx.net> Message-ID: On 20 Nov 2013 08:06, "Mathias Panzenb?ck" wrote: > > Maybe the solution would be to make it possible to "return" something else than a module object, similar to how node.js does it? Already supported, modules just have to replace themselves with an instance of a custom class in sys.modules. PEP 451 makes it even easier to write custom finders and loaders that return custom module types. Cheers, Nick. > > In node.js: > > callme.js: > module.exports = function (a) { console.log("a:",a); }; > > > require("./callme.js")("b") > a: b > undefined > > > > Possible way to do it in Python *already*: > > callme.py: > import sys > sys.modules[__name__] = lambda a: print("a:",a) > > >>> import callme > >>> callme('b') > a: b > >>> > > > Maybe not very nice, but is there a reason why not to do this (except for it's ugliness)? > > > > On 11/19/2013 10:39 PM, Haoyi Li wrote: >> >> > there are some modules who just have one single main use (pprint) and could profit from that. >> >> A milion times this! >> >> pprint.pprint() >> time.time() >> random.random() >> copy.copy() >> md5.md5() >> timeit.timeit() >> glob.glob() >> cStringIO.cStringIO() >> StringIO.StringIO() >> >> >> >> On Tue, Nov 19, 2013 at 1:01 PM, Philipp A. > wrote: >> >> 2013/11/19 Michael Foord > >> >> >> On 19 November 2013 18:09, Philipp A. > wrote: >> >> imho it would simplify the situation. 
currently, everything is callable that has a |__call__| property which >> is itself callable: >> >> This is why module objects are not callable even if they have a __call__. They are *instances* of ModuleType and >> the __call__ method is looked up on their type, not the instance itself. So modules not being callable even when >> they a __call__ is not an anomaly, even if it is not convenient sometimes. >> >> you?re right, apologies. so the hack consists of switching a module?s class during runtime? >> >> there?s also another hack, calldules < https://pypi.python.org/pypi/calldules>, making that automatic (funnily via >> implicits effects when doing |import calldules|). note that it isn?t serious! just a programming exercise. >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> >> >> >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Tue Nov 19 23:37:57 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 19 Nov 2013 17:37:57 -0500 Subject: [Python-ideas] making a module callable In-Reply-To: References: Message-ID: On 11/19/2013 4:39 PM, Haoyi Li wrote: > 2013/11/19 Michael Foord > > > > On 19 November 2013 18:09, Philipp A. > > wrote: > > imho it would simplify the situation. currently, everything > is callable that has a |__call__| property which is itself > callable: > > This is why module objects are not callable even if they have a > __call__. They are *instances* of ModuleType and the __call__ > method is looked up on their type, not the instance itself. So > modules not being callable even when they a __call__ is not an > anomaly, even if it is not convenient sometimes. In order to make modules callable, ModuleType must have a __call__ method. In order to make the call execute code in the module, that method should delegate to a callable in the module instance that has a known special name, such as __main__. class ModuleType(): def __call__(self, *args, **kwds): return self.__main__(*args, **kwds) Doc: "The __main__ object of a module is its main callable, the one that is called if the module is called without specifying anything else. If this were done... > Someone > >there are some modules who just have one single main use (pprint) > > and could profit from that. > pprint.pprint() then adding __main__ = pprint to pprint should make the following work: import pprint; pprint(ob) > time.time() > random.random() > copy.copy() > md5.md5() > timeit.timeit() > glob.glob() > cStringIO.cStringIO() > StringIO.StringIO() etc -- Terry Jan Reedy From rosuav at gmail.com Wed Nov 20 00:20:51 2013 From: rosuav at gmail.com (Chris Angelico) Date: Wed, 20 Nov 2013 10:20:51 +1100 Subject: [Python-ideas] making a module callable In-Reply-To: References: Message-ID: On Wed, Nov 20, 2013 at 9:37 AM, Terry Reedy wrote: > In order to make modules callable, ModuleType must have a __call__ method. > In order to make the call execute code in the module, that method should > delegate to a callable in the module instance that has a known special name, > such as __main__. 
> > class ModuleType(): > def __call__(self, *args, **kwds): > return self.__main__(*args, **kwds) Hmm Classes allow you to control the metaclass. Should modules allow such a declaration? That would make this sort of thing fully customizable. But is there any way to avoid the chicken-and-egg problem of trying to logically put that into the same source file as the module whose metaclass is being changed? Considering that the creation of a class involves building up its dictionary of contents and _then_ calling type(), it could in theory be possible to build up a dictionary of module contents, possibly find something with a magic name like __metamodule__, and then use that as the module's type. But this might become rather convoluted. ChrisA From random832 at fastmail.us Wed Nov 20 17:12:42 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Wed, 20 Nov 2013 11:12:42 -0500 Subject: [Python-ideas] making a module callable In-Reply-To: References: Message-ID: <1384963962.22956.49921225.192836EA@webmail.messagingengine.com> On Tue, Nov 19, 2013, at 18:20, Chris Angelico wrote: > But is there any way to avoid the chicken-and-egg problem of trying to > logically put that into the same source file as the module whose > metaclass is being changed? Considering that the creation of a class > involves building up its dictionary of contents and _then_ calling > type(), it could in theory be possible to build up a dictionary of > module contents, possibly find something with a magic name like > __metamodule__, and then use that as the module's type. But this might > become rather convoluted. Isn't this how metaclasses used to actually work? From rosuav at gmail.com Wed Nov 20 17:15:54 2013 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 21 Nov 2013 03:15:54 +1100 Subject: [Python-ideas] making a module callable In-Reply-To: <1384963962.22956.49921225.192836EA@webmail.messagingengine.com> References: <1384963962.22956.49921225.192836EA@webmail.messagingengine.com> Message-ID: On Thu, Nov 21, 2013 at 3:12 AM, wrote: > On Tue, Nov 19, 2013, at 18:20, Chris Angelico wrote: >> But is there any way to avoid the chicken-and-egg problem of trying to >> logically put that into the same source file as the module whose >> metaclass is being changed? Considering that the creation of a class >> involves building up its dictionary of contents and _then_ calling >> type(), it could in theory be possible to build up a dictionary of >> module contents, possibly find something with a magic name like >> __metamodule__, and then use that as the module's type. But this might >> become rather convoluted. > > Isn't this how metaclasses used to actually work? Is it? I've no idea. I haven't been doing complex stuff in Python all that long, so I don't know any of the details back past about 2.5 or so (apart from what I've learned since). ChrisA From ericsnowcurrently at gmail.com Wed Nov 20 21:14:37 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 20 Nov 2013 13:14:37 -0700 Subject: [Python-ideas] making a module callable In-Reply-To: References: Message-ID: On Nov 19, 2013 10:52 AM, "Gregory P. Smith" wrote: > > While I don't ordinarily endorse this use case it'd be nice not to require hacks involving sys.modules monkeying such as: > > https://github.com/has207/flexmock/pull/89 > specifically: > https://github.com/has207/flexmock/commit/bd47fa8189c7dff349de257c0e061b9fcea2330d > > to make a module callable. 
> > This obviously touches on the larger ideas of what is a module vs a class and why are they different given that they're "just" namespaces. What's the use case for a callable module? In the flexmock example, is it just so they can do an import instead of a from..import? As Georg said, modules are just top-level namespaces, API containers. Importing the callable you want out of a module is easy. However, the underlying idea is something that has come up before and may be worth more consideration. tl;dr: __metamodule__ (pre-bikeshedding) would be a good way to go, but isn't worth it and may be an attractive nuisance. If we are going to support customization of module classes, I'd rather we do it via a general API (e.g. Chris's __metamodule__) than piecemeal (via special-casing __call__, etc.). However, you can already use a custom module type in the two ways that Nick mentioned, the first of which flexmock is doing (and Django does IIRC). Sticking something into sys.modules to replace the currently executing module is indeed a hack. The import system accommodates this not by design (unless someone is willing to come forward and admit guilt ) but mostly as an incidental implementation artifact of the import machinery from many releases ago. [1] As Nick mentioned, PEP 451 introduces an optional create_module() method on loaders that returns the module object to use during loading. This is nice if you are already writing a loader. Otherwise it's a pain (the import hook machinery isn't exactly simple) and usually won't be worth your time. Furthermore, your loader will probably be applied to multiple modules (which may be what you need). It certainly isn't a one-off, add-something-to-the-affected-module sort of thing. Basically, having to write a loader and plug it in is like (only more complicated) having to use a metaclass just to implement a __prepare_class__() that returns an OrderedDict, all so you can have an ordered class namespace. Loader.create_module() is a good addition, but is too low level to use as a replacement for the sys.modules hack. In contrast, something like __metamodule__ would be an effective replacement. It would be similar in spirit and in syntax to __init_class__ in PEP 422 (and __metaclass__ in Python 2), defined at the top of the module and used for the module. The thing that appeals to me is that we could deprecate the sys.modules hack. :) The big question is, is having a custom module type a common enough need? To me the desire for it usually implies a misunderstanding of the purpose of modules. If we had an easier API would it be an attractive nuisance? Unless it's a big win I don't think it's a good idea, and I'm not convinced it's common enough a need. -eric [1] A module replacing itself in sys.modules came up during the importlib bootstrap integration, where it required adding yet another special-case backward-compatibility pain point to the importlib implementation. I can't find the actual email, but I refer to what happened in http://bugs.python.org/msg166630, note "[3]". It certainly surprised us that you could do it and that people actually were. At this point I guess the latter shouldn't have been surprising. 
:) From ethan at stoneleaf.us Wed Nov 20 21:23:14 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 20 Nov 2013 12:23:14 -0800 Subject: [Python-ideas] making a module callable In-Reply-To: References: Message-ID: <528D1A32.6030206@stoneleaf.us> On 11/20/2013 12:14 PM, Eric Snow wrote: > > Sticking something into sys.modules to replace the currently executing > module is indeed a hack. The import system accommodates this not by > design (unless someone is willing to come forward and admit guilt > ) but mostly as an incidental implementation artifact of the > import machinery from many releases ago. [1] Actually, it is intentional. An excerpt from https://mail.python.org/pipermail/python-ideas/2012-May/014969.html > There is actually a hack that is occasionally used and recommended: > a module can define a class with the desired functionality, and then > at the end, replace itself in sys.modules with an instance of that > class (or with the class, if you insist, but that's generally less > useful). E.g.: > > # module foo.py > > import sys > > class Foo: > def funct1(self, ): > def funct2(self, ): > > sys.modules[__name__] = Foo() > > This works because the import machinery is actively enabling this > hack, and as its final step pulls the actual module out of > sys.modules, after loading it. (This is no accident. The hack was > proposed long ago and we decided we liked enough to support it in > the import machinery.) -- ~Ethan~ From ericsnowcurrently at gmail.com Wed Nov 20 22:01:28 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Wed, 20 Nov 2013 14:01:28 -0700 Subject: [Python-ideas] making a module callable In-Reply-To: <528D1A32.6030206@stoneleaf.us> References: <528D1A32.6030206@stoneleaf.us> Message-ID: On Wed, Nov 20, 2013 at 1:23 PM, Ethan Furman wrote: > On 11/20/2013 12:14 PM, Eric Snow wrote: >> Sticking something into sys.modules to replace the currently executing >> module is indeed a hack. The import system accommodates this not by >> design (unless someone is willing to come forward and admit guilt >> ) but mostly as an incidental implementation artifact of the >> import machinery from many releases ago. [1] > > > Actually, it is intentional. An excerpt from > https://mail.python.org/pipermail/python-ideas/2012-May/014969.html > >> There is actually a hack that is occasionally used and recommended: >> a module can define a class with the desired functionality, and then >> at the end, replace itself in sys.modules with an instance of that >> class (or with the class, if you insist, but that's generally less >> useful). E.g.: >> >> # module foo.py >> >> import sys >> >> class Foo: >> def funct1(self, ): >> def funct2(self, ): >> >> sys.modules[__name__] = Foo() >> >> This works because the import machinery is actively enabling this >> hack, and as its final step pulls the actual module out of >> sys.modules, after loading it. (This is no accident. The hack was >> proposed long ago and we decided we liked enough to support it in >> the import machinery.) I stand corrected. :) -eric From haoyi.sg at gmail.com Wed Nov 20 22:14:05 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Wed, 20 Nov 2013 13:14:05 -0800 Subject: [Python-ideas] making a module callable In-Reply-To: <528D1A32.6030206@stoneleaf.us> References: <528D1A32.6030206@stoneleaf.us> Message-ID: > Sticking something into sys.modules to replace the currently executing module is indeed a hack. 
The import system accommodates this not by design (unless someone is willing to come forward and admit guilt ) but mostly as an incidental implementation artifact of the import machinery from many releases ago. Yeah, and the fact that people are jumping throw these hoops and doing "nasty hacks" despite their nastiness means that there's a real need for the functionality. If it was easy and people did it, then we don't learn anything, same if it's difficult and people don't do it. On the other hand, if a feature is easy and people don't do it, then maybe that feature deserves to be deprecated/made less easy. Similarly, if it's difficult/nasty/hacky and you find people doing it anyway, then the functionality probably deserves to be made easier to use. The hackiness if an artifact of the way things are now, but this whole thread is about changing the way things are now. We should be shaping the machinery to fit what people do, rather than trying to shape people to fit the machinery which was arbitrarily designed a long time ago. On Wed, Nov 20, 2013 at 12:23 PM, Ethan Furman wrote: > On 11/20/2013 12:14 PM, Eric Snow wrote: > >> >> Sticking something into sys.modules to replace the currently executing >> module is indeed a hack. The import system accommodates this not by >> design (unless someone is willing to come forward and admit guilt >> ) but mostly as an incidental implementation artifact of the >> import machinery from many releases ago. [1] >> > > Actually, it is intentional. An excerpt from https://mail.python.org/ > pipermail/python-ideas/2012-May/014969.html > > There is actually a hack that is occasionally used and recommended: >> a module can define a class with the desired functionality, and then >> at the end, replace itself in sys.modules with an instance of that >> class (or with the class, if you insist, but that's generally less >> useful). E.g.: >> >> # module foo.py >> >> import sys >> >> class Foo: >> def funct1(self, ): >> def funct2(self, ): >> >> sys.modules[__name__] = Foo() >> >> This works because the import machinery is actively enabling this >> hack, and as its final step pulls the actual module out of >> sys.modules, after loading it. (This is no accident. The hack was >> proposed long ago and we decided we liked enough to support it in >> the import machinery.) >> > > -- > ~Ethan~ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Thu Nov 21 02:57:11 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 20 Nov 2013 17:57:11 -0800 Subject: [Python-ideas] making a module callable In-Reply-To: References: Message-ID: On Nov 20, 2013, at 12:14, Eric Snow wrote: > In contrast, something like __metamodule__ would be an effective > replacement. It would be similar in spirit and in syntax to > __init_class__ in PEP 422 (and __metaclass__ in Python 2), defined at > the top of the module and used for the module. The thing that appeals > to me is that we could deprecate the sys.modules hack. :) Given that __metaclass__ was removed in Python 3, doesn't "this is an exact parallel to __metaclass__" argue against the idea, rather than for? Or at least against the name? (Maybe __init_module__?) 
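
To make that concrete, here's a minimal sketch of the "doing things today"
version, using nothing but the sys.modules replacement already quoted in this
thread plus Michael's point that __call__ has to live on the type.
CallableModule and callmod are made-up names, and this is only an
illustration, not a proposal:

# callmod.py
import sys
import types

class CallableModule(types.ModuleType):
    def __call__(self, *args, **kwargs):
        # Found on the type, not the instance, so the module becomes callable.
        return self.main(*args, **kwargs)

def main(text):
    return "called with " + text

# Build a real module object (so attribute access still works), copy this
# module's namespace into it, and let the import machinery hand it out.
_mod = CallableModule(__name__, __doc__)
_mod.__dict__.update(sys.modules[__name__].__dict__)
sys.modules[__name__] = _mod

and then:

>>> import callmod
>>> callmod("hi")
'called with hi'
>>> callmod.main("hi")   # ordinary attribute access is unaffected
'called with hi'
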
Anyway, I think a module replacing itself with something callable is both more flexible and more in line with the way people are actually doing things today, so maybe a "less hacky" way to do the sys.modules hack is what people actually want here. From ethan at stoneleaf.us Thu Nov 21 03:40:07 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 20 Nov 2013 18:40:07 -0800 Subject: [Python-ideas] making a module callable In-Reply-To: References: Message-ID: <528D7287.7040305@stoneleaf.us> On 11/20/2013 05:57 PM, Andrew Barnert wrote: > On Nov 20, 2013, at 12:14, Eric Snow wrote: > >> In contrast, something like __metamodule__ would be an effective >> replacement. It would be similar in spirit and in syntax to >> __init_class__ in PEP 422 (and __metaclass__ in Python 2), defined at >> the top of the module and used for the module. The thing that appeals >> to me is that we could deprecate the sys.modules hack. :) > > Given that __metaclass__ was removed in Python 3, doesn't "this is an > exact parallel to __metaclass__" argue against the idea, rather than > for? Or at least against the name? (Maybe __init_module__?) > > Anyway, I think a module replacing itself with something callable is > both more flexible and more in line with the way people are actually > doing things today, so maybe a "less hacky" way to do the sys.modules > hack is what people actually want here. sys.modules is a dictionary. dict[name] = something is the normal way to set values to keys. What is so horrible about this idiom? -- ~Ethan~ From ncoghlan at gmail.com Thu Nov 21 04:44:17 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 21 Nov 2013 13:44:17 +1000 Subject: [Python-ideas] making a module callable In-Reply-To: <528D7287.7040305@stoneleaf.us> References: <528D7287.7040305@stoneleaf.us> Message-ID: On 21 Nov 2013 13:02, "Ethan Furman" wrote: > > On 11/20/2013 05:57 PM, Andrew Barnert wrote: > >> On Nov 20, 2013, at 12:14, Eric Snow wrote: >> >>> In contrast, something like __metamodule__ would be an effective >>> replacement. It would be similar in spirit and in syntax to >>> __init_class__ in PEP 422 (and __metaclass__ in Python 2), defined at >>> the top of the module and used for the module. The thing that appeals >>> to me is that we could deprecate the sys.modules hack. :) >> >> >> Given that __metaclass__ was removed in Python 3, doesn't "this is an >> exact parallel to __metaclass__" argue against the idea, rather than >> for? Or at least against the name? (Maybe __init_module__?) >> >> Anyway, I think a module replacing itself with something callable is >> both more flexible and more in line with the way people are actually >> doing things today, so maybe a "less hacky" way to do the sys.modules >> hack is what people actually want here. > > > sys.modules is a dictionary. > > dict[name] = something > > is the normal way to set values to keys. > > What is so horrible about this idiom? It potentially causes problems for module reloading and it definitely causes problems for the import engine PEP (since sys.modules is process global state). I expect we'll revisit this later in the 3.5 development cycle (we haven't even merged the accepted PEP 451 for 3.4 yet), but formalising the current module replacement idiom is already a more likely outcome than making module instances callable. Cheers, Nick. 
> > -- > ~Ethan~ > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From ppershing at gmail.com Thu Nov 21 11:55:46 2013 From: ppershing at gmail.com (=?UTF-8?B?UGVyZcWhw61uaSBQZXRlcg==?=) Date: Thu, 21 Nov 2013 11:55:46 +0100 Subject: [Python-ideas] Dart-like method cascading operator in Python Message-ID: Recently I fell in love with Dart method cascading operator .. to the degree that I find it really convenient and I am missing it in the Python. The cascade operator myObj ..setX(5) ..y=6 is the syntactic sugar equivalent of the following: tmp = myObj tmp.setX(5) tmp.y=6 This can be used to greatly simplify instantiating objects and can also enable creating domain specific languages in Python. In particular, we can make this much more powerful in Python (as opposed to Dart) because Python recognizes scope by indentation and therefore it would be possible to do something like gnuplot.newPlot() ..set("xrange [0:5]") ..set("yrange [0:20]") ..newPlot() ..addSeries("Linear", [1,2,3]) ..addSeries("Quadratic", [1,4,6]) ..run() or template = HtmlTemplate() ..head() ..script(src="xyz") ..body() ..div(class="main") ..paragraph("I am the first paragraph") ..paragraph("I am the second paragraph") Note that this is strictly better than method chaining, e.g. template = MyObject().setX(1).setY(2) because a) method chaining is hard to write on multiple lines, the only two options are template = MyObject() \ .setX(1) \ .setY(2) which is fragile (explicit line continuation is discouraged in Python) or (template = MyObject() .setX(1) .setY(2)) which looks weird. b) method chaining cannot take advantage of multiple scoping, e.g. the only way to write Gnuplot example is (Gnuplot() .set("...") .newPlot() .addSeries() .addSeries() *.endPlot()* .run() ) with added method *endPlot()* to the Plot class. Moreover, Plot class now needs a direct reference to its parent (e.g. Gnuplot) so it can return its reference. This adds unnecessary cyclic dependencies. c) method chaining needs specialized API, e.g. each method needs to return self d) method chaining cannot be used with attributes, e.g. there is no equivalent to obj = MyObject() ..x=1 ..y=2 Note that this proposal is different from "in" statement [ https://mail.python.org/pipermail/python-ideas/2012-November/017736.html] in the sense that this proposal does not bring anything new into the scope, e.g. obj1 ..obj2 ..x = y translates to tmp1 = obj1 tmp2 = obj2 tmp2.x = y or (simplified) obj1.obj2.x = y no matter if obj2 or obj1 contains variable y. In my opinion, such cascading operator would greatly help creating domain specific languages APIs in Python which are 1) easily readable. It is better than being lost in a repetitive stream of lines looking like obj.x(), obj.y(), obj.z(). Note that repetitive stream of lines (apart from being too verbose) also introduces a possibility of mistyping the name of the object 2) easy to implement (no special API considerations as returning self from each function call) What do you think? Peter -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From markus at unterwaditzer.net Thu Nov 21 12:20:08 2013 From: markus at unterwaditzer.net (Markus Unterwaditzer) Date: Thu, 21 Nov 2013 12:20:08 +0100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: Message-ID: <20e99a65-672a-43a5-b0ac-28e66b5bbbae@email.android.com> IMO this looks at least less ambiguous than my proposal. -- Markus "Pere??ni Peter" wrote: >Recently I fell in love with Dart method cascading operator .. to the >degree that I find it really convenient and I am missing it in the >Python. >The cascade operator > >myObj > ..setX(5) > ..y=6 > >is the syntactic sugar equivalent of the following: > >tmp = myObj >tmp.setX(5) >tmp.y=6 > >This can be used to greatly simplify instantiating objects and can also >enable creating domain specific languages in Python. In particular, we >can >make this much more powerful in Python (as opposed to Dart) because >Python >recognizes scope by indentation and therefore it would be possible to >do >something like > >gnuplot.newPlot() > ..set("xrange [0:5]") > ..set("yrange [0:20]") > ..newPlot() > ..addSeries("Linear", [1,2,3]) > ..addSeries("Quadratic", [1,4,6]) > ..run() > >or > >template = HtmlTemplate() > ..head() > ..script(src="xyz") > ..body() > ..div(class="main") > ..paragraph("I am the first paragraph") > ..paragraph("I am the second paragraph") > >Note that this is strictly better than method chaining, e.g. > >template = MyObject().setX(1).setY(2) > >because >a) method chaining is hard to write on multiple lines, the only two >options >are > >template = MyObject() \ > .setX(1) \ > .setY(2) > >which is fragile (explicit line continuation is discouraged in Python) >or > >(template = MyObject() > .setX(1) > .setY(2)) > >which looks weird. > >b) method chaining cannot take advantage of multiple scoping, e.g. the >only >way to write Gnuplot example is > >(Gnuplot() > .set("...") > .newPlot() > .addSeries() > .addSeries() > *.endPlot()* > .run() >) > >with added method *endPlot()* to the Plot class. Moreover, Plot class >now >needs a direct reference to its parent (e.g. Gnuplot) so it can return >its >reference. >This adds unnecessary cyclic dependencies. > >c) method chaining needs specialized API, e.g. each method needs to >return >self > >d) method chaining cannot be used with attributes, e.g. there is no >equivalent to > >obj = MyObject() > ..x=1 > ..y=2 > > >Note that this proposal is different from "in" statement [ >https://mail.python.org/pipermail/python-ideas/2012-November/017736.html] >in the sense that this proposal does not bring anything new into the >scope, >e.g. > >obj1 > ..obj2 > ..x = y > >translates to > >tmp1 = obj1 >tmp2 = obj2 >tmp2.x = y > >or (simplified) > >obj1.obj2.x = y > >no matter if obj2 or obj1 contains variable y. > >In my opinion, such cascading operator would greatly help creating >domain >specific languages APIs in Python which are > >1) easily readable. It is better than being lost in a repetitive stream >of >lines looking like obj.x(), obj.y(), obj.z(). Note that repetitive >stream >of lines (apart from being too verbose) also introduces a possibility >of >mistyping the name of the object > >2) easy to implement (no special API considerations as returning self >from >each function call) > >What do you think? 
> Peter
>
>------------------------------------------------------------------------
>
>_______________________________________________
>Python-ideas mailing list
>Python-ideas at python.org
>https://mail.python.org/mailman/listinfo/python-ideas
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ncoghlan at gmail.com  Thu Nov 21 13:40:41 2013
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 21 Nov 2013 22:40:41 +1000
Subject: [Python-ideas] Dart-like method cascading operator in Python
In-Reply-To: 
References: 
Message-ID: 

On 21 November 2013 20:55, Perešíni Peter wrote:
> gnuplot.newPlot()
>     ..set("xrange [0:5]")
>     ..set("yrange [0:20]")
>     ..newPlot()
>         ..addSeries("Linear", [1,2,3])
>         ..addSeries("Quadratic", [1,4,6])
>     ..run()

If you just want structural grouping of some code, you can already
define an appropriate context manager:

@contextlib.contextmanager
def value(x):
    yield x

with value(gnuplot.newPlot()) as p:
    p.set("xrange [0:5]")
    p.set("yrange [0:20]")
    with value(p.newPlot()) as n:
        n.addSeries("Linear", [1,2,3])
        n.addSeries("Quadratic", [1,4,6])
    p.run()

It doesn't define a new scope, but it can sometimes help break up a
large block of initialisation code (although often a helper function
or two may be a better way to do that).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From masklinn at masklinn.net  Thu Nov 21 14:34:11 2013
From: masklinn at masklinn.net (Masklinn)
Date: Thu, 21 Nov 2013 14:34:11 +0100
Subject: [Python-ideas] Dart-like method cascading operator in Python
In-Reply-To: 
References: 
Message-ID: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net>

On 2013-11-21, at 13:40 , Nick Coghlan wrote:
> On 21 November 2013 20:55, Perešíni Peter wrote:
>> gnuplot.newPlot()
>>     ..set("xrange [0:5]")
>>     ..set("yrange [0:20]")
>>     ..newPlot()
>>         ..addSeries("Linear", [1,2,3])
>>         ..addSeries("Quadratic", [1,4,6])
>>     ..run()
>
> If you just want structural grouping of some code, you can already
> define an appropriate context manager:
>
> @contextlib.contextmanager
> def value(x):
>     yield x
>
> with value(gnuplot.newPlot()) as p:
>     p.set("xrange [0:5]")
>     p.set("yrange [0:20]")
>     with value(p.newPlot()) as n:
>         n.addSeries("Linear", [1,2,3])
>         n.addSeries("Quadratic", [1,4,6])
>     p.run()
>
> It doesn't define a new scope

And it requires naming things.

An other drawback is that it is a statement, where the cascading
operator yields an expression (at least in Smalltalk it did) (well of
course more or less everything was an expression in smalltalk so that
helped).

I really liked message cascading when I played with it in Smalltalk, but:

1. I find Dart's cascading syntax rather noisy, and examples on The
Internets seem to show it regularly used in single-line expression
which I find a detriment to readability:

    document.body.children.add(new ButtonElement()..id='awesome'..text='Click Me!';);

2. Unless I missed it, the original suggestion failed to specify what
the overall expression returns. In Smalltalk, it returns the value of
the last message of the cascade, and Smalltalk has a `yourself` message
which returns, well, its self. IIRC, Python has no such message 'built
in'.

3. It encourages and facilitates APIs based on object mutation and
incompletely initialised objects, the former being a slight dislike and
the latter being something I loathe.

4. I doubt OP's (a) would be fixed, it's not an issue of attribute deref, so
a cascading operator would have the exact same behaviour.
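
As an aside, the statement-vs-expression point can be partly addressed today
with a tiny helper and no cooperation from the API at all; "cascade" below is
a made-up name rather than an existing function, and the list example is
chosen only because it runs as-is:

def cascade(obj, *operations):
    """Apply each operation to obj, ignore their return values, return obj."""
    for operation in operations:
        operation(obj)
    return obj

items = cascade([],
                lambda lst: lst.append(1),
                lambda lst: lst.extend([2, 3]),
                lambda lst: lst.reverse())
# items == [3, 2, 1], even though append/extend/reverse all return None

It keeps the whole thing an expression and works with mutators that return
None, but you still have to name the lambda parameter, so it doesn't really
answer the "requires naming things" objection, and nested uses get ugly fast.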
From ppershing at gmail.com Thu Nov 21 15:08:20 2013 From: ppershing at gmail.com (=?UTF-8?B?UGVyZcWhw61uaSBQZXRlcg==?=) Date: Thu, 21 Nov 2013 15:08:20 +0100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> Message-ID: On Thu, Nov 21, 2013 at 2:34 PM, Masklinn wrote: > On 2013-11-21, at 13:40 , Nick Coghlan wrote: > > > On 21 November 2013 20:55, Pere??ni Peter wrote: > >> gnuplot.newPlot() > >> ..set("xrange [0:5]") > >> ..set("yrange [0:20]") > >> ..newPlot() > >> ..addSeries("Linear", [1,2,3]) > >> ..addSeries("Quadratic", [1,4,6]) > >> ..run() > > > > If you just want structural grouping of some code, you can already > > define an appropriate context manager: > > > > @contextlib.contextmanager > > def value(x) > > yield x > > > > with value(gnuplot.newPlot()) as p: > > p.set("xrange [0:5]") > > p.set("yrange [0:20]") > > with value(p.newPlot()) as n: > > n.addSeries("Linear", [1,2,3]) > > n.addSeries("Quadratic", [1,4,6]) > > p.run() > > > > It doesn?t define a new scope > > And it requires naming things. > Exactly, point of the proposal is to avoid repetitive naming > > An other drawback is that it is a statement, where the cascading > operator yields an expression (at least in Smalltalk it did) (well of > course more or less everything was an expression in smalltalk so that > helped). > > I really liked message cascading when I played with it in Smalltalk, but: > > 1. I find Dart?s cascading syntax rather noisy, and examples on The > Internets seem to show it regularly used in single-line expression > which I find a detriment to readability: > > document.body.children.add(new > ButtonElement()..id='awesome'..text='Click Me!?;); > Agree, this example is extra bad and unreadable. If we want cascading (and especially nested cascading), I would force cascading operator to be the first token on a new (and indented) line as in my examples > > 2. Unless I missed it, the original suggestion failed to specify what > the overall expression returns. In Smalltalk, it returns the value of > the last message of the cascade, and Smalltalk has a `yourself` message > which returns, well, its self. IIRC, Python has no such message ?built > in?. > > It would return the result of the expression before cascading, e.g. o = MyObject() ..set(x) ..set(y) is a syntactic sugar for tmp = MyObject() tmp.set(x) tmp.set(y) o = tmp Note that we need a get the priority right though, e.g. cascading operator takes precedence over assignment in order to avoid surprises 3. It encourages and facilitates APIs based on object mutation and > incompletely initialised objects, the former being a slight dislike and > the latter being something I loathe. > > Agree. It may lead to a bit more sloppier API design than usual. > 4. I doubt OP?s (a) would be fixed, it?s not an issue of attribute deref, > so > a cascading operator would have the exact same behaviour. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rosuav at gmail.com Thu Nov 21 15:19:57 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 22 Nov 2013 01:19:57 +1100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> Message-ID: On Fri, Nov 22, 2013 at 1:08 AM, Pere??ni Peter wrote: > Agree, this example is extra bad and unreadable. If we want cascading (and > especially nested cascading), I would force cascading operator to be the > first token on a new (and indented) line as in my examples In that case, why have a separate cascade operator? o = MyObject() .set(x) .set(y) There is one problem with this syntax, though (whether it's a separate operator or not): it makes parsing a little harder. The previous statement looks complete, and then there's an indented block. Trying to type this at the interactive interpreter will be awkward - there'll need to be a way to tell it "Hey, there's more coming, don't finish yet" even though there's nothing on that opening line that tells it so. Would it be worth putting a colon at the end, as per if/while/etc? o = MyObject(): .set(x) .set(y) The same considerations apply to editors that auto-indent, too; making it clear that there's more to come is, imho, a Good Thing. ChrisA From masklinn at masklinn.net Thu Nov 21 15:56:06 2013 From: masklinn at masklinn.net (Masklinn) Date: Thu, 21 Nov 2013 15:56:06 +0100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> Message-ID: On 2013-11-21, at 15:19 , Chris Angelico wrote: > On Fri, Nov 22, 2013 at 1:08 AM, Pere??ni Peter wrote: >> Agree, this example is extra bad and unreadable. If we want cascading (and >> especially nested cascading), I would force cascading operator to be the >> first token on a new (and indented) line as in my examples > > In that case, why have a separate cascade operator? > > o = MyObject() > .set(x) > .set(y) The primary issue with it is that the API must be crafted specifically for chaining: everything must be done with methods, and methods must return their `self`. Not only does this mean mutator methods which could return something else can?t, it goes against the usual Python grain (at least that of the builtins and standard library) where mutator methods generally return None. Cascading adds ?chaining? to all (mutable) objects without having to alter them or build the API specifically to that end. Or duplicate setattr & setitem via additional methods. But yes, as noted if the situation of ?infix operators line breaks? is not changed, it will also affect chaining (not that I think it?s a big deal to put a chain in parens). From rosuav at gmail.com Thu Nov 21 16:00:40 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 22 Nov 2013 02:00:40 +1100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> Message-ID: On Fri, Nov 22, 2013 at 1:56 AM, Masklinn wrote: > The primary issue with it is that the API must be crafted specifically > for chaining: everything must be done with methods, and methods must > return their `self`. Not only does this mean mutator methods which could > return something else can?t, it goes against the usual Python grain > (at least that of the builtins and standard library) where mutator > methods generally return None. > > Cascading adds ?chaining? 
to all (mutable) objects without having to > alter them or build the API specifically to that end. Or duplicate > setattr & setitem via additional methods. > > But yes, as noted if the situation of ?infix operators line breaks? is > not changed, it will also affect chaining (not that I think it?s a big > deal to put a chain in parens). I get that. But what I'm saying is that this is needing some clear definitions in terms of indentation and beginnings of lines anyway, so it clearly cannot conflict with method chaining. It can simply use the dot, so it'll look like normal method invocation, but with an indent meaning "same as the previous" - like how a BIND file is often laid out: @ IN SOA .... blah blah .... IN NS ns1.blah.blah IN NS ns2.blah.blah IN MX 10 mail This would have things look pretty much the same: foo.bar(): .quux() .asdf() .qwer() ChrisA From ppershing at gmail.com Thu Nov 21 17:17:53 2013 From: ppershing at gmail.com (=?UTF-8?B?UGVyZcWhw61uaSBQZXRlcg==?=) Date: Thu, 21 Nov 2013 17:17:53 +0100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> Message-ID: I really like the colon to show that the indentation is going to change -- it is Pythonic and consistent with the language plus helps both interactive console and parsers/editors to expect the indentation. However, I would still make cascading operator different from dot just to be more explicit (to avoid confusion of the programmers that do not know about this feature yet -- double dot will indicate that this is a new syntax) On Thu, Nov 21, 2013 at 4:00 PM, Chris Angelico wrote: > On Fri, Nov 22, 2013 at 1:56 AM, Masklinn wrote: > > The primary issue with it is that the API must be crafted specifically > > for chaining: everything must be done with methods, and methods must > > return their `self`. Not only does this mean mutator methods which could > > return something else can?t, it goes against the usual Python grain > > (at least that of the builtins and standard library) where mutator > > methods generally return None. > > > > Cascading adds ?chaining? to all (mutable) objects without having to > > alter them or build the API specifically to that end. Or duplicate > > setattr & setitem via additional methods. > > > > But yes, as noted if the situation of ?infix operators line breaks? is > > not changed, it will also affect chaining (not that I think it?s a big > > deal to put a chain in parens). > > I get that. But what I'm saying is that this is needing some clear > definitions in terms of indentation and beginnings of lines anyway, so > it clearly cannot conflict with method chaining. It can simply use the > dot, so it'll look like normal method invocation, but with an indent > meaning "same as the previous" - like how a BIND file is often laid > out: > > @ IN SOA .... blah blah .... > IN NS ns1.blah.blah > IN NS ns2.blah.blah > IN MX 10 mail > > This would have things look pretty much the same: > > foo.bar(): > .quux() > .asdf() > .qwer() > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From rosuav at gmail.com Thu Nov 21 17:28:01 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 22 Nov 2013 03:28:01 +1100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> Message-ID: On Fri, Nov 22, 2013 at 3:17 AM, Pere??ni Peter wrote: > However, I would still make cascading operator different from dot just to be > more explicit (to avoid confusion of the programmers that do not know about > this feature yet -- double dot will indicate that this is a new syntax) Good point. I tend to prefer using less syntactic elements rather than more; when there's no reason to distinguish, why distinguish? (Compare C++ with its two different object-member operators, dot for objects and arrow for pointer-to-object. There's no sensible meaning for pointer-dot-token, so why not merge the two operators?) I do see the value here in distinguishing, though. The colon makes parsing unambiguous, as a leading dot has no meaning. I'd say -0 on new syntax (double dot), it might be valuable for clarity but I don't know how necessary it is. ChrisA From markus at unterwaditzer.net Thu Nov 21 17:37:47 2013 From: markus at unterwaditzer.net (Markus Unterwaditzer) Date: Thu, 21 Nov 2013 17:37:47 +0100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> Message-ID: IMO this: x = MyString \ .replace(blah, blub) \ .lower() And this: x = MyString: .replace(blah, blub) .lower() ...look way too similar and might introduce really bad bugs. So i am +1 on introducing more new syntax for this new feature. "Pere??ni Peter" wrote: >I really like the colon to show that the indentation is going to change >-- >it is Pythonic and consistent with the language plus helps both >interactive >console and parsers/editors to expect the indentation. > >However, I would still make cascading operator different from dot just >to >be more explicit (to avoid confusion of the programmers that do not >know >about this feature yet -- double dot will indicate that this is a new >syntax) > > >On Thu, Nov 21, 2013 at 4:00 PM, Chris Angelico >wrote: > >> On Fri, Nov 22, 2013 at 1:56 AM, Masklinn >wrote: >> > The primary issue with it is that the API must be crafted >specifically >> > for chaining: everything must be done with methods, and methods >must >> > return their `self`. Not only does this mean mutator methods which >could >> > return something else can?t, it goes against the usual Python grain >> > (at least that of the builtins and standard library) where mutator >> > methods generally return None. >> > >> > Cascading adds ?chaining? to all (mutable) objects without having >to >> > alter them or build the API specifically to that end. Or duplicate >> > setattr & setitem via additional methods. >> > >> > But yes, as noted if the situation of ?infix operators line breaks? >is >> > not changed, it will also affect chaining (not that I think it?s a >big >> > deal to put a chain in parens). >> >> I get that. But what I'm saying is that this is needing some clear >> definitions in terms of indentation and beginnings of lines anyway, >so >> it clearly cannot conflict with method chaining. It can simply use >the >> dot, so it'll look like normal method invocation, but with an indent >> meaning "same as the previous" - like how a BIND file is often laid >> out: >> >> @ IN SOA .... blah blah .... 
>> IN NS ns1.blah.blah >> IN NS ns2.blah.blah >> IN MX 10 mail >> >> This would have things look pretty much the same: >> >> foo.bar(): >> .quux() >> .asdf() >> .qwer() >> >> ChrisA >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> > > >------------------------------------------------------------------------ > >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Thu Nov 21 18:26:58 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 21 Nov 2013 09:26:58 -0800 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> Message-ID: On Nov 21, 2013, at 5:34, Masklinn wrote: > On 2013-11-21, at 13:40 , Nick Coghlan wrote: > >> On 21 November 2013 20:55, Pere??ni Peter wrote: >>> gnuplot.newPlot() >>> ..set("xrange [0:5]") >>> ..set("yrange [0:20]") >>> ..newPlot() >>> ..addSeries("Linear", [1,2,3]) >>> ..addSeries("Quadratic", [1,4,6]) >>> ..run() >> >> If you just want structural grouping of some code, you can already >> define an appropriate context manager: >> >> @contextlib.contextmanager >> def value(x) >> yield x >> >> with value(gnuplot.newPlot()) as p: >> p.set("xrange [0:5]") >> p.set("yrange [0:20]") >> with value(p.newPlot()) as n: >> n.addSeries("Linear", [1,2,3]) >> n.addSeries("Quadratic", [1,4,6]) >> p.run() >> >> It doesn?t define a new scope > > And it requires naming things. Is it really that hard to name a plot "p"? Is typing "p.", or reading it, more work than ".."? The FAQ already suggests the "just give it a short name" answer (http://docs.python.org/3.3/faq/design.html#why-doesn-t-python-have-a-with-statement-for-attribute-assignments). Nick's suggestion does that plus an indent for readability. Do we actually need more than that? > An other drawback is that it is a statement, where the cascading > operator yields an expression (at least in Smalltalk it did) (well of > course more or less everything was an expression in smalltalk so that > helped). Cascading has to be a statement. It has statements inside it. It uses indentation. If it's an expression it will have all of the same ambiguity problems that a naive multiline lambda proposal does, and break the simplicity of the language. (That also means it can't be an operator. It has to be special syntax, like =.) So if this is any part of the argument for the proposal, I'm -1. > I really liked message cascading when I played with it in Smalltalk, but: > > 1. I find Dart?s cascading syntax rather noisy, and examples on The > Internets seem to show it regularly used in single-line expression > which I find a detriment to readability: > > document.body.children.add(new ButtonElement()..id='awesome'..text='Click Me!?;); > > 2. Unless I missed it, the original suggestion failed to specify what > the overall expression returns. In Smalltalk, it returns the value of > the last message of the cascade, and Smalltalk has a `yourself` message > which returns, well, its self. IIRC, Python has no such message ?built > in?. > > 3. 
It encourages and facilitates APIs based on object mutation and > incompletely initialised objects, the former being a slight dislike and > the latter being something I loathe. > > 4. I doubt OP?s (a) would be fixed, it?s not an issue of attribute deref, so > a cascading operator would have the exact same behaviour. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From haoyi.sg at gmail.com Thu Nov 21 18:48:56 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Thu, 21 Nov 2013 09:48:56 -0800 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> Message-ID: > Is it really that hard to name a plot "p"? Is typing "p.", or reading it, more work than ".."? Yes, reading `..` can be *considerably* less work than reading `p.`. - With `..` you know exactly where the thing is coming from (the preceding lines) whereas with `p.`, `p` could come from anywhere - when you're eyes are tired you better look closely to make sure you didn't accidentally write `b` which may mean something else in this scope and cause your program to silently malfunction. - You also better make sure you didn't have a variable somewhere else called `p` for Pressure in one of your equations which you just stomped over, or `p` for Momentum, or Price, or Probability. It's not inconceivable that you'd want to plot these things! Generally, having fewer things in the local namespace is good hygiene, and helps prevent name collisions. This is separate from the issue of the proposed syntax (which I don't really like) and whether it's useful enough to warrant special syntax (not sure) On Thu, Nov 21, 2013 at 9:26 AM, Andrew Barnert wrote: > On Nov 21, 2013, at 5:34, Masklinn wrote: > > On 2013-11-21, at 13:40 , Nick Coghlan wrote: > > On 21 November 2013 20:55, Pere??ni Peter wrote: > > gnuplot.newPlot() > > ..set("xrange [0:5]") > > ..set("yrange [0:20]") > > ..newPlot() > > ..addSeries("Linear", [1,2,3]) > > ..addSeries("Quadratic", [1,4,6]) > > ..run() > > > If you just want structural grouping of some code, you can already > > define an appropriate context manager: > > > @contextlib.contextmanager > > def value(x) > > yield x > > > with value(gnuplot.newPlot()) as p: > > p.set("xrange [0:5]") > > p.set("yrange [0:20]") > > with value(p.newPlot()) as n: > > n.addSeries("Linear", [1,2,3]) > > n.addSeries("Quadratic", [1,4,6]) > > p.run() > > > It doesn?t define a new scope > > > And it requires naming things. > > > Is it really that hard to name a plot "p"? Is typing "p.", or reading it, > more work than ".."? > > The FAQ already suggests the "just give it a short name" answer ( > http://docs.python.org/3.3/faq/design.html#why-doesn-t-python-have-a-with-statement-for-attribute-assignments). > Nick's suggestion does that plus an indent for readability. Do we actually > need more than that? > > An other drawback is that it is a statement, where the cascading > operator yields an expression (at least in Smalltalk it did) (well of > course more or less everything was an expression in smalltalk so that > helped). > > > Cascading has to be a statement. It has statements inside it. It uses > indentation. 
If it's an expression it will have all of the same ambiguity > problems that a naive multiline lambda proposal does, and break the > simplicity of the language. > > (That also means it can't be an operator. It has to be special syntax, > like =.) > > So if this is any part of the argument for the proposal, I'm -1. > > > I really liked message cascading when I played with it in Smalltalk, but: > > 1. I find Dart?s cascading syntax rather noisy, and examples on The > Internets seem to show it regularly used in single-line expression > which I find a detriment to readability: > > document.body.children.add(new > ButtonElement()..id='awesome'..text='Click Me!?;); > > 2. Unless I missed it, the original suggestion failed to specify what > the overall expression returns. In Smalltalk, it returns the value of > the last message of the cascade, and Smalltalk has a `yourself` message > which returns, well, its self. IIRC, Python has no such message ?built > in?. > > 3. It encourages and facilitates APIs based on object mutation and > incompletely initialised objects, the former being a slight dislike and > the latter being something I loathe. > > 4. I doubt OP?s (a) would be fixed, it?s not an issue of attribute deref, > so > a cascading operator would have the exact same behaviour. > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Thu Nov 21 20:10:40 2013 From: masklinn at masklinn.net (Masklinn) Date: Thu, 21 Nov 2013 20:10:40 +0100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> Message-ID: <9E6F7DA2-F9EA-4AE2-BF5F-C4EC81B249FE@masklinn.net> On 2013-11-21, at 18:26 , Andrew Barnert wrote: > Is it really that hard to name a plot "p"? Is typing ?p.", or reading it, more work than ".."? What?s the point of anything then? p = gnuplot.newPlot() p.set(?xrange [0:5]?) p.set(?xrange [0:20]?) n = p.newPlot() n.addSeries(?Linear?, [1, 2, 3]) n.addSeries(?Quadratic?, [1, 4, 6]) p.run() is terser and adds no more pressure to the local namespace, the only loss is the pseudo-nested formatting and that?s not really a core goal of cascading. > So if this is any part of the argument for the proposal, I'm -1. As far as there?s any interest to cascading it?s that it?s terser than sequencing calls with explicit receivers, and that it?s an expression allowing inline initialisation sequences and not requiring creating a binding. From ncoghlan at gmail.com Thu Nov 21 23:19:44 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 22 Nov 2013 08:19:44 +1000 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: <9E6F7DA2-F9EA-4AE2-BF5F-C4EC81B249FE@masklinn.net> References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> <9E6F7DA2-F9EA-4AE2-BF5F-C4EC81B249FE@masklinn.net> Message-ID: On 22 Nov 2013 05:11, "Masklinn" wrote: > > On 2013-11-21, at 18:26 , Andrew Barnert wrote: > > Is it really that hard to name a plot "p"? Is typing ?p.", or reading it, more work than ".."? > > What?s the point of anything then? > > p = gnuplot.newPlot() > p.set(?xrange [0:5]?) > p.set(?xrange [0:20]?) 
> n = p.newPlot() > n.addSeries(?Linear?, [1, 2, 3]) > n.addSeries(?Quadratic?, [1, 4, 6]) > p.run() > > is terser and adds no more pressure to the local namespace, the only > loss is the pseudo-nested formatting and that?s not really a core goal > of cascading. If a statement local namespace is the goal, then see PEPs 403 and 3150. Given the problems with those, this far more limited suggestion has even less chance of approval. Cheers, Nick. > > > So if this is any part of the argument for the proposal, I'm -1. > > As far as there?s any interest to cascading it?s that it?s terser > than sequencing calls with explicit receivers, and that it?s an > expression allowing inline initialisation sequences and not requiring > creating a binding. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Thu Nov 21 23:52:58 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 21 Nov 2013 17:52:58 -0500 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: Message-ID: On 11/21/2013 5:55 AM, Pere??ni Peter wrote: > Recently I fell in love with Dart method cascading operator .. to the > degree that I find it really convenient and I am missing it in the Python. > The cascade operator > > myObj > ..setX(5) > ..y=6 > > is the syntactic sugar equivalent of the following: > > tmp = myObj > tmp.setX(5) > tmp.y=6 I far prefer t= t.setX(5) t.y=6 The proposal adds no new functionality. > can make this much more powerful in Python (as opposed to Dart) because > Python recognizes scope by indentation Not true. It recognizes compound statement suites by indentation, following a header line ending with ':'. Name scopes are indicated by class and def statements, and somewhat by comprehensions. > and therefore it would be possible to do something like > > gnuplot.newPlot() > ..set("xrange [0:5]") > ..set("yrange [0:20]") > ..newPlot() > ..addSeries("Linear", [1,2,3]) > ..addSeries("Quadratic", [1,4,6]) > ..run() This does not look like Python at all ;-). -- Terry Jan Reedy From bruce at leapyear.org Thu Nov 21 23:55:24 2013 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 21 Nov 2013 14:55:24 -0800 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: Message-ID: I like the basic idea. I agree with the sentiment that a colon should be used to introduce it. Let me throw out some monkey wrenches. The following uses :: as the syntax to begin the suite, that is: obj:: ..a = 1 is equivalent to temp = obj temp.a = 1 I'm assuming the .. must be followed by a statement (rewritten as above) except in the case where the line ends with :: in which case it must be an expression as in the example: gnuplot.newPlot():: ..newPlot():: ..adjustPlot('x') ### ..adjustPlot('y') is temp1 = gnuplot.newPlot() temp2 = temp1.newPlot() temp2.adjustPlot('x') ### temp1.adjustPlot('y') Is there any way on the line marked ### to reference temp1? Can I use .. in the middle of a statement/expression? obj:: x = ..x y = ..y + ..z ..f(..a, ..b) equivalent to temp = obj x = temp.x y = temp.y + temp.z temp.f(temp.a, temp.b) Is field/method access the only time where this is useful? Can I do this? obj:: .['a'] = 1 .['b'] = 2 equivalent to temp = obj temp['a'] = 1 temp['b'] = 2 or this? 
obj:: .(1, 2) equivalent to temp = obj obj(1, 2) What about this very common code: self.x = x self.y = y self.z = z + 1 it would be nice to have a shorthand for this that didn't require writing x and y twice, but this doesn't do it for me and I don't have a better suggestion: self:: .. = x .. = y ..z = z + 1 --- Bruce I'm hiring: http://www.cadencemd.com/info/jobs Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Thu Nov 21 23:53:48 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 21 Nov 2013 14:53:48 -0800 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: <9E6F7DA2-F9EA-4AE2-BF5F-C4EC81B249FE@masklinn.net> References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> <9E6F7DA2-F9EA-4AE2-BF5F-C4EC81B249FE@masklinn.net> Message-ID: On Nov 21, 2013, at 11:10, Masklinn wrote: > >> So if this is any part of the argument for the proposal, I'm -1. > > As far as there?s any interest to cascading it?s that it?s terser > than sequencing calls with explicit receivers, and that it?s an > expression allowing inline initialisation sequences and not requiring > creating a binding. Are you actually using the word "expression" in it's usual programming-language meaning here? As in, you really do want an expression, not a statement, that has indentation and statements and the like inside of it? From masklinn at masklinn.net Fri Nov 22 00:09:31 2013 From: masklinn at masklinn.net (Masklinn) Date: Fri, 22 Nov 2013 00:09:31 +0100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> <9E6F7DA2-F9EA-4AE2-BF5F-C4EC81B249FE@masklinn.net> Message-ID: <01D1FE83-2238-41E5-A51E-AA95FC343057@masklinn.net> On 2013-11-21, at 23:53 , Andrew Barnert wrote: > On Nov 21, 2013, at 11:10, Masklinn wrote: >>> So if this is any part of the argument for the proposal, I'm -1. >> >> As far as there?s any interest to cascading it?s that it?s terser >> than sequencing calls with explicit receivers, and that it?s an >> expression allowing inline initialisation sequences and not requiring >> creating a binding. > > Are you actually using the word ?expression" in it's usual programming-language meaning here? I fail to see what other sense there would be. From rosuav at gmail.com Fri Nov 22 00:15:36 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 22 Nov 2013 10:15:36 +1100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: <01D1FE83-2238-41E5-A51E-AA95FC343057@masklinn.net> References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> <9E6F7DA2-F9EA-4AE2-BF5F-C4EC81B249FE@masklinn.net> <01D1FE83-2238-41E5-A51E-AA95FC343057@masklinn.net> Message-ID: On Fri, Nov 22, 2013 at 10:09 AM, Masklinn wrote: > On 2013-11-21, at 23:53 , Andrew Barnert wrote: >> On Nov 21, 2013, at 11:10, Masklinn wrote: >>>> So if this is any part of the argument for the proposal, I'm -1. >>> >>> As far as there?s any interest to cascading it?s that it?s terser >>> than sequencing calls with explicit receivers, and that it?s an >>> expression allowing inline initialisation sequences and not requiring >>> creating a binding. >> >> Are you actually using the word ?expression" in it's usual programming-language meaning here? > > I fail to see what other sense there would be. 
An expression, in programming languages, has a value - you can use it as a function parameter, etc. In Python, assignment is NOT an expression, although it kinda looks like one (having two operands and a symbol between them, chained assignment aside). The original proposal definitely worked with a suite of statements and was not restricted to expressions. ChrisA From cs at zip.com.au Fri Nov 22 00:25:02 2013 From: cs at zip.com.au (Cameron Simpson) Date: Fri, 22 Nov 2013 10:25:02 +1100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> Message-ID: <20131121232502.GA8186@cskk.homeip.net> On 21Nov2013 14:34, Masklinn wrote: > On 2013-11-21, at 13:40 , Nick Coghlan wrote: > > > On 21 November 2013 20:55, Pere??ni Peter wrote: > >> gnuplot.newPlot() > >> ..set("xrange [0:5]") > >> ..set("yrange [0:20]") > >> ..newPlot() > >> ..addSeries("Linear", [1,2,3]) > >> ..addSeries("Quadratic", [1,4,6]) > >> ..run() > > > > If you just want structural grouping of some code, you can already > > define an appropriate context manager: > > > > @contextlib.contextmanager > > def value(x) > > yield x > > > > with value(gnuplot.newPlot()) as p: > > p.set("xrange [0:5]") > > p.set("yrange [0:20]") > > with value(p.newPlot()) as n: > > n.addSeries("Linear", [1,2,3]) > > n.addSeries("Quadratic", [1,4,6]) > > p.run() > > > > It doesn?t define a new scope > > And it requires naming things. That's a feature. The last thing I want in a traceback is a recitation of the line: ..foo = 1 Hmm. Setting .foo on... what? Personally, because this is so easy to do with a context manager or even just a simple temporary variable, I'm -1 on the whole cascading idea. Cheers, -- Cameron Simpson The wireless music box has no imaginable commercial value. Who would pay for a message sent to nobody in particular? --David Sarnoff's associates in response to his urgings for investment in the radio in the 1920s. From steve at pearwood.info Fri Nov 22 03:16:39 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Nov 2013 13:16:39 +1100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: Message-ID: <20131122021636.GG2085@ando> On Thu, Nov 21, 2013 at 11:55:46AM +0100, Pere??ni Peter wrote: > Recently I fell in love with Dart method cascading operator .. to the > degree that I find it really convenient and I am missing it in the Python. > The cascade operator > > myObj > ..setX(5) > ..y=6 > > is the syntactic sugar equivalent of the following: > > tmp = myObj > tmp.setX(5) > tmp.y=6 > > This can be used to greatly simplify instantiating objects and can also > enable creating domain specific languages in Python. Please give examples of how this might do such a thing, rather than just claim it will. "This syntax will end world hunger and bring peace to the Middle East..." I don't see how this syntax can simplify instantiating objects: obj = MyClass("spam", "eggs") is already pretty simple. Perhaps you're referring only to a small subset of (in my opinion) *poorly designed* if not outright buggy objects which aren't instantiated completely on creation and need to be tweaked by hand before being ready to use. Or those with excessively complicated APIs that could really do with a few helper functions. 
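As a rough sketch of the "helper function" route mentioned above: the plotting API in this thread is invented, so FakePlot below is a hypothetical stand-in, but the shape of the helper is the point.

    class FakePlot:
        # Hypothetical stand-in for the sketched plotting API.
        def __init__(self):
            self.options = []
            self.series = []

        def set(self, option):
            self.options.append(option)

        def addSeries(self, name, data):
            self.series.append((name, data))

    def make_plot(xrange, yrange, series):
        # The whole initialisation sequence happens inside one call,
        # so callers never handle a half-configured plot.
        p = FakePlot()
        p.set("xrange [%d:%d]" % xrange)
        p.set("yrange [%d:%d]" % yrange)
        for name, data in series:
            p.addSeries(name, data)
        return p

    plot = make_plot((0, 5), (0, 20),
                     [("Linear", [1, 2, 3]), ("Quadratic", [1, 4, 6])])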
In my opinion, we shouldn't encourage classes that require manual instantiation like this: obj = MyClass() obj.foo = "spam" obj.setbar("eggs") obj.make_it_work() # now obj is fully instantiated and ready to be used... not even with your suggested syntax: obj = MyClass() ..foo = "spam" ..setbar("eggs") ..make_it_work() Helping people do the wrong thing is not, in my opinion, an advantage. The use-case of DSLs is perhaps more interesting, but a simple example would go a long way to support the idea. > In particular, we can > make this much more powerful in Python (as opposed to Dart) because Python > recognizes scope by indentation and therefore it would be possible to do > something like > > gnuplot.newPlot() > ..set("xrange [0:5]") > ..set("yrange [0:20]") > ..newPlot() > ..addSeries("Linear", [1,2,3]) > ..addSeries("Quadratic", [1,4,6]) > ..run() I had to read that multiple times before I was able to interpret what this is supposed to mean. It doesn't help that I'm not familiar enough with the gnuplot API to tell exactly what you're doing. I *think* that it would be the equivalent of this: p = gnuplot.newPlot() p.set("xrange [0:5]") p.set("yrange [0:20]") q = p.newPlot() q.addSeries("Linear", [1,2,3]) q.addSeries("Quadratic", [1,4,6]) p.run() If I'm wrong, I think that suggests that your extension to the syntax isn't as clear as you hoped. If I'm right, I wonder what the point of the inner sub-block is, since you don't appear to do anything with the q plot. What have I missed? [...] > Note that this is strictly better than method chaining, e.g. > > template = MyObject().setX(1).setY(2) > > because > a) method chaining is hard to write on multiple lines, the only two options > are [snip explicit line continuation] > (template = MyObject() > .setX(1) > .setY(2)) > > which looks weird. I suggest it only looks weird because you've put the opening bracket in the wrong place. I'd write it like this: template = (MyObject() .setX(1) .setY(2) ) a a variation of such, which apart from an extra set of parentheses is virtually exactly the same as your suggestion: template = MyObject() ..setX(1) ..setY(2) without the addition of a new double-dot syntax. To the extent that this suggested cascading dot operator is just used for method chaining, I'm against it because we can already easily chain methods with no new syntax provided the object supports chaining. To the extent that it adds chaining to those objects which don't support it, for example those with mutator methods that return None, that's a more interesting proposal: mylist = list(some_function(a, b, c)) ..sort() ..reverse() ..pop() # output of pop is discarded ..sort() # still operating on the list But I'm going to suggest that this is still not a good idea, since it goes against the Zen "Explicit is better than implicit". As I understand it, this cascade relies on there being an implicit "last object used", and that can be a bit problematic. For starters, you should explain precisely how the compiler will determine what is the "last object used" in practice. Your examples so far suggest it is based on the last line of source code, but that's tricky. For example: x = spam(); y = eggs() ..method() I would *assume* that this implicitly calls y.method rather than x.method, but that's not a given. How about this? y = spam() x = eggs() del y ..method() Does that call x.method()? And what do this do? myClass() for i in range(3): ..method() Is it a syntax error? Or does it call i.method()? Or perhaps even call method on the range object? 
Or does the myClass object still count as the implicit "last object used"? You're going to need to specify exactly where and under what circumstances the implicit object is set, and where you can use this implicit object cascading operator. > b) method chaining cannot take advantage of multiple scoping, e.g. the only > way to write Gnuplot example is Since I'm not sure I understand your Gnuplot example above, I'm not going to further comment on it here. > c) method chaining needs specialized API, e.g. each method needs to return > self This is a good argument in support of your proposal. > d) method chaining cannot be used with attributes, e.g. there is no > equivalent to > > obj = MyObject() > ..x=1 > ..y=2 That's a good argument in support of your proposal, but not a terribly powerful argument, since the above is trivially written as: obj = MyObject() obj.x = 1 obj.y = 2 and I for one prefer to see the object being operated on explicitly rather than implicitly. -- Steven From tjreedy at udel.edu Fri Nov 22 04:29:20 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 21 Nov 2013 22:29:20 -0500 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: <20131122021636.GG2085@ando> References: <20131122021636.GG2085@ando> Message-ID: On 11/21/2013 9:16 PM, Steven D'Aprano wrote: > On Thu, Nov 21, 2013 at 11:55:46AM +0100, Pere??ni Peter wrote: > I don't see how this syntax can simplify instantiating > objects: > > obj = MyClass("spam", "eggs") > > is already pretty simple. Perhaps you're referring only to a small > subset of (in my opinion) *poorly designed* if not outright buggy > objects which aren't instantiated completely on creation and need to be > tweaked by hand before being ready to use. Or those with excessively > complicated APIs that could really do with a few helper functions. > > In my opinion, we shouldn't encourage classes that require manual > instantiation like this: > > obj = MyClass() > obj.foo = "spam" > obj.setbar("eggs") > obj.make_it_work() > # now obj is fully instantiated and ready to be used... > > not even with your suggested syntax: > > obj = MyClass() > ..foo = "spam" > ..setbar("eggs") > ..make_it_work() > > > Helping people do the wrong thing is not, in my opinion, an advantage. I had the same thought. > The use-case of DSLs is perhaps more interesting, but a simple example > would go a long way to support the idea. > > >> In particular, we can >> make this much more powerful in Python (as opposed to Dart) because Python >> recognizes scope by indentation and therefore it would be possible to do >> something like >> >> gnuplot.newPlot() >> ..set("xrange [0:5]") >> ..set("yrange [0:20]") >> ..newPlot() >> ..addSeries("Linear", [1,2,3]) >> ..addSeries("Quadratic", [1,4,6]) >> ..run() > > I had to read that multiple times before I was able to interpret what > this is supposed to mean. It doesn't help that I'm not familiar enough > with the gnuplot API to tell exactly what you're doing. 
I *think* that > it would be the equivalent of this: > > p = gnuplot.newPlot() > p.set("xrange [0:5]") > p.set("yrange [0:20]") > q = p.newPlot() > q.addSeries("Linear", [1,2,3]) > q.addSeries("Quadratic", [1,4,6]) > p.run() which I would want to at least partially be equivalent to something like p = gnuplot.newPlot(xrange=(0,5), yrange = (0,20) q = p.newplot(series = (("linear", [1,2,3]), ("quadratic", [1,4,6]))) p.run() -- Terry Jan Reedy From steve at pearwood.info Fri Nov 22 04:45:22 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Nov 2013 14:45:22 +1100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> Message-ID: <20131122034521.GH2085@ando> On Thu, Nov 21, 2013 at 09:48:56AM -0800, Haoyi Li wrote: > > Is it really that hard to name a plot "p"? Is typing "p.", or reading it, > more work than ".."? > > Yes, reading `..` can be *considerably* less work than reading `p.`. > - With `..` you know exactly where the thing is coming from (the preceding > lines) whereas with `p.`, `p` could come from anywhere If this argument were good, it would be an argument against naming in general. But *naming things* is one of the simplest, most powerful techniques for understanding that human beings have in their mental toolbox, so much so that people tend to name things which don't even exist -- "cold", "dark", "death", "silence" etc. all of which are merely the absense of something rather than something. But I digress. Explicitly named variables are nearly always better than implicit, unnamed variables. It's either to talk about "do something with foo" than "do something with the thing that you last used". In order to reason about such an anonymous implicit block, I reckon that most people will mentally translate it into a named variable. When you have an implicit operand: ..method() # implicitly operates on some unnamed operand that's rather like having an (invisible) placeholder variable: (it)..method() where "it" is implicitly defined elsewhere. I've used such a language, Hypertalk, where you can write code like this: get the number of items of field "Counter" put it + 2 into field "Result" Kind of cool, right? I'm not entirely against the idea. But when you have longer blocks of code, it soon becomes painful to keep track of the implicit "it", and named variables become much more sensible. What is a shortcut for *writing* code becomes a longcut for *reading* and *comprehending* code. If you allow multiple levels of double-dot implicit variables, it becomes much harder to track what the implicit invisible variable is at any such time. The indentation helps, I agree, but even so, I expect it will be rather like dealing with one of those conversations where nobody is ever named directly: "Then he said that she told him that she went to the party he threw last weekend and saw her there, and he tells me that he's upset that he tried to hit on her and he reckons that she wasn't doing enough to discourage him, but I spoke to her and she reckons it was just a bit of harmless flirting and he's just being unreasonable, absolutely nothing happened between her and him and if he keeps on like this there's be nothing happening between her and him either..." 
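For comparison, chaining on mutators that return None can already be simulated at the library level while keeping the receiver explicitly named; the Cascade class and unwrap() below are invented for this illustration and are not part of any proposal in the thread:

    class Cascade:
        # Wraps an object so method calls chain even when the underlying
        # methods return None.  A sketch, not a full proxy.
        def __init__(self, obj):
            self._obj = obj

        def __getattr__(self, name):
            attr = getattr(self._obj, name)
            if not callable(attr):
                return attr

            def call(*args, **kwargs):
                attr(*args, **kwargs)   # discard the result...
                return self             # ...and keep chaining
            return call

        def unwrap(self):
            return self._obj

    mylist = Cascade([3, 1, 2])
    result = mylist.sort().reverse().unwrap()   # [3, 2, 1]

The chain still has a named receiver (mylist), so a traceback points at an ordinary attribute lookup rather than an implicit "last object used".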
Consider also Python tracebacks, you'll see something like this: Traceback (most recent call last): File "spam.py", line 123, in main() File "spam.py", line 97, in main func() File "spam.py", line 26, in func ..draw() TypeError: draw() takes at least 1 argument (0 given) In general, I'd much prefer to see the last two lines with an explicitly named variable: my_artist.draw() TypeError: draw() takes at least 1 argument (0 given) but of course that's not available with the .. proposed syntax. > - when you're eyes are tired you better look closely to make sure you > didn't accidentally write `b` which may mean something else in this scope > and cause your program to silently malfunction. "People might use the wrong variable name if they're coding while tired, or drunk, or simply careless" is not a good argument against using explicit variable names. People might type + when they mean -, or .. when they mean a single dot, or a dot when they mean >, or any error at all. I find it ironic that you are defending syntax that looks like a speck of dust on the monitor, or a dead pixel, on behalf of people who have tired eyes. > - You also better make sure you didn't have a variable somewhere else > called `p` for Pressure in one of your equations which you just stomped > over, or `p` for Momentum, or Price, or Probability. It's not inconceivable > that you'd want to plot these things! That's an argument against using dumb variable names, not an argument for using variable names. > Generally, having fewer things in the local namespace is good hygiene, and > helps prevent name collisions. This is separate from the issue of the > proposed syntax (which I don't really like) and whether it's useful enough > to warrant special syntax (not sure) This at least I agree with. -- Steven From haoyi.sg at gmail.com Fri Nov 22 05:44:14 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Thu, 21 Nov 2013 20:44:14 -0800 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: <20131122034521.GH2085@ando> References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> <20131122034521.GH2085@ando> Message-ID: > But *naming things* is one of the simplest, most powerful techniques for understanding that human beings have in their mental toolbox On a philosophical level, there's another simple and powerful technique: putting things somewhere. Many things are defined entirely by their location and only incidentally by their name. The city center isn't defined by the thing called Main Street, but by some vague notion of "there". > Consider also Python tracebacks, you'll see something like this: Here's somewhere where I think we disagree. When I see tracebacks, I don't care what the snippet that caused it looks like in the traceback. I care about the line number and file name (I don't know how long you spend looking at tracebacks before going to the file/line). This is my point, that often the location alone is enough, and the explicit name doesn't help all that much. > I expect it will be rather like dealing with one of those conversations where nobody is ever named directly: Have you tried reading an essay/paper which doesn't use he/she/it? I have, and it melts the eyes. Explicitly naming everything instead of using "it" sometimes does *not* make things more clear! > People might type + when they mean -, or .. when they mean a single dot, or a dot when they mean >, or any error at all. Just because any error has a non-zero change of happening doesn't mean they're all equally probably. 
In particular, confusing variable names with some language-blessed syntax is almost unheard of. Nobody accidentally shadows *self* in python, or *this* in java, or "_" in scala. Yes, they can, but it just doesn't happen. n != 0, m != 0 doesn't mean n == m > That's an argument against using dumb variable names, not an argument for using variable names. We're talking in circles though: - You should name all your variables because naming them short isn't harder than not-naming them - We should name them long, because short names are dumb - We should name them short, because long names are verbose and obscure the logic. - GOTO 1 People keep saying "you should do this" and "you should do that", and of course both the suggestions can solve all problems, except that you can't possibly apply both suggestions at the same time -.- On Thu, Nov 21, 2013 at 7:45 PM, Steven D'Aprano wrote: > On Thu, Nov 21, 2013 at 09:48:56AM -0800, Haoyi Li wrote: > > > Is it really that hard to name a plot "p"? Is typing "p.", or reading > it, > > more work than ".."? > > > > Yes, reading `..` can be *considerably* less work than reading `p.`. > > - With `..` you know exactly where the thing is coming from (the > preceding > > lines) whereas with `p.`, `p` could come from anywhere > > If this argument were good, it would be an argument against naming in > general. But *naming things* is one of the simplest, most powerful > techniques for understanding that human beings have in their mental > toolbox, so much so that people tend to name things which don't even > exist -- "cold", "dark", "death", "silence" etc. all of which are merely > the absense of something rather than something. > > But I digress. Explicitly named variables are nearly always better than > implicit, unnamed variables. It's either to talk about "do something > with foo" than "do something with the thing that you last used". In > order to reason about such an anonymous implicit block, I reckon that > most people will mentally translate it into a named variable. > > When you have an implicit operand: > > ..method() # implicitly operates on some unnamed operand > > that's rather like having an (invisible) placeholder variable: > > (it)..method() > > where "it" is implicitly defined elsewhere. I've used such a language, > Hypertalk, where you can write code like this: > > get the number of items of field "Counter" > put it + 2 into field "Result" > > Kind of cool, right? I'm not entirely against the idea. But when you > have longer blocks of code, it soon becomes painful to keep track of the > implicit "it", and named variables become much more sensible. What is a > shortcut for *writing* code becomes a longcut for *reading* and > *comprehending* code. > > If you allow multiple levels of double-dot implicit variables, it > becomes much harder to track what the implicit invisible variable is > at any such time. The indentation helps, I agree, but even so, I expect > it will be rather like dealing with one of those conversations where > nobody is ever named directly: > > "Then he said that she told him that she went to the party he > threw last weekend and saw her there, and he tells me that he's > upset that he tried to hit on her and he reckons that she wasn't > doing enough to discourage him, but I spoke to her and she > reckons it was just a bit of harmless flirting and he's just > being unreasonable, absolutely nothing happened between her and > him and if he keeps on like this there's be nothing happening > between her and him either..." 
> > > Consider also Python tracebacks, you'll see something like this: > > Traceback (most recent call last): > File "spam.py", line 123, in > main() > File "spam.py", line 97, in main > func() > File "spam.py", line 26, in func > ..draw() > TypeError: draw() takes at least 1 argument (0 given) > > In general, I'd much prefer to see the last two lines with an explicitly > named variable: > > my_artist.draw() > TypeError: draw() takes at least 1 argument (0 given) > > > but of course that's not available with the .. proposed syntax. > > > > > - when you're eyes are tired you better look closely to make sure you > > didn't accidentally write `b` which may mean something else in this scope > > and cause your program to silently malfunction. > > "People might use the wrong variable name if they're coding while tired, > or drunk, or simply careless" is not a good argument against using > explicit variable names. People might type + when they mean -, or .. > when they mean a single dot, or a dot when they mean >, or any error at > all. > > I find it ironic that you are defending syntax that looks like a speck > of dust on the monitor, or a dead pixel, on behalf of people who have > tired eyes. > > > > - You also better make sure you didn't have a variable somewhere else > > called `p` for Pressure in one of your equations which you just stomped > > over, or `p` for Momentum, or Price, or Probability. It's not > inconceivable > > that you'd want to plot these things! > > That's an argument against using dumb variable names, not an argument > for using variable names. > > > > Generally, having fewer things in the local namespace is good hygiene, > and > > helps prevent name collisions. This is separate from the issue of the > > proposed syntax (which I don't really like) and whether it's useful > enough > > to warrant special syntax (not sure) > > This at least I agree with. > > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Fri Nov 22 06:22:03 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 22 Nov 2013 08:22:03 +0300 Subject: [Python-ideas] CC0 for Python Documentation Message-ID: Adding python-legal-sig at python.org to CC. Please, follow up on python-ideas. CC0 is a way to free public works from legal burden: https://creativecommons.org/about/cc0 Here is the reasoning why people do this: https://creativecommons.org/tag/cc0 At first I thought about CC-BY, but then realized that no authorship is respected. As you may see here - http://docs.python.org/3/copyright.html - PSF is the sole owner of the docs with no reference to the work of people who have contributed. No wonder that there is not much motivation to collaborate. So, given all the above I'd like to propose using CC0 for Python documentation. Benefits: - you don't need to ask PSF for permissions and clarification of your rights - you can still count and credit contributions regardless of is there is copyright signature of the owner or not - this also makes it clear that docs are from community for community, you can fork and enhance - you don't have to sign exclusive CLA to make edits to documentation - you don't have to supply huge license file if you copy/paste relevant pieces from the docs Now the questions that needs to be answered. PSF is made to protect Python. 
How sitting on top of Python documentation copyright helps it to do so? What are consequences if Python Documentation is released with CC0 license? Do you think it will hurt Python? If yes, then how? Do you think that current CLA is impediment for contributing patches to documentation? Do you think that using CC0 will increase contributions and tools for working with Python docs? Do you think that current situation is better? Do you think that CC-BY is better? Do you think that CC-BY-SA is better? It looks like a poll. Maybe PSF should create one? -- anatoly t. From rosuav at gmail.com Fri Nov 22 06:43:18 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 22 Nov 2013 16:43:18 +1100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> <20131122034521.GH2085@ando> Message-ID: On Fri, Nov 22, 2013 at 3:44 PM, Haoyi Li wrote: > - You should name all your variables because naming them short isn't harder > than not-naming them Converse to this: Sometimes it's a LOT clearer to show that you're working twice with the same object than to show exactly what object you're working with each time. Consider it a case of DRY. Chaining methods shows that you're doing three things to one object; naming the object each time separates them. Both have their uses. ChrisA From steve at pearwood.info Fri Nov 22 07:05:56 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Nov 2013 17:05:56 +1100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> <20131122034521.GH2085@ando> Message-ID: <20131122060555.GI2085@ando> On Thu, Nov 21, 2013 at 08:44:14PM -0800, Haoyi Li wrote: > > But *naming things* is one of the simplest, most powerful techniques for > understanding that human beings have in their mental toolbox > > On a philosophical level, there's another simple and powerful technique: > putting things somewhere. Many things are defined entirely by their > location and only incidentally by their name. The city center isn't defined > by the thing called Main Street, but by some vague notion of "there". Perhaps, but that's not relevant to the proposed syntax. The proposed syntax has more to do with "then" rather than "there", that is, "the last object used", not "the object at location 1234". > > Consider also Python tracebacks, you'll see something like this: > > Here's somewhere where I think we disagree. When I see tracebacks, I don't > care what the snippet that caused it looks like in the traceback. I care > about the line number and file name (I don't know how long you spend > looking at tracebacks before going to the file/line). This is my point, > that often the location alone is enough, and the explicit name doesn't help > all that much. Funny you say that. I hardly ever pay attention to the line number in tracebacks. Most of the time, seeing the immediate call and the relevant line of code is enough for me to zero in on the relevant section ("ah, that failure is in the foo function...."). Almost the only time I care about the line number is when debugging code that I'm not familiar with. But that's okay. I'm not proposing that we strip line numbers just because they're of minimal use to me. I understand that they're of use to some people, and even of me sometimes. So line numbers help in debugging. So does printing the actual line of source code. 
In the interactive interpreter, where it is not available, I often miss it. Consider: gunslinger.draw() # looks obviously correct artist.draw() # looks obviously incorrect, a drawing needs a subject but: ..draw() # no idea One of the costs, and in my opinion a severe cost, of this proposed syntax is that it encourages a style of writing code which is optimized for writing instead of reading. We write code much more often than we write it. > > People might type + when they mean -, or .. when they mean a single dot, > or a dot when they mean >, or any error at all. > > Just because any error has a non-zero change of happening doesn't mean > they're all equally probably. In particular, confusing variable names with > some language-blessed syntax is almost unheard of. You've deleted the context of my objection. You objected to the obvious solution of a temporary variable name (say, p) on the basis that when the coder is tired, he might write: p = MyClass() p.spam() b.eggs() by mistake. True, but he also might write: p = MyClass( ..spam() .eggs() # and somewhere further down the code base ) The "tired programmer" objection can be applied to *anything*. I've written . instead of , when tired, **args instead of *args, length(mylist) instead of len(mylist), and just about every error a programmer can make. Out of the infinity of possible errors a tired person might make, why single out the risk of misusing a named temp variable? > Nobody accidentally > shadows *self* in python, I've seen plenty of code that shadows self. > We're talking in circles though: > > - You should name all your variables because naming them short isn't harder > than not-naming them > - We should name them long, because short names are dumb No. It depends on the name. "for i in range(100)" is fine, i as an integer variable is fine. i = MyClass() is not. Dumb names are dumb because they are misleading, not because they're short. > - We should name them short, because long names are verbose and obscure the > logic. I've certainly never made that argument. It is possible to have names which are too long, but that's a style issue. Don't call something the_variable_holding_an_integer_value when "myint" or even "i" will do. -- Steven From ppershing at gmail.com Fri Nov 22 09:23:29 2013 From: ppershing at gmail.com (=?UTF-8?B?UGVyZcWhw61uaSBQZXRlcg==?=) Date: Fri, 22 Nov 2013 09:23:29 +0100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: <20131122021636.GG2085@ando> References: <20131122021636.GG2085@ando> Message-ID: Thanks Steve for very useful comments > Please give examples of how this might do such a thing, rather than just > claim it will. "This syntax will end world hunger and bring peace to the > Middle East..." I don't see how this syntax can simplify instantiating > objects: > > obj = MyClass("spam", "eggs") > > is already pretty simple. Perhaps you're referring only to a small > subset of (in my opinion) *poorly designed* if not outright buggy > objects which aren't instantiated completely on creation and need to be > tweaked by hand before being ready to use. Or those with excessively > complicated APIs that could really do with a few helper functions. > I do not agree that all objects that are mutable are poorly designed. Anything which is tree-like (e.g. mentioned DSLs, API wrappers around complicated objects (e.g. xml), configurations, gnuplot example, ...) 
can be difficult to instantiate using normal syntax: tree = Tree() ..newChild(value=5) ..newChild(value=3) ..newChild() ..newChild(color="green") Now, that said, sometimes you can make this instantiation more Pythonic, e.g. (using lists instead of repetitive calls of "addSomething") but it is hard to visually parse (distinguish between end of constructors vs end of lists): tree = Tree(children=[ Node(value=5, children=[ Node(value=3), Node() ]), Node(color="green") ]) An alternative is to do the inverse of the instantiation, e.g. tmp1 = [Node(value=3), Node()] tmp2 = [Node(value=5, children=tmp1), Node(color="green")] tree = Tree(children=tmp2) which makes it hard to comprehend what is going on there. > > In particular, we can > > make this much more powerful in Python (as opposed to Dart) because > Python > > recognizes scope by indentation and therefore it would be possible to do > > something like > > > > gnuplot.newPlot() > > ..set("xrange [0:5]") > > ..set("yrange [0:20]") > > ..newPlot() > > ..addSeries("Linear", [1,2,3]) > > ..addSeries("Quadratic", [1,4,6]) > > ..run() > > I had to read that multiple times before I was able to interpret what > this is supposed to mean. It doesn't help that I'm not familiar enough > with the gnuplot API to tell exactly what you're doing. I *think* that > it would be the equivalent of this: > > p = gnuplot.newPlot() > p.set("xrange [0:5]") > p.set("yrange [0:20]") > q = p.newPlot() > q.addSeries("Linear", [1,2,3]) > q.addSeries("Quadratic", [1,4,6]) > p.run() > > Yes, this is correct > If I'm wrong, I think that suggests that your extension to the syntax > isn't as clear as you hoped. If I'm right, I wonder what the point of > the inner sub-block is, since you don't appear to do anything with the q > plot. What have I missed? > I apologize for the confusion. Maybe newPlot() isn't the best name. In my head the newPlot() function would - add a new Plot to the Gnuplot object - return this Plot object so you can work with it (e.g. add series, customize, etc.) When I reflect on this example, the p.set("xrange") is also a bit misleading to people not familiar with gnuplot -- gnuplot has a "set" command and p.set("xrange [0:1]") was meant to imitate writing the "set xrange[0:1]" line in gnuplot > For starters, you should explain precisely how the compiler will > determine what is the "last object used" in practice. Your examples > so far suggest it is based on the last line of source code, but > that's tricky. For example: > > x = spam(); y = eggs() > ..method() > > I would *assume* that this implicitly calls y.method rather than > x.method, but that's not a given. How about this? > I would assume the same > > y = spam() > x = eggs() > del y > ..method() > > Does that call x.method()? And what does this do? > > I think the reasonable way to define it would be that the cascade operator will act only if the last thing (on the previous line) was an expression. In particular, all statements like "del y, for, while, if-else" followed by a cascade should be a parse error with the exception of the assignment which is a statement but we want to allow it. I am not sure about yield -- it can be a bit difficult to agree on the exact semantics there (e.g. should I yield first, then continue with the cascade or should I execute the whole cascade and yield as a final step?) > myClass() > for i in range(3): > ..method() > > Is it a syntax error? Or does it call i.method()? Or perhaps even call > method on the range object?
Or does the myClass object still count as > the implicit "last object used"? > I think this should be a syntax error. (as it anyway defeats the purpose of making things clear) -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Fri Nov 22 10:32:48 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 22 Nov 2013 18:32:48 +0900 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> Message-ID: <8761rkg4sf.fsf@uwakimon.sk.tsukuba.ac.jp> Haoyi Li writes: >>?Is it really that hard to name a plot "p"? Is typing "p.", >> or reading it, more work than ".."? > Yes, reading `..` can be *considerably* less work than reading `p.`.? Unless you're Tim Peters, in which case ".." looks like grit on your screen. Surely we wouldn't want *that*! Seriously, as far as I can see it's also possible that reading ".." could be much /more? work than reading "p.", especially if ".." can be nested. You may need to parse backward all the way to the beginning of the expression (suite?) to figure out what it refers to. > - You also better make sure you didn't have a variable somewhere else > called `p` for Pressure in one of your equations which you just > stomped over, or `p` for Momentum, or Price, or Probability. It's not > inconceivable that you'd want to plot these things! Well, that's *really* bad style. Such programmers can't be helped. They'll find a way to abuse "..", too. For example, people often recommend reserving something like "_" (or perhaps all single letter identifiers) only in this kind of context where it's used repeatedly in each of a suite of contiguous lines. With such a convention (especially once it becomes habit), the kind of problem you describe is just not going to arise. > Generally, having fewer things in the local namespace is good hygiene, > and helps prevent name collisions. True, but that is easy enough to get by defining a function. Steve From ppershing at gmail.com Fri Nov 22 10:32:13 2013 From: ppershing at gmail.com (=?UTF-8?B?UGVyZcWhw61uaSBQZXRlcg==?=) Date: Fri, 22 Nov 2013 10:32:13 +0100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <20131122021636.GG2085@ando> Message-ID: On Fri, Nov 22, 2013 at 10:16 AM, Bruce Leban wrote: > > Steven D'Aprano: > > >> Perhaps, but that's not relevant to the proposed syntax. The proposed > syntax has more to do with "then" rather than "there", that is, "the > last object used", not "the object at location 1234". > > >> For starters, you should explain precisely how the compiler will > determine what is the "last object used" in practice. Your examples so far > suggest it is based on the last line of source code, but that's tricky. > > Pere??ni Peter: > > > I think the reasonable way to define it would be that the cascade > operator will act only if the last thing (on the previous line) was an > expression. In particular, all statements like "del y, for, while, if-else" > followed by a cascade should be a parse error with the exception of the > assignment which is a statement but we want to allow it. I am not sure > about yield -- it can be a bit difficult to agree on the exact semantic > there (e.g. should I yield first, then continue with the cascade or should > I execute the whole cascade and yield as a final step?) > > I don't think that's well-defined at all. 
Furthermore last object on > *previous* line is less useful than object on line that introduces the > suite. Consider: > > obj:: > ..x = b > # is last object b? That's not useful. obj.x? That may not be an object > ..y = 3 > # even more clearly not useful > ..sort() > # this always returns None > > In: > > :: > > Yes, exactly. As usual I did not write my thoughts clearly. By the last line I really meant "last object on the preceding line ending with :: having the smaller indent than the current line" > I know exactly what that does without reading each intervening line which > could change the object being operated on. I can use this with objects that > *don't* support chaining which is what makes it useful. I can reorder lines > worrying only about method execution order not breaking chaining. > > The traceback complaint is a red herring as it can easily inject the line > that has the expression being operated on. > > In summary I would only allow the first line to contain an expression or > assignment which by definition has a single explicit value. That values is > what all statements reference: > > obj:: > ..x = 3 > ..y() # operates on obj not obj.x or 3 > ..z(). # ditto > > a[index] = foo(...) :: # don't need to repeat a[index] below or use a > temporary name > ..x = 3 > ..y() > ..z() > >> >> I think the reasonable way to define it would be that the cascade >> operator will act only if the last thing (on the previous line) was an >> expression. In particular, all statements like "del y, for, while, if-else" >> followed by a cascade should be a parse error with the exception of the >> assignment which is a statement but we want to allow it. I am not sure >> about yield -- it can be a bit difficult to agree on the exact semantic >> there (e.g. should I yield first, then continue with the cascade or should >> I execute the whole cascade and yield as a final step?) >> >> > Actually, thinking about my own ideas -- the bigger problem than yield is the return. I am not sure if things like return list(range(10)) ..reverse() ..pop(0) should be allowed or not - having something being executed after the return statement line might be a bit confusing -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruce at leapyear.org Fri Nov 22 10:16:48 2013 From: bruce at leapyear.org (Bruce Leban) Date: Fri, 22 Nov 2013 01:16:48 -0800 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <20131122021636.GG2085@ando> Message-ID: Steven D'Aprano: >> Perhaps, but that's not relevant to the proposed syntax. The proposed syntax has more to do with "then" rather than "there", that is, "the last object used", not "the object at location 1234". >> For starters, you should explain precisely how the compiler will determine what is the "last object used" in practice. Your examples so far suggest it is based on the last line of source code, but that's tricky. Pere??ni Peter: > I think the reasonable way to define it would be that the cascade operator will act only if the last thing (on the previous line) was an expression. In particular, all statements like "del y, for, while, if-else" followed by a cascade should be a parse error with the exception of the assignment which is a statement but we want to allow it. I am not sure about yield -- it can be a bit difficult to agree on the exact semantic there (e.g. should I yield first, then continue with the cascade or should I execute the whole cascade and yield as a final step?) 
I don't think that's well-defined at all. Furthermore last object on *previous* line is less useful than object on line that introduces the suite. Consider: obj:: ..x = b # is last object b? That's not useful. obj.x? That may not be an object ..y = 3 # even more clearly not useful ..sort() # this always returns None In: :: I know exactly what that does without reading each intervening line which could change the object being operated on. I can use this with objects that *don't* support chaining which is what makes it useful. I can reorder lines worrying only about method execution order not breaking chaining. The traceback complaint is a red herring as it can easily inject the line that has the expression being operated on. In summary I would only allow the first line to contain an expression or assignment which by definition has a single explicit value. That values is what all statements reference: obj:: ..x = 3 ..y() # operates on obj not obj.x or 3 ..z(). # ditto a[index] = foo(...) :: # don't need to repeat a[index] below or use a temporary name ..x = 3 ..y() ..z() On Nov 22, 2013 12:24 AM, "Pere??ni Peter" wrote: > Thanks Steve for very useful comments > > >> Please give examples of how this might do such a thing, rather than just >> claim it will. "This syntax will end world hunger and bring peace to the >> Middle East..." I don't see how this syntax can simplify instantiating >> objects: >> >> obj = MyClass("spam", "eggs") >> >> is already pretty simple. Perhaps you're referring only to a small >> subset of (in my opinion) *poorly designed* if not outright buggy >> objects which aren't instantiated completely on creation and need to be >> tweaked by hand before being ready to use. Or those with excessively >> complicated APIs that could really do with a few helper functions. >> > > I do not agree that all objects that are mutable are poorly designed. > Anything which is tree-like (e.g. mentioned DSLs, API wrappers around > complicated objects (e.g. xml), configurations, gnuplot example, ...) can > be difficult to instantiate using normal syntax: > > tree = Tree() > ..newChild(value=5) > ..newChild(value=3) > ..newChild() > ..newChild(color="green") > > > Now, that said, sometimes you can turn this instatiation to be more > Pythonic, e.g. (using lists instead of repetitive calls of "addSomething") > but it is hard to visually parse (distinguish between end of constructors > vs end of lists): > > tree = Tree(children=[ > Node(value=5, children=[ > Node(value=3), > Node() > ]), > Node(color="green") > ]) > > Alternative is to do inverse of instantiation, e.g. > > tmp1 = [Node(value=3), > Node()] > tmp2 = [Node(value=5, children=tmp1), > Node(color="green")] > tree = Tree(children=tmp2) > > which is hard to comprehend what is going on there. > > >> > In particular, we can >> > make this much more powerful in Python (as opposed to Dart) because >> Python >> > recognizes scope by indentation and therefore it would be possible to do >> > something like >> > >> > gnuplot.newPlot() >> > ..set("xrange [0:5]") >> > ..set("yrange [0:20]") >> > ..newPlot() >> > ..addSeries("Linear", [1,2,3]) >> > ..addSeries("Quadratic", [1,4,6]) >> > ..run() >> >> I had to read that multiple times before I was able to interpret what >> this is supposed to mean. It doesn't help that I'm not familiar enough >> with the gnuplot API to tell exactly what you're doing. 
I *think* that >> it would be the equivalent of this: >> >> p = gnuplot.newPlot() >> p.set("xrange [0:5]") >> p.set("yrange [0:20]") >> q = p.newPlot() >> q.addSeries("Linear", [1,2,3]) >> q.addSeries("Quadratic", [1,4,6]) >> p.run() >> >> Yes, this is correct > > >> If I'm wrong, I think that suggests that your extension to the syntax >> isn't as clear as you hoped. If I'm right, I wonder what the point of >> the inner sub-block is, since you don't appear to do anything with the q >> plot. What have I missed? >> > > I apologize for confusion. maybe newPlot() isn't the best name. In my head > the newPlot() function would > - add a new Plot to the Gnuplot object > - return this Plot object so you can work with it (e.g. add series, > customize, etc.) > > When I reflect about this example, the p.set("xrange") is also a bit > misleading to people not familiar with gnuplot -- gnuplot has "set" command > and p.set("xrange [0:1]") was meant to imitate writing "set xrange[0:1]" > line in gnuplot > > > > >> For starters, you should explain precisely how the compiler will >> determine what is the "last object used" in practice. Your examples >> so far suggest it is based on the last line of source code, but >> that's tricky. For example: >> >> x = spam(); y = eggs() >> ..method() >> >> I would *assume* that this implicitly calls y.method rather than >> x.method, but that's not a given. How about this? >> > > I would assume the same > > >> >> y = spam() >> x = eggs() >> del y >> ..method() >> >> Does that call x.method()? And what do this do? >> >> > I think the reasonable way to define it would be that the cascade operator > will act only if the last thing (on the previous line) was an expression. > In particular, all statements like "del y, for, while, if-else" followed by > a cascade should be a parse error with the exception of the assignment > which is a statement but we want to allow it. I am not sure about yield -- > it can be a bit difficult to agree on the exact semantic there (e.g. should > I yield first, then continue with the cascade or should I execute the whole > cascade and yield as a final step?) > > >> myClass() >> for i in range(3): >> ..method() >> >> Is it a syntax error? Or does it call i.method()? Or perhaps even call >> method on the range object? Or does the myClass object still count as >> the implicit "last object used"? >> > > I think this should be a syntax error. (as it anyway defeats the purpose > of making things clear) > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Fri Nov 22 12:33:09 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 22 Nov 2013 22:33:09 +1100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <20131122021636.GG2085@ando> Message-ID: <20131122113308.GJ2085@ando> On Fri, Nov 22, 2013 at 10:32:13AM +0100, Pere??ni Peter wrote: > Actually, thinking about my own ideas -- the bigger problem than yield is > the return. I am not sure if things like > > return list(range(10)) > ..reverse() > ..pop(0) > > should be allowed or not - having something being executed after the return > statement line might be a bit confusing If Python gains this syntax, I don't see why that shouldn't be allowed. 
That some of the lines of code occur physically after the return keyword is no more a problem than here: return (spam() + eggs() + toast() + milk() + cookies() ) In your case, the suite list(range(10)) ..reverse() ..pop(0) is evaluated first, and then returned. -- Steven From mal at egenix.com Fri Nov 22 14:10:14 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 22 Nov 2013 14:10:14 +0100 Subject: [Python-ideas] CC0 for Python Documentation In-Reply-To: References: Message-ID: <528F57B6.5010302@egenix.com> On 22.11.2013 06:22, anatoly techtonik wrote: > Adding python-legal-sig at python.org to CC. Please, > follow up on python-ideas. Please don't cross post. > CC0 is a way to free public works from legal burden: > https://creativecommons.org/about/cc0 > > Here is the reasoning why people do this: > https://creativecommons.org/tag/cc0 > > At first I thought about CC-BY, but then realized that no > authorship is respected. As you may see here - > http://docs.python.org/3/copyright.html - PSF is the sole > owner of the docs with no reference to the work of people > who have contributed. No wonder that there is not much > motivation to collaborate. The documentation is distributed under the same license terms as Python itself. Credits are included in the Misc/ACKS file and the patch history is both on the tracker and the Mercurial log. We don't treat documentation as separate from the code itself. Both go together hand in hand. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 22 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From techtonik at gmail.com Fri Nov 22 15:10:22 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 22 Nov 2013 17:10:22 +0300 Subject: [Python-ideas] CC0 for Python Documentation In-Reply-To: <528F57B6.5010302@egenix.com> References: <528F57B6.5010302@egenix.com> Message-ID: On Fri, Nov 22, 2013 at 4:10 PM, M.-A. Lemburg wrote: > >> CC0 is a way to free public works from legal burden: >> https://creativecommons.org/about/cc0 >> >> Here is the reasoning why people do this: >> https://creativecommons.org/tag/cc0 >> >> At first I thought about CC-BY, but then realized that no >> authorship is respected. As you may see here - >> http://docs.python.org/3/copyright.html - PSF is the sole >> owner of the docs with no reference to the work of people >> who have contributed. No wonder that there is not much >> motivation to collaborate. > > The documentation is distributed under the same license > terms as Python itself. Credits are included in the > Misc/ACKS file and the patch history is both on the > tracker and the Mercurial log. This information is not accessible. Nobody knows where this Misc/ACKS file is located and nobody will go look for it. On the other hand, clicking copyright string is an easy action. Mercurial log also only makes sense if it is analysed http://www.red-bean.com/svnproject/contribulyzer/ > We don't treat documentation as separate from the code itself. 
> Both go together hand in hand. That's good only for reference part. All other parts are largely outdated, incomplete, lack tutorials and examples. That happens, because docs are written by coders the same way as code, and not by users for users. -- anatoly t. From solipsis at pitrou.net Fri Nov 22 16:02:21 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 22 Nov 2013 16:02:21 +0100 Subject: [Python-ideas] CC0 for Python Documentation References: <528F57B6.5010302@egenix.com> Message-ID: <20131122160221.79690e99@fsol> On Fri, 22 Nov 2013 14:10:14 +0100 "M.-A. Lemburg" wrote: > > > CC0 is a way to free public works from legal burden: > > https://creativecommons.org/about/cc0 > > > > Here is the reasoning why people do this: > > https://creativecommons.org/tag/cc0 > > > > At first I thought about CC-BY, but then realized that no > > authorship is respected. As you may see here - > > http://docs.python.org/3/copyright.html - PSF is the sole > > owner of the docs with no reference to the work of people > > who have contributed. No wonder that there is not much > > motivation to collaborate. > > The documentation is distributed under the same license > terms as Python itself. Credits are included in the > Misc/ACKS file and the patch history is both on the > tracker and the Mercurial log. Anatoly has a point, though: why does the doc claim Python is "copyright PSF" while that is not true - and the LICENSE file doesn't make any such claim? (uh, I would prefer the followup to have been on the legal SIG - python-ideas is pretty much off-topic for this :-() Regards Antoine. From Andy.Henshaw at gtri.gatech.edu Fri Nov 22 16:24:22 2013 From: Andy.Henshaw at gtri.gatech.edu (Henshaw, Andy) Date: Fri, 22 Nov 2013 15:24:22 +0000 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: Message-ID: <97eeac3782ce4b7ba20f6e05332c9b22@APATLISDMAIL02.core.gtri.org> For what it's worth, if this idea ever gets to the bikeshedding phase, I think the following looks pretty nice in my editor: gnuplot.newPlot():: |set("xrange [0:5]") |set("yrange [0:20]") |newPlot():: |addSeries("Linear", [1,2,3]) |addSeries("Quadratic", [1,4,6]) |run() -- Andy Henshaw -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Fri Nov 22 16:44:34 2013 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 23 Nov 2013 02:44:34 +1100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: <97eeac3782ce4b7ba20f6e05332c9b22@APATLISDMAIL02.core.gtri.org> References: <97eeac3782ce4b7ba20f6e05332c9b22@APATLISDMAIL02.core.gtri.org> Message-ID: On Sat, Nov 23, 2013 at 2:24 AM, Henshaw, Andy wrote: > For what it?s worth, if this idea ever gets to the bikeshedding phase... It gets to that phase the moment it's posted on python-ideas. Have at it! ChrisA From greg at krypto.org Fri Nov 22 17:45:50 2013 From: greg at krypto.org (Gregory P. Smith) Date: Fri, 22 Nov 2013 08:45:50 -0800 Subject: [Python-ideas] making a module callable In-Reply-To: References: <528D7287.7040305@stoneleaf.us> Message-ID: On Wed, Nov 20, 2013 at 7:44 PM, Nick Coghlan wrote: > > On 21 Nov 2013 13:02, "Ethan Furman" wrote: > > > > On 11/20/2013 05:57 PM, Andrew Barnert wrote: > > > >> On Nov 20, 2013, at 12:14, Eric Snow wrote: > >> > >>> In contrast, something like __metamodule__ would be an effective > >>> replacement. 
It would be similar in spirit and in syntax to > >>> __init_class__ in PEP 422 (and __metaclass__ in Python 2), defined at > >>> the top of the module and used for the module. The thing that appeals > >>> to me is that we could deprecate the sys.modules hack. :) > >> > >> > >> Given that __metaclass__ was removed in Python 3, doesn't "this is an > >> exact parallel to __metaclass__" argue against the idea, rather than > >> for? Or at least against the name? (Maybe __init_module__?) > >> > >> Anyway, I think a module replacing itself with something callable is > >> both more flexible and more in line with the way people are actually > >> doing things today, so maybe a "less hacky" way to do the sys.modules > >> hack is what people actually want here. > > > > > > sys.modules is a dictionary. > > > > dict[name] = something > > > > is the normal way to set values to keys. > > > > What is so horrible about this idiom? > > It potentially causes problems for module reloading and it definitely > causes problems for the import engine PEP (since sys.modules is process > global state). > > I expect we'll revisit this later in the 3.5 development cycle (we haven't > even merged the accepted PEP 451 for 3.4 yet), but formalising the current > module replacement idiom is already a more likely outcome than making > module instances callable. > Agreed, formalizing how to do the replacement trick sounds good. It'd be ideal for a module to not need to know its own name within the code doing the replacement and for it to not need to reimplement common interface bits in a replacement class so that it quacks like a module. Making it callable? well, that does just seem silly so I'm not actually worried about making that specifically easier itself. > Cheers, > Nick. > > > > > -- > > ~Ethan~ > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fuzzyman at gmail.com Fri Nov 22 18:01:17 2013 From: fuzzyman at gmail.com (Michael Foord) Date: Fri, 22 Nov 2013 17:01:17 +0000 Subject: [Python-ideas] making a module callable In-Reply-To: References: <528D7287.7040305@stoneleaf.us> Message-ID: On 22 November 2013 16:45, Gregory P. Smith wrote: > > > > On Wed, Nov 20, 2013 at 7:44 PM, Nick Coghlan wrote: > >> >> On 21 Nov 2013 13:02, "Ethan Furman" wrote: >> > >> > On 11/20/2013 05:57 PM, Andrew Barnert wrote: >> > >> >> On Nov 20, 2013, at 12:14, Eric Snow wrote: >> >> >> >>> In contrast, something like __metamodule__ would be an effective >> >>> replacement. It would be similar in spirit and in syntax to >> >>> __init_class__ in PEP 422 (and __metaclass__ in Python 2), defined at >> >>> the top of the module and used for the module. The thing that appeals >> >>> to me is that we could deprecate the sys.modules hack. :) >> >> >> >> >> >> Given that __metaclass__ was removed in Python 3, doesn't "this is an >> >> exact parallel to __metaclass__" argue against the idea, rather than >> >> for? Or at least against the name? (Maybe __init_module__?) 
>> >> >> >> Anyway, I think a module replacing itself with something callable is >> >> both more flexible and more in line with the way people are actually >> >> doing things today, so maybe a "less hacky" way to do the sys.modules >> >> hack is what people actually want here. >> > >> > >> > sys.modules is a dictionary. >> > >> > dict[name] = something >> > >> > is the normal way to set values to keys. >> > >> > What is so horrible about this idiom? >> >> It potentially causes problems for module reloading and it definitely >> causes problems for the import engine PEP (since sys.modules is process >> global state). >> >> I expect we'll revisit this later in the 3.5 development cycle (we >> haven't even merged the accepted PEP 451 for 3.4 yet), but formalising the >> current module replacement idiom is already a more likely outcome than >> making module instances callable. >> > Agreed, formalizing how to do the replacement trick sounds good. It'd be > ideal for a module to not need to know its own name within the code doing > the replacement and for it to not need to reimplement common interface bits > in a replacement class so that it quacks like a module. > > Well, it just has to use __name__. Any formalisation that doesn't get passed the name is going to have to use frame trickery to get the name which is just smelly. Michael > Making it callable? well, that does just seem silly so I'm not actually > worried about making that specifically easier itself. > > >> Cheers, >> Nick. >> >> > >> > -- >> > ~Ethan~ >> > >> > _______________________________________________ >> > Python-ideas mailing list >> > Python-ideas at python.org >> > https://mail.python.org/mailman/listinfo/python-ideas >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From haoyi.sg at gmail.com Fri Nov 22 18:12:53 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Fri, 22 Nov 2013 09:12:53 -0800 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <97eeac3782ce4b7ba20f6e05332c9b22@APATLISDMAIL02.core.gtri.org> Message-ID: > Dumb names are dumb because they are misleading, not because they're short. p for momentum, pressure or price is not misleading at all. It's standard terminology in their fields, and when you're writing heavy algebraic code, writing "momentum" all over the place obscures the logic of your equations. They're no dumber than `i` as an iterator variable, which suffers the exact same problem =( I'm not complaining that the names are misleading, I'm saying they pollute the namespace and cause collisions, unless they're misleading *because* of their collidiness, in which case I'm accidentally dumb many times a day. > The proposed syntax has more to do with "then" rather than "there", that is, "the last object used", not "the object at location 1234". I mean "there" meaning "there in the code", not "there in the computer". 
It's the lexically-scoped "there", the "there" you're looking at when you're looking at the source code of the program. I don't care what the memory location is. > Out of the infinity of possible errors a tired person might make, why single out the risk of misusing a named temp variable? - It's hard to spot "yeah, f = p * a looks right" (p in this scope means price). You often basically have to do a local dataflow analysis on your function to spot the problematic scopes. - Lots of the other errors you brought up can be found by linters, but I don't know of any linter smart enough to identify misuse of a temp variable. If it's harder for linters to spot, it's an indication that it's harder for humans to spot too. - The interpreter won't throw an exception to tell you you made a mistake, since it's not a SyntaxError or (often) a TypeError/ValueError - Often temp variables with the same name often have similar contents, meaning the program will work fine but silently produce incorrect results = data corruption yay I mean, tired people making SyntaxErrors and crashing their process (the example you gave), or ValueErrors/TypeErrors and crashing production services, aren't that big a deal v.s. silently corrupting large quantities of data by sneakily giving wrong answers. > Funny you say that. I hardly ever pay attention to the line number in tracebacks I guess we're different. You don't need to keep repeating to me "code is read more times than it's written" just because I read code differently than you =( > You may need to parse backward all the way to the beginning of the expression (suite?) to figure out what it refers to. But with block of code all referring to (setting attributes, calling methods on) an actual name (e.g. p), you may need to: - Parse *the entire file and all its imports* to figure out where it comes from! - Look closely to make sure the intermediate values of `p` aren''t being passed around where they shouldn't - Make sure nobody accidentally sets `p` in between (e.g. [abs(p) for p in prices] in one of the right-hand-sides) - Trace p up the block to see how it's being used; is it only having attributes assigned to? Having methods called on it? Is it being passed in as a function parameter and the return value being used to replace itself? With a cascading syntax, you don't need to do any of these things. Yeah, you could argue that the name should tell you exactly what it does. No, I don't think that's realistic in practice. > True, but that is easy enough to get by defining a function. We'll just have to disagree about the "enough" part. I guess I'm far lazier than you are =D. Furthermore, the non-locality of a function call ("where is this function defined") combined with the visibility ("is this function only being called here, or at multiple callsites") is a disadvantage when trying to read code. I'm not wedding for the current ".." syntax; I agree it looks terrible too, including when python's own RestructuredText does it. In general though, the fact that cascading is so limited compared to "just assigning a temporary var" is something I consider a good thing. When I see a cascading block/expression/whatever, I know immediately what the dataflow looks like: start |/ / / / |/ / / |/ / |/ end With a bunch of inputs coming from the top, joining up with the one constant implicit "this" on the right. There are no other places where data is combined. 
On the other hand, when I see a long block with temporary variables, maybe the dataflow looks like that, but it could also look like this: start |\ /| / |/|\|/ |\|/| |/|\ \ end With a bunch of variables coming in the top, a bunch of variables going out the bottom, and a jumble of ad-hoc recombination of everything in the middle. I then have to construct the dataflow graph in my head before I can figure out what changes will affect which downstream values, whereas in the cascading case such truths are self evident. You could always say "yeah, but that code is sucky spaghetti, it should be broken up into functions" but you can't say "i never write that", because in the end we all contribute to the communal pool of sucky spaghetti. What would be nice are tools to not just express intent (e.g. the identity context-manager) but enforce it, and I think a cascading operator is one of those useful tools. No matter how hard a deadline you're under, you can't just insert extra edges in the cascading-block's dataflow graph without breaking it up. Someone looking at the cascading-block has a hard-guarantee that the section of code he's looking at obeys certain strict (and reasonably intuitive) properties, and doesn't need to spend time tracing dataflow graphs in his head. That's why I consider them easier to read. On Fri, Nov 22, 2013 at 7:44 AM, Chris Angelico wrote: > On Sat, Nov 23, 2013 at 2:24 AM, Henshaw, Andy > wrote: > > For what it?s worth, if this idea ever gets to the bikeshedding phase... > > It gets to that phase the moment it's posted on python-ideas. Have at it! > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Fri Nov 22 19:02:00 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Fri, 22 Nov 2013 19:02:00 +0100 Subject: [Python-ideas] CC0 for Python Documentation In-Reply-To: References: <528F57B6.5010302@egenix.com> Message-ID: <528F9C18.7030003@egenix.com> On 22.11.2013 15:10, anatoly techtonik wrote: > On Fri, Nov 22, 2013 at 4:10 PM, M.-A. Lemburg wrote: >> >>> CC0 is a way to free public works from legal burden: >>> https://creativecommons.org/about/cc0 >>> >>> Here is the reasoning why people do this: >>> https://creativecommons.org/tag/cc0 >>> >>> At first I thought about CC-BY, but then realized that no >>> authorship is respected. As you may see here - >>> http://docs.python.org/3/copyright.html - PSF is the sole >>> owner of the docs with no reference to the work of people >>> who have contributed. No wonder that there is not much >>> motivation to collaborate. >> >> The documentation is distributed under the same license >> terms as Python itself. Credits are included in the >> Misc/ACKS file and the patch history is both on the >> tracker and the Mercurial log. > > This information is not accessible. Nobody knows where this > Misc/ACKS file is located and nobody will go look for it. On > the other hand, clicking copyright string is an easy action. > Mercurial log also only makes sense if it is analysed > http://www.red-bean.com/svnproject/contribulyzer/ > >> We don't treat documentation as separate from the code itself. >> Both go together hand in hand. > > That's good only for reference part. All other parts are largely > outdated, incomplete, lack tutorials and examples. 
That > happens, because docs are written by coders the same way > as code, and not by users for users. There are plenty of alternatives around in the form of books, websites with tutorials, videos, podcasts, Q&As, then there are the cookbook, the topic guides, blogs, etc. if you don't like to use the official documentation. https://wiki.python.org/moin/Documentation I'm sure some of those don't require you to sign a contrib agreement, so you can contribute there. Or you can set up your own website for this purpose. Plenty of options, I'd say, for someone who wants to contribute something. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 22 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/
From ericsnowcurrently at gmail.com Fri Nov 22 19:20:43 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 22 Nov 2013 11:20:43 -0700 Subject: [Python-ideas] making a module callable In-Reply-To: References: <528D7287.7040305@stoneleaf.us> Message-ID: On Fri, Nov 22, 2013 at 10:01 AM, Michael Foord wrote: > On 22 November 2013 16:45, Gregory P. Smith wrote: >> Agreed, formalizing how to do the replacement trick sounds good. It'd be >> ideal for a module to not need to know its own name within the code doing >> the replacement and for it to not need to reimplement common interface bits >> in a replacement class so that it quacks like a module. >> > Well, it just has to use __name__. Any formalisation that doesn't get passed > the name is going to have to use frame trickery to get the name which is > just smelly. With a special-case for __name__ == "__main__" of course. :) -eric
From greg at krypto.org Fri Nov 22 21:44:25 2013 From: greg at krypto.org (Gregory P. Smith) Date: Fri, 22 Nov 2013 12:44:25 -0800 Subject: [Python-ideas] making a module callable In-Reply-To: References: <528D7287.7040305@stoneleaf.us> Message-ID: It'd be nice to formalize a way to get rid of the __name__ == '__main__' idiom as well in the long long run. Sure everyone's editor types that for them now but it's still a wart. Anyways, digressing... ;) -- blame half the typos on my phone. -------------- next part -------------- An HTML attachment was scrubbed... URL:
From flying-sheep at web.de Fri Nov 22 22:02:56 2013 From: flying-sheep at web.de (Philipp A.) Date: Fri, 22 Nov 2013 22:02:56 +0100 Subject: [Python-ideas] making a module callable In-Reply-To: References: <528D7287.7040305@stoneleaf.us> Message-ID: we're all accustomed to it, but objectively, it's horribly implicit and unobvious. i'd be all for a decorator: #@mainfunction? or just @main? maybe @entrypoint? @mainfunction def main(args): assert args == sys.argv[1:] 2013/11/22 Gregory P. Smith > It'd be nice to formalize a way to get rid of the __name__ == '__main__' > idiom as well in the long long run. Sure everyone's editor types that for > them now but it's still a wart. Anyways, digressing... ;) > > -- > blame half the typos on my phone.
> > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From haoyi.sg at gmail.com Fri Nov 22 22:06:50 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Fri, 22 Nov 2013 13:06:50 -0800 Subject: [Python-ideas] making a module callable In-Reply-To: References: <528D7287.7040305@stoneleaf.us> Message-ID: > we're all accustomed to it, but objectively, it's horribly implicit and unobvious. +100. On Fri, Nov 22, 2013 at 1:02 PM, Philipp A. wrote: > we're all accustomed to it, but objectively, it's horribly implicit and > unobvious. > > i'd be all for a decorator: > > #@mainfunction? or just @main? maybe @entrypoint? > @mainfunction def main(args): > assert args == sys.argv[1:] > > > > 2013/11/22 Gregory P. Smith > >> It'd be nice to formalize a way to get rid of the __name__ == >> '__main__' idiom as well in the long long run. Sure everyone's editor types >> that for >> them now but it's still a wart. Anyways, digressing... ;) >> >> -- >> blame half the typos on my phone. >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> https://mail.python.org/mailman/listinfo/python-ideas >> >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From solipsis at pitrou.net Fri Nov 22 22:14:15 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 22 Nov 2013 22:14:15 +0100 Subject: [Python-ideas] making a module callable References: <528D7287.7040305@stoneleaf.us> Message-ID: <20131122221415.65093fe1@fsol> On Fri, 22 Nov 2013 22:02:56 +0100 "Philipp A." wrote: > we're all accustomed to it, but objectively, it's horribly implicit and > unobvious. It's funny, when I first learned Python, I actually found it quite simple and elegant (leveraging the power of built-in introspection metadata). Regards Antoine.
From ericsnowcurrently at gmail.com Fri Nov 22 22:40:40 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 22 Nov 2013 14:40:40 -0700 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) Message-ID: On Fri, Nov 22, 2013 at 1:44 PM, Gregory P. Smith wrote: > It'd be nice to formalize a way to get rid of the __name__ == '__main__' > idiom as well in the long long run. Sure everyone's editor types that for > them now but it's still a wart. Anyways, digressing... ;) This has come up before and is the subject of several PEPs. [1][2] The current idiom doesn't bother me too much as I try not to have files that are both scripts and modules. However, Python doesn't make the distinction all that clear nor does it do much to encourage people to keep the two separate. I'd prefer improvements in both those instead, but haven't had the time for any concrete proposal. FWIW, aside from the idiom there are other complications that arise from a module that also gets loaded in __main__ (run as a script). See PEP 395 [3].
-erc [1] http://www.python.org/dev/peps/pep-0299/ [2] http://www.python.org/dev/peps/pep-3122/ [3] http://www.python.org/dev/peps/pep-0395/ (sort of related) From mertz at gnosis.cx Fri Nov 22 23:02:27 2013 From: mertz at gnosis.cx (David Mertz) Date: Fri, 22 Nov 2013 14:02:27 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: Message-ID: I'm not in love with the *spelling* of " if __name__=='__main__': ", but I very frequently use the overall pattern. Much--or even most--of the time when I write a module, I like to allow it to either do a minimal case of its basic functionality and/or have the module run some basic unit tests as a quick check against breakage. So in contrast to Eric Snow, I try *to* make my files both scripts and modules. I know this isn't the only possible approach, but I don't think it's bad or uncommon. On Fri, Nov 22, 2013 at 1:40 PM, Eric Snow wrote: > On Fri, Nov 22, 2013 at 1:44 PM, Gregory P. Smith wrote: > > It'd be nice to formalize a way to get rid of the __name__ == '__main__' > > idiom as well in the long long run. Sure everyone's editor types that for > > them now but it's still a wart. Anyways, digressing... ;) > > This has come up before and is the subject of several PEPS. [1][2] > The current idiom doesn't bother me too much as I try not to have > files that are both scripts and modules. However, Python doesn't make > the distinction all that clear nor does it do much to encourage people > to keep the two separate. I'd prefer improvements in both those > instead, but haven't had the time for any concrete proposal. > > FWIW, aside from the idiom there are other complications that arise > from a module that also gets loaded in __main__ (run as a script). > See PEP 395 [3]. > > -erc > > > [1] http://www.python.org/dev/peps/pep-0299/ > [2] http://www.python.org/dev/peps/pep-3122/ > [3] http://www.python.org/dev/peps/pep-0395/ (sort of related) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th. -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Fri Nov 22 23:19:52 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 22 Nov 2013 23:19:52 +0100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) References: Message-ID: <20131122231952.30dc0a04@fsol> On Fri, 22 Nov 2013 14:02:27 -0800 David Mertz wrote: > I'm not in love with the *spelling* of " if __name__=='__main__': ", but I > very frequently use the overall pattern. > > Much--or even most--of the time when I write a module, I like to allow it > to either do a minimal case of its basic functionality and/or have the > module run some basic unit tests as a quick check against breakage. So in > contrast to Eric Snow, I try *to* make my files both scripts and modules. > I know this isn't the only possible approach, but I don't think it's bad > or uncommon. I agree with this, and we actually use it quite a bit in the stdlib (try e.g. "python -m zipfile -h"). Regards Antoine. 
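To make the pattern concrete, here is a minimal sketch of such a dual-use file -- importable as a module, and runnable as a script (the file name and function names are invented for illustration, not taken from the stdlib):

    # greet.py -- "import greet" for the functions, or run
    # "python greet.py NAME" / "python -m greet NAME" as a script.
    import sys

    def greeting(name):
        return "Hello, %s!" % name

    def main(argv):
        # argv is expected to be sys.argv[1:], i.e. without the program name
        for name in argv:
            print(greeting(name))

    if __name__ == "__main__":
        main(sys.argv[1:])

Imported, the file only defines greeting() and main(); run directly, the guard at the bottom fires and it acts as a command-line tool, which is the same mechanism "python -m zipfile -h" relies on.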
From ericsnowcurrently at gmail.com Fri Nov 22 23:27:58 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 22 Nov 2013 15:27:58 -0700 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: Message-ID: On Fri, Nov 22, 2013 at 3:02 PM, David Mertz wrote: > I'm not in love with the *spelling* of " if __name__=='__main__': ", but I > very frequently use the overall pattern. > > Much--or even most--of the time when I write a module, I like to allow it to > either do a minimal case of its basic functionality and/or have the module > run some basic unit tests as a quick check against breakage. So in contrast > to Eric Snow, I try *to* make my files both scripts and modules. I know > this isn't the only possible approach, but I don't think it's bad or > uncommon. You're right and I think it's a good pattern too. That is something we do in the stdlib (and increasingly so). It slipped my mind. I've also seen the idiom used for initiating tests (not that I necessarily condone that practice), though less so in large projects. It would be nice if we could address the issues outlined in PEP 395. One nice approach would be to first import the module separately, copy the namespace into __main__, and then look for some special function (in the module) like __main__() and run it. That function would also be available to use programmatically. That's pretty similar to the PEPs I mentioned before. Who knows. Maybe the time has come for the idea. -eric From tjreedy at udel.edu Fri Nov 22 23:39:20 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 22 Nov 2013 17:39:20 -0500 Subject: [Python-ideas] CC0 for Python Documentation In-Reply-To: <20131122160221.79690e99@fsol> References: <528F57B6.5010302@egenix.com> <20131122160221.79690e99@fsol> Message-ID: On 11/22/2013 10:02 AM, Antoine Pitrou wrote: >> Techtonic wrote: >>> http://docs.python.org/3/copyright.html - PSF is the sole >>> owner of the docs False. It lists 4 legal entities as the owner of the combined software and documentation. "Python and this documentation is: Copyright ? 2001-2013 Python Software Foundation. All rights reserved. Copyright ? 2000 BeOpen.com. All rights reserved. Copyright ? 1995-2000 Corporation for National Research Initiatives. All rights reserved. Copyright ? 1991-1995 Stichting Mathematisch Centrum. All rights reserved." Each has copyright on the contributions made during the listed periods. PSF only owns contributions since 2001, much as it wishes otherwise. I believe that the previous sponsors have be asked but have declined to assigned their copyrights to PSF. If they were to, http://docs.python.org/3/license.html could be simplified a bit. > Anatoly has a point, though: why does the doc claim Python is > "copyright PSF" while that is not true - and the LICENSE file doesn't > make any such claim? Anatoly misstated the copyright claim, certainly with respect to 'Python'. Both files say the same thing, with the same dates. The license file just gives more details as to versions. Perhaps he was confused by the web page copyright notice at the bottom right of *every* page, not just the Copyright page. ? Copyright 1990-2013, Python Software Foundation. I believe that listing just PSF is ok because the formatted web pages are derived works using .rst and Sphinx, starting with 2.6. In any case, the only people with a right to complain are the other three underlying copyright holders. 
The beginning date should perhaps be 1991 instead of 1990, though there might also be a good reason for that related to public notices Guido posted before the first release. But I see no reason for *us* to fuss about that. -- Terry Jan Reedy From solipsis at pitrou.net Fri Nov 22 23:45:36 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 22 Nov 2013 23:45:36 +0100 Subject: [Python-ideas] CC0 for Python Documentation References: <528F57B6.5010302@egenix.com> <20131122160221.79690e99@fsol> Message-ID: <20131122234536.4f4b4112@fsol> On Fri, 22 Nov 2013 17:39:20 -0500 Terry Reedy wrote: > [snip] I think it's pointless discussing this on python-ideas, I'm spawning a new thread on the legal SIG. Regards Antoine. From greg.ewing at canterbury.ac.nz Sat Nov 23 00:21:41 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 23 Nov 2013 12:21:41 +1300 Subject: [Python-ideas] making a module callable In-Reply-To: References: <528D7287.7040305@stoneleaf.us> Message-ID: <528FE705.10506@canterbury.ac.nz> Philipp A. wrote: > > |#@mainfunction? or just @main? maybe @entrypoint? Or to make Java immigrants feel at home, @publicstaticvoid. :-) > @mainfunction > def main(args): > assert args == sys.argv[1:] Now we can have an argument about whether args should include sys.argv[0] or not... -- Greg From greg.ewing at canterbury.ac.nz Sat Nov 23 00:35:39 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 23 Nov 2013 12:35:39 +1300 Subject: [Python-ideas] making a module callable In-Reply-To: References: <528D7287.7040305@stoneleaf.us> Message-ID: <528FEA4B.8050504@canterbury.ac.nz> Eric Snow wrote: > With a special-case for __name__ == "__main__" of course. :) I don't see how that's a special case. The module really is called "__main__". That's the name it appears under in sys.modules, which is what matters here. -- Greg From rob.cliffe at btinternet.com Sat Nov 23 00:47:57 2013 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Fri, 22 Nov 2013 23:47:57 +0000 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: Message-ID: <528FED2D.6090507@btinternet.com> On 22/11/2013 22:27, Eric Snow wrote: > On Fri, Nov 22, 2013 at 3:02 PM, David Mertz wrote: >> I'm not in love with the *spelling* of " if __name__=='__main__': ", but I >> very frequently use the overall pattern. >> >> Much--or even most--of the time when I write a module, I like to allow it to >> either do a minimal case of its basic functionality and/or have the module >> run some basic unit tests as a quick check against breakage. So in contrast >> to Eric Snow, I try *to* make my files both scripts and modules. I know >> this isn't the only possible approach, but I don't think it's bad or >> uncommon. > You're right and I think it's a good pattern too. That is something > we do in the stdlib (and increasingly so). It slipped my mind. I've > also seen the idiom used for initiating tests (not that I necessarily > condone that practice), though less so in large projects. It would be > nice if we could address the issues outlined in PEP 395. Yes. Having functionality and some test of that functionality in the same module simplifies file organisation/maintenance. The test typically also provides extra documentation of the functionality ("this is how you use it"). 
Rob Cliffe From ericsnowcurrently at gmail.com Sat Nov 23 01:26:33 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 22 Nov 2013 17:26:33 -0700 Subject: [Python-ideas] making a module callable In-Reply-To: <528FEA4B.8050504@canterbury.ac.nz> References: <528D7287.7040305@stoneleaf.us> <528FEA4B.8050504@canterbury.ac.nz> Message-ID: On Fri, Nov 22, 2013 at 4:35 PM, Greg Ewing wrote: > Eric Snow wrote: >> >> With a special-case for __name__ == "__main__" of course. :) > > > I don't see how that's a special case. The module really is > called "__main__". That's the name it appears under in > sys.modules, which is what matters here. It's relevant because no one should be replacing __main__ in sys.modules. :) -eric From mcquillan.sean at gmail.com Sat Nov 23 02:45:22 2013 From: mcquillan.sean at gmail.com (Sean McQuillan) Date: Fri, 22 Nov 2013 17:45:22 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <528FED2D.6090507@btinternet.com> References: <528FED2D.6090507@btinternet.com> Message-ID: Adding a mainfunction decorator was mentioned in the other thread, seemed interesting so I coded up a simple example to see how it works. https://github.com/objcode/mainfunction/blob/master/mainfunction/mainfunction.py >From the README: @mainfunction def main(): print "Hello, World." vs: if __name__ == '__main__': print "Hello, World." After playing with it briefly, I'm not sure it's a clear spelling win as decorators are a fairly advanced topic to a new programmer. If statements are one of the first programming constructs a new programmer learns - and they don't leave a "magic taste". Thoughts? On Fri, Nov 22, 2013 at 3:47 PM, Rob Cliffe wrote: > > On 22/11/2013 22:27, Eric Snow wrote: > >> On Fri, Nov 22, 2013 at 3:02 PM, David Mertz wrote: >> >>> I'm not in love with the *spelling* of " if __name__=='__main__': ", but >>> I >>> very frequently use the overall pattern. >>> >>> Much--or even most--of the time when I write a module, I like to allow >>> it to >>> either do a minimal case of its basic functionality and/or have the >>> module >>> run some basic unit tests as a quick check against breakage. So in >>> contrast >>> to Eric Snow, I try *to* make my files both scripts and modules. I know >>> this isn't the only possible approach, but I don't think it's bad or >>> uncommon. >>> >> You're right and I think it's a good pattern too. That is something >> we do in the stdlib (and increasingly so). It slipped my mind. I've >> also seen the idiom used for initiating tests (not that I necessarily >> condone that practice), though less so in large projects. It would be >> nice if we could address the issues outlined in PEP 395. >> > Yes. Having functionality and some test of that functionality in the same > module simplifies file organisation/maintenance. The test typically also > provides extra documentation of the functionality ("this is how you use > it"). > > Rob Cliffe > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -- Sean McQuillan 415.990.0854 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From solipsis at pitrou.net Sat Nov 23 02:51:53 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 23 Nov 2013 02:51:53 +0100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) References: <528FED2D.6090507@btinternet.com> Message-ID: <20131123025153.747af182@fsol> On Fri, 22 Nov 2013 17:45:22 -0800 Sean McQuillan wrote: > Adding a mainfunction decorator was mentioned in the other thread, seemed > interesting so I coded up a simple example to see how it works. > > https://github.com/objcode/mainfunction/blob/master/mainfunction/mainfunction.py > > From the README: > > @mainfunction > def main(): > print "Hello, World." > > vs: > > if __name__ == '__main__': > print "Hello, World." > > > After playing with it briefly, I'm not sure it's a clear spelling win as > decorators are a fairly advanced topic to a new programmer. If statements > are one of the first programming constructs a new programmer learns - and > they don't leave a "magic taste". Indeed, I think the current idiom is much better. Regards Antoine. From steve at pearwood.info Sat Nov 23 05:06:09 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 23 Nov 2013 15:06:09 +1100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> Message-ID: <20131123040608.GN2085@ando> On Fri, Nov 22, 2013 at 05:45:22PM -0800, Sean McQuillan wrote: > Adding a mainfunction decorator was mentioned in the other thread, seemed > interesting so I coded up a simple example to see how it works. > > https://github.com/objcode/mainfunction/blob/master/mainfunction/mainfunction.py I haven't actually tried it, but by reading the code I don't think that works the way I expect a main function to work. Your decorator simply does a bit of book-keeping, then immediately calls the main function. So it doesn't so much define a main function as just execute it straight away, which means it works with simple cases like this: @mainfunction def main(): print "Hello, World." but not for cases like this: @mainfunction def main(): do_stuff("Hello, World!") def do_stuff(msg): print msg since do_stuff doesn't exist at the time main() is executed. In contrast, the standard idiom is typically the very last thing at the bottom of the file, and so by the time it is called all the support functions are in place. My aesthetic sense tells me that there are two reasonable approaches here: 1) The current idiom, with an explicit "if __name__" test. This is the most flexible. 2) An implicit main function, which I think should be spelled __main__ to emphasise that it is special. This, I think, would require support from the compiler. (At least, I can't see any easy way to bootstrap it without using the "if __name__" test above.) python some_file.py python -m some_file would both execute the file as normal. The only addition would be, after the file has been run the compiler checks whether there is a name "__main__" in the global namespace, and if so, calls that __main__ object with sys.argv as argument. This second is analogous to the situation with packages. If a package has a __main__.py file, it is run when you call python -m package This second approach is purely a convenience for writing scripts, it doesn't given you anything you can't already do with the current idiom, but I think that defining a special main function def __main__(argv): ... 
is a cleaner (but less flexible) idiom that the current if __name__ business, and simpler for people to learn. -- Steven From steve at pearwood.info Sat Nov 23 05:07:29 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 23 Nov 2013 15:07:29 +1100 Subject: [Python-ideas] making a module callable In-Reply-To: References: <528D7287.7040305@stoneleaf.us> <528FEA4B.8050504@canterbury.ac.nz> Message-ID: <20131123040729.GO2085@ando> On Fri, Nov 22, 2013 at 05:26:33PM -0700, Eric Snow wrote: > It's relevant because no one should be replacing __main__ in sys.modules. :) I see your smiley, which confuses me. Are you serious that replacing __main__ is a bad idea? If so, can you explain why? -- Steven From steve at pearwood.info Sat Nov 23 05:14:49 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 23 Nov 2013 15:14:49 +1100 Subject: [Python-ideas] making a module callable In-Reply-To: References: <528D7287.7040305@stoneleaf.us> Message-ID: <20131123041449.GP2085@ando> On Fri, Nov 22, 2013 at 10:02:56PM +0100, Philipp A. wrote: > we?re all accustomed to it, but objectively, it?s horribly implicit and > unobvious. Certainly you are correct that it is unobvious, but the "if __name__" idiom is anything but implicit. It's the opposite, you are *explicitly* testing whether the module is being run as the main module (__name__ == "__main__") and if so, you explicitly run some code. Of course you can also run code only when *not* the main module: if __name__ != '__main__': print "Module is being imported" else: print "Module is being executed" And you aren't limited to a single "main function", you can dot such tests all throughout your code, including inside functions. Aside: perhaps it would have been better to have an explicit ismain() function that returns True when running as the main module, that's more discoverable. -- Steven From steve at pearwood.info Sat Nov 23 05:16:16 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 23 Nov 2013 15:16:16 +1100 Subject: [Python-ideas] making a module callable In-Reply-To: <20131122221415.65093fe1@fsol> References: <528D7287.7040305@stoneleaf.us> <20131122221415.65093fe1@fsol> Message-ID: <20131123041616.GQ2085@ando> On Fri, Nov 22, 2013 at 10:14:15PM +0100, Antoine Pitrou wrote: > On Fri, 22 Nov 2013 22:02:56 +0100 > "Philipp A." wrote: > > we?re all accustomed to it, but objectively, it?s horribly implicit and > > unobvious. > > It's funny, when I first learned Python, I actually found it quite > simple and elegant (leveraging the power of built-in introspection > metadata). +1 It is simple and elegant. Just not obvious and easily discoverable. -- Steven From steve at pearwood.info Sat Nov 23 05:21:00 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 23 Nov 2013 15:21:00 +1100 Subject: [Python-ideas] making a module callable In-Reply-To: <528FE705.10506@canterbury.ac.nz> References: <528D7287.7040305@stoneleaf.us> <528FE705.10506@canterbury.ac.nz> Message-ID: <20131123042100.GR2085@ando> On Sat, Nov 23, 2013 at 12:21:41PM +1300, Greg Ewing wrote: > >@mainfunction > >def main(args): > > assert args == sys.argv[1:] > > Now we can have an argument about whether args should include > sys.argv[0] or not... Of course it should :-) I'm serious, by the way. It's a nice Unix trick to have a single executable do different things depending on what name it is called by. Inspecting argv[0] lets you do that. 
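For illustration, a minimal runnable sketch of that trick (the names
"shout" and "whisper" are invented):

    import os
    import sys

    # One file, two behaviours, chosen by the name it was invoked under
    # (e.g. symlinks "shout" and "whisper" both pointing at this script).
    name = os.path.basename(sys.argv[0])
    text = " ".join(sys.argv[1:]) or "hello"
    if name.startswith("shout"):
        print(text.upper())
    else:
        print(text.lower())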
-- Steven From ericsnowcurrently at gmail.com Sat Nov 23 06:48:31 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Fri, 22 Nov 2013 22:48:31 -0700 Subject: [Python-ideas] making a module callable In-Reply-To: <20131123040729.GO2085@ando> References: <528D7287.7040305@stoneleaf.us> <528FEA4B.8050504@canterbury.ac.nz> <20131123040729.GO2085@ando> Message-ID: On Fri, Nov 22, 2013 at 9:07 PM, Steven D'Aprano wrote: > On Fri, Nov 22, 2013 at 05:26:33PM -0700, Eric Snow wrote: > >> It's relevant because no one should be replacing __main__ in sys.modules. :) > > I see your smiley, which confuses me. Are you serious that replacing > __main__ is a bad idea? If so, can you explain why? I do think it's a bad idea. It would be like replacing a builtin module in sys.modules, which is inadvisable (particularly key ones like sys). Like builtins, __main__ is a special module, one created during interpreter startup. It plays a special part in the REPL. Various parts of the stdlib have special-casing for __main__, which could be affected by replacement. Replacing __main__ in sys.modules is, to me, just as inadvisable as replacing sys. The catch is that a script is exec'ed into the __main__ module's namespace, so during execution (nearly) all the import-related attributes relate to __main__. In contrast, the equivalent module from the same file would be loaded into its own namespace, with its own import-related attributes, and cached independently at sys.modules[module_name]. This duality causes all sorts of grief (PEP 395 is a response to some of the pain points). A key hangup is that __name__ is different depending on run-as-script or imported-as-module. That brings us back to the idea of a more formal replace-module-in-sys-modules API. Any solution to that which uses __name__ to determine the module's name has to take into account that it may have been run as a script (where __name__ will be "__main__"). If we simply used __name__ staight up we might end up replacing __main__ in sys.modules, which I suggest is a bad idea. Hence the point of special-casing __main__. Sorry I wasn't clear. Hopefully this was more so. -eric From stephen at xemacs.org Sat Nov 23 12:17:34 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Sat, 23 Nov 2013 20:17:34 +0900 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <20131122021636.GG2085@ando> Message-ID: <87y54fe59t.fsf@uwakimon.sk.tsukuba.ac.jp> Pere??ni Peter writes: > Anything which is tree-like (e.g. mentioned DSLs, API wrappers > around complicated objects (e.g. xml), configurations, gnuplot > example, ...) can be difficult to instantiate using normal syntax: > tree = Tree() > ? ..newChild(value=5) > ? ? ? ..newChild(value=3) > ? ? ? ..newChild() > ? ..newChild(color="green") This particular application just screams for positional parameters, though: tree = Tree( Node( Node(value=3), ? ? ? ?Node(), value=5), ? ?Node(color="green")) (YMMV about positioning of the parens, to me this is the one that says "TREE!! and here are the Nodes" most clearly.) From techtonik at gmail.com Sat Nov 23 12:18:02 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Sat, 23 Nov 2013 14:18:02 +0300 Subject: [Python-ideas] CC0 for Python Documentation In-Reply-To: <20131122234536.4f4b4112@fsol> References: <528F57B6.5010302@egenix.com> <20131122160221.79690e99@fsol> <20131122234536.4f4b4112@fsol> Message-ID: Right. 
The thread is moved into discussing of "rightfulness of copyright notice on documentation page", because it is simple, easy and more interesting to discuss as everyone has some kind of different opinion about that. It also doesn't require too much time and effort to think about. What I really would like to see is the discussion of pure idea of Python documentation in CC0, is it good or bad. Current "vendor lock-in" is very much related, but separate question that indeed belongs to legal SIG. Here I expected to see opinions from all people why CC0 is bad or good. So far I haven't see any counter argument against, except one that I may vague interpret both as: 1. it works, so no reason to touch it, or spend time on or even discuss alternatives 2. python doc is for python code, so it's ok that you should sign the CLA to edit them As for the first, I still would like to discuss all points addressed - that's why it is python-ideas, where there ordinary people can talk about things they like and don't like. As for the second. I don't think it is ok, and that's why I wrote the proposal. I made my arguments, and I'd want to put them on scales. -- anatoly t. On Sat, Nov 23, 2013 at 1:45 AM, Antoine Pitrou wrote: > On Fri, 22 Nov 2013 17:39:20 -0500 > Terry Reedy wrote: >> [snip] > > I think it's pointless discussing this on python-ideas, I'm > spawning a new thread on the legal SIG. > > Regards > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas From masklinn at masklinn.net Sat Nov 23 12:53:08 2013 From: masklinn at masklinn.net (Masklinn) Date: Sat, 23 Nov 2013 12:53:08 +0100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: <87y54fe59t.fsf@uwakimon.sk.tsukuba.ac.jp> References: <20131122021636.GG2085@ando> <87y54fe59t.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <466FDA1D-7B4F-4E27-AFD2-1A471DEACEF9@masklinn.net> On 2013-11-23, at 12:17 , Stephen J. Turnbull wrote: > Pere??ni Peter writes: > >> Anything which is tree-like (e.g. mentioned DSLs, API wrappers >> around complicated objects (e.g. xml), configurations, gnuplot >> example, ...) can be difficult to instantiate using normal syntax: > >> tree = Tree() >> ..newChild(value=5) >> ..newChild(value=3) >> ..newChild() >> ..newChild(color="green") > > This particular application just screams for positional parameters, > though: > > tree = Tree( > Node( > Node(value=3), > Node(), > value=5), > Node(color="green")) > > (YMMV about positioning of the parens, to me this is the one that says > "TREE!! and here are the Nodes? most clearly.) Doesn?t work on third-party libraries on which you don?t have control[0] though, and when there are many customisation points having subsequent methods can make for a nicer API than 50 constructor parameters[1]. [0] such as the standard library?s, unless I am mistaken neither minidom nor elementtree allow complete element configuration ? including subtree ? in a single expression. The third-party lxml does to an extent, if one uses the builder APIs rather than the ET ones. 
[1] or the existing API may make it difficult to impossible to add such an extension without breaking From markus at unterwaditzer.net Sat Nov 23 15:40:31 2013 From: markus at unterwaditzer.net (Markus Unterwaditzer) Date: Sat, 23 Nov 2013 15:40:31 +0100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131123040608.GN2085@ando> References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> Message-ID: <9f8728ca-bb97-4a6c-8631-8c4e3431676f@email.android.com> +1 for the second approach, because 1.) It'd be more familiar to programmers coming from C, Java, etc. But I don't see how it would be easier to learn to completely new programmers. 2.) Forces people not to pollute the global namespace by making them write their code inside a function. That's usually a good idea anyways, which is why people already do if __name__ == "__main__": main() -- Markus Steven D'Aprano wrote: >On Fri, Nov 22, 2013 at 05:45:22PM -0800, Sean McQuillan wrote: >> Adding a mainfunction decorator was mentioned in the other thread, >seemed >> interesting so I coded up a simple example to see how it works. >> >> >https://github.com/objcode/mainfunction/blob/master/mainfunction/mainfunction.py > >I haven't actually tried it, but by reading the code I don't think that > >works the way I expect a main function to work. Your decorator simply >does a bit of book-keeping, then immediately calls the main function. >So >it doesn't so much define a main function as just execute it straight >away, which means it works with simple cases like this: > > >@mainfunction >def main(): > print "Hello, World." > > >but not for cases like this: > > >@mainfunction >def main(): > do_stuff("Hello, World!") > >def do_stuff(msg): > print msg > > >since do_stuff doesn't exist at the time main() is executed. In >contrast, the standard idiom is typically the very last thing at the >bottom of the file, and so by the time it is called all the support >functions are in place. > >My aesthetic sense tells me that there are two reasonable approaches >here: > >1) The current idiom, with an explicit "if __name__" test. This is the >most flexible. > >2) An implicit main function, which I think should be spelled __main__ >to emphasise that it is special. This, I think, would require support >from the compiler. (At least, I can't see any easy way to bootstrap it >without using the "if __name__" test above.) > >python some_file.py > >python -m some_file > >would both execute the file as normal. The only addition would be, >after >the file has been run the compiler checks whether there is a name >"__main__" in the global namespace, and if so, calls that __main__ >object with sys.argv as argument. > >This second is analogous to the situation with packages. If a package >has a __main__.py file, it is run when you call > >python -m package > > >This second approach is purely a convenience for writing scripts, it >doesn't given you anything you can't already do with the current idiom, > >but I think that defining a special main function > >def __main__(argv): > ... > > >is a cleaner (but less flexible) idiom that the current if __name__ >business, and simpler for people to learn. > > >-- >Steven >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From techtonik at gmail.com Sat Nov 23 17:14:42 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Sat, 23 Nov 2013 19:14:42 +0300 Subject: [Python-ideas] solution to cross-platform path handling problems Message-ID: I will talk about separating "mount"s and "path" concepts in path handling. On the great talk about writing cross-platform applications back in 2010 there is a good point about Python's cross-platform abstraction to path issues. http://clanmills.com/files/dist/doc/cross_platform.html#python-batteries-included Recent noize around new pathlib and my own experience with os.path made me change my mind that Python has a convenient library for cross-platform path handling. It is much better than dealing with slashed strs (true), but there are still hidden issues (that I can not even summarize, because I don't know what tracker query should I run to get it). While criticizing "pathlib" to see what I dislike about it, I realized that there is a lot of ambiguity in the world of filesystem/resource paths. Every platform-specific path library fails, because from one side people don't know differences between all operating systems, probably because they don't want, don't have time or info. On the other side people need to write cross-platform apps. "pathlib" does a good job by providing PEP with info, but I think that architecturally it doesn't solve the problem of path handling complexity. Syntax sugar - yes, explicit approach - yes, time savings - no, more readable code - "no > yes", code that frees you from thinking how "these three lines" will work on MacOS/Unix/Windows - no. The root of the problem is in traditional "relative" vs "absolute" path approach. Take "Definitions" from PEP 428. """ 1. All paths can have a drive and a root. For POSIX paths, the drive is always empty. 2. A relative path has neither drive nor root. 3. A POSIX path is absolute if it has a root. A Windows path is absolute if it has both a drive and a root. A Windows UNC path (e.g.\\host\share\myfile.txt) always has a drive and a root (here, \\host\share and \, respectively). 4. A path which has either a drive or a root is said to be anchored. Its anchor is the concatenation of the drive and root. Under POSIX, "anchored" is the same as "absolute". """ Good decomposition and problem overview, but hardly a solution or a "correct" representation as I see it. All terminology above can be reduced to just two cross-platform terms: "mount point" and "path". "path" is always relative to "mount point". Either can be missing. "mount point" is system-dependent. 1. All paths may have the mount point. 2. All paths without mount point are relative. 3. Default mount point for POSIX to make path absolute is '/'. Default mount point on Windows is current drive (e.g. 'c:/'), or UNC server address (e.g.\\host\). 4. Any (absolute?) path may be the mount point itself 5. path without mount point is called "relative" I don't know that should be API for that, but I'd be interesting to try it. One of the reasons I want to do this terminology is that semantically I do work more with URL paths than with file system paths and I don't see difference between them. When I move application from www.com site root to some www.com/endpoint, my app doesn't stop working, because it is written to work with any www.com/endpoint - not just with absolute paths that point to site root. I think that is the value that Python can provide to help build apps that are architecturally more "correct" and system-independent. -- anatoly t. 
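A minimal sketch of how pathlib's anchor (per PEP 428) already lines up
with the "mount point" + "path" terminology above (illustrative only, not
a proposed API):

    from pathlib import PurePosixPath, PureWindowsPath

    w = PureWindowsPath(r"\\host\share\myfile.txt")
    print(w.anchor)                   # \\host\share\   -- the "mount point"
    print(w.relative_to(w.anchor))    # myfile.txt      -- the "path"

    p = PurePosixPath("/usr/local/bin/python")
    print(p.anchor)                   # /
    print(p.relative_to(p.anchor))    # usr/local/bin/python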
From haoyi.sg at gmail.com Sat Nov 23 17:53:08 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Sat, 23 Nov 2013 08:53:08 -0800 Subject: [Python-ideas] making a module callable In-Reply-To: References: <528D7287.7040305@stoneleaf.us> <528FEA4B.8050504@canterbury.ac.nz> <20131123040729.GO2085@ando> Message-ID: > Are you serious that replacing __main__ is a bad idea? If so, can you explain why? It also doesn't work, last I tried =( Strange things start breaking when you replace __main__ On Fri, Nov 22, 2013 at 9:48 PM, Eric Snow wrote: > On Fri, Nov 22, 2013 at 9:07 PM, Steven D'Aprano > wrote: > > On Fri, Nov 22, 2013 at 05:26:33PM -0700, Eric Snow wrote: > > > >> It's relevant because no one should be replacing __main__ in > sys.modules. :) > > > > I see your smiley, which confuses me. Are you serious that replacing > > __main__ is a bad idea? If so, can you explain why? > > I do think it's a bad idea. It would be like replacing a builtin > module in sys.modules, which is inadvisable (particularly key ones > like sys). Like builtins, __main__ is a special module, one created > during interpreter startup. It plays a special part in the REPL. > Various parts of the stdlib have special-casing for __main__, which > could be affected by replacement. Replacing __main__ in sys.modules > is, to me, just as inadvisable as replacing sys. > > The catch is that a script is exec'ed into the __main__ module's > namespace, so during execution (nearly) all the import-related > attributes relate to __main__. In contrast, the equivalent module > from the same file would be loaded into its own namespace, with its > own import-related attributes, and cached independently at > sys.modules[module_name]. This duality causes all sorts of grief (PEP > 395 is a response to some of the pain points). A key hangup is that > __name__ is different depending on run-as-script or > imported-as-module. > > That brings us back to the idea of a more formal > replace-module-in-sys-modules API. Any solution to that which uses > __name__ to determine the module's name has to take into account that > it may have been run as a script (where __name__ will be "__main__"). > If we simply used __name__ staight up we might end up replacing > __main__ in sys.modules, which I suggest is a bad idea. Hence the > point of special-casing __main__. > > Sorry I wasn't clear. Hopefully this was more so. > > -eric > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruce at leapyear.org Sat Nov 23 19:00:56 2013 From: bruce at leapyear.org (Bruce Leban) Date: Sat, 23 Nov 2013 10:00:56 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131123040608.GN2085@ando> References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> Message-ID: On Nov 22, 2013 8:06 PM, "Steven D'Aprano" wrote: > def __main__(argv): > ... > > is a cleaner (but less flexible) idiom that the current if __name__ > business, and simpler for people to learn. If I had a time machine I would either do the __main__ function or alternatively: if __main__: .... but alas I can't time travel and changing it will *not* be simpler to learn because people will be seeing both the old idiom and the new idiom for a long time. 
While it's interesting to discuss from the standpoint of what's the best design, it's just not worth changing, IMHO. --- Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Sat Nov 23 20:01:21 2013 From: greg at krypto.org (Gregory P. Smith) Date: Sat, 23 Nov 2013 11:01:21 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> Message-ID: On Sat, Nov 23, 2013 at 10:00 AM, Bruce Leban wrote: > > On Nov 22, 2013 8:06 PM, "Steven D'Aprano" wrote: > > > def __main__(argv): > > ... > > > > is a cleaner (but less flexible) idiom that the current if __name__ > > business, and simpler for people to learn. > > If I had a time machine I would either do the __main__ function or > alternatively: > > if __main__: > .... > > but alas I can't time travel and changing it will *not* be simpler to > learn because people will be seeing both the old idiom and the new idiom > for a long time. While it's interesting to discuss from the standpoint of > what's the best design, it's just not worth changing, IMHO. > That was why I said it'd be for the long long term... If we ever do anything, I like this simple __main__ bool idea best. Decorators are too much unnecessary complexity for the task. -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Sat Nov 23 21:20:55 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 23 Nov 2013 12:20:55 -0800 Subject: [Python-ideas] solution to cross-platform path handling problems In-Reply-To: References: Message-ID: On Nov 23, 2013, at 8:14, anatoly techtonik wrote: > 1. All paths may have the mount point. > 2. All paths without mount point are relative. > 3. Default mount point for POSIX to make path absolute is '/'. Default > mount point on Windows is current drive (e.g. 'c:/'), or UNC server > address (e.g.\\host\). > 4. Any (absolute?) path may be the mount point itself > 5. path without mount point is called "relative" The first problem with this is that there is already an established meaning for "mount point" in the POSIX world that this is very different from Meanwhile, treating \\host\ as a root doesn't work. If you just skim the docs on UNC pathnames, or actually use them for anything, this is obvious. It's the \\host\share\ that's a root. \\host\ is not a usable path, and doesn't refer to anything path-like at the NT objects level, or the SMB/CIFS protocol. You can't .. above the share. You can't treat it as a drive. You can't mount it. (Yes, there are various places, especially in the msvcrt posix-like wrapper functions, where \\host\share\..\othershare works, but that's only because those functions are treating paths as plain strings and ignoring the semantics. The same functions also let you do \\host\..\otherhost\share, so if they imply that the host alone makes a path, they also imply that \\ alone makes a path, so the host still isn't a root.) Also, this completely ignores the problem with pathlib that you're trying to solve: Windows paths don't come in just two forms, relative and absolute; they also have two intermediate forms, D:foo (which is relative to the current working directory on drive D rather than the current drive), and \foo (which is absolute on the current working drive). 
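A quick sketch of how those four forms look through pathlib's pure Windows
paths (expected output in the comments):

    from pathlib import PureWindowsPath

    for s in [r"C:\foo", "C:foo", r"\foo", "foo"]:
        p = PureWindowsPath(s)
        # drive, root and is_absolute() tell the four Windows forms apart
        print(s, p.drive, repr(p.root), p.is_absolute())

    # C:\foo  C:  '\\'  True    drive + root: fully absolute
    # C:foo   C:  ''    False   drive-relative: cwd on drive C
    # \foo        '\\'  False   rooted on the current drive only
    # foo         ''    False   fully relative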
To handle these paths, you need to go beyond the notion of a current working directory and represent the notion Windows actually uses: a current working drive, and a current working directory on each drive. And to deal with the way cd'ing to a UNC path interacts with this... It's too complicated to spell out in one sentence, but if you read the MSDN docs they make it pretty clear. (Or just play around with Internet Explorer's file:// paths, which are Microsoft's answer to how to fit their filesystem into something cross-platform. If you think you can do better than them, you'll probably want to understand what they did and why.) From rosuav at gmail.com Sat Nov 23 23:04:59 2013 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 24 Nov 2013 09:04:59 +1100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> Message-ID: On Sun, Nov 24, 2013 at 5:00 AM, Bruce Leban wrote: > If I had a time machine I would either do the __main__ function or > alternatively: > > if __main__: > .... > > but alas I can't time travel and changing it will *not* be simpler to learn > because people will be seeing both the old idiom and the new idiom for a > long time. While it's interesting to discuss from the standpoint of what's > the best design, it's just not worth changing, IMHO. +1. I quite like the simplicity of that, and it wouldn't even be backward incompatible, but it wouldn't be backported. Just a really crazy idea... Does Python let you go "one level outside" and tinker with the code that imports __main__? I haven't looked into all that mechanism, but I know quite a bit of it is now implemented in Python, so it's theoretically possible... could you, in effect, add a line of code *after* that import that effectively calls __main__.__main__(sys.argv) ? That would do most of what you want. ChrisA From abarnert at yahoo.com Sat Nov 23 23:18:56 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 23 Nov 2013 14:18:56 -0800 (PST) Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: References: <20131122021636.GG2085@ando> Message-ID: <1385245136.56367.YahooMailNeo@web184705.mail.ne1.yahoo.com> The biggest problem with most of the variants of this idea suggested so far is that they refer to things that don't exist--in particular, they treat the assignment statement, the assignment target, and/or the suite as something with a value. None of them have values, or could have values, without some radical change like a way to treat a suite of statements as an expression. So, in the interests of having something to debate, here's an attempt at a concrete, implementable syntax and semantics without any of these problems that I think offers everything that people want that's actually possible: ? ? compound_expression_stmt ::= expression_list ":" suite ? ? compound_assignment_stmt ::= (target_list "=")+ (expression_list | yield_expression) ":" suite The only difference to simple expression statements and assignment statements, as defined in 7.1 and 7.2, is that the suite is executed immediately after evaluating the expression list (or yield expression), before assignment of the value (of the expression list or yield expression, not some other value) to the target lists or writing of the repr to the interactive output. ? ? dot_attribute_ref ::= ".." 
identifier A dot attribute reference in the suite of a compound expression or assignment statement is evaluated exactly like a simple attribute reference, as defined in 6.3.1 and in 7.2, except that the value of the expression list or yield expression is used as the value of the primary. I believe a dot attribute reference outside of could raise a SyntaxError at parse time. If not, it raises a ValueError at runtime, because the dot-value of the current suite is None. I believe this handles all of the examples given so far. It can be nested in the obvious way, without needing any new semantics. Assignment statements with expression lists of more than one expression are useless (you can't mutate a tuple), but not illegal. It could be extended to allow dot-subscripting and dot-slicing trivially, if those are desirable. And it's a very simple change to the syntax and semantics of the language, that doesn't have any radical undesirable consequences. I still don't really like the idea, but it's nice to know that it is actually doable. I'm maybe -.5 instead of -1 if it doesn't require statements having values and magic guessing of which statements in a suite replace a value and which don't and so on. >________________________________ > From: Pere??ni Peter >To: Bruce Leban >Cc: Python-Ideas >Sent: Friday, November 22, 2013 1:32 AM >Subject: Re: [Python-ideas] Dart-like method cascading operator in Python > > > > > > > > >On Fri, Nov 22, 2013 at 10:16 AM, Bruce Leban wrote: > > >>Steven D'Aprano: >> >>>> Perhaps, but that's not relevant to the proposed syntax. The proposed >>syntax has more to do with "then" rather than "there", that is, "the >>last object used", not "the object at location 1234". >>>> For starters, you should explain precisely how the compiler will determine what is the "last object used" in practice. Your examples so far suggest it is based on the last line of source code, but that's tricky. >>Pere??ni Peter: >> >>> I think the reasonable way to define it would be that the cascade operator will act only if the last thing (on the previous line) was an expression. In particular, all statements like "del y, for, while, if-else" followed by a cascade should be a parse error with the exception of the assignment which is a statement but we want to allow it. I am not sure about yield -- it can be a bit difficult to agree on the exact semantic there (e.g. should I yield first, then continue with the cascade or should I execute the whole cascade and yield as a final step?) >>I don't think that's well-defined at all. Furthermore last object on *previous* line is less useful than object on line that introduces the suite. Consider: >>obj:: >>??? ..x = b >>??? # is last object b? That's not useful. obj.x? That may not be an object >>??? ..y = 3 >>??? # even more clearly not useful >>??? ..sort() >>??? # this always returns None >>In: >>:: >>??? >Yes, exactly. As usual I did not write my thoughts clearly. By the last line I really meant "last object on the preceding line ending with :: having the smaller indent than the current line"? >I know exactly what that does without reading each intervening line which could change the object being operated on. I can use this with objects that *don't* support chaining which is what makes it useful. I can reorder lines worrying only about method execution order not breaking chaining. >>The traceback complaint is a red herring as it can easily inject the line that has the expression being operated on. 
>>In summary I would only allow the first line to contain an expression or assignment which by definition has a single explicit value. That values is what all statements reference: >>obj:: >>??? ..x = 3 >>??? ..y()? # operates on obj not obj.x or 3 >>??? ..z(). # ditto >>a[index] = foo(...) ::? # don't need to repeat a[index] below or use a temporary name >>??? ..x = 3 >>??? ..y() >>??? ..z() >>? >>>I think the reasonable way to define it would be that the cascade operator will act only if the last thing (on the previous line) was an expression. In particular, all statements like "del y, for, while, if-else" followed by a cascade should be a parse error with the exception of the assignment which is a statement but we want to allow it. I am not sure about yield -- it can be a bit difficult to agree on the exact semantic there (e.g. should I yield first, then continue with the cascade or should I execute the whole cascade and yield as a final step?) >>>? >Actually, thinking about my own ideas -- the bigger problem than yield is the return. I am not sure if things like > > >return list(range(10)) >? ? ..reverse() >? ? ..pop(0) > > >should be allowed or not - having something being executed after the return statement line might be a bit confusing >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >https://mail.python.org/mailman/listinfo/python-ideas > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Sat Nov 23 23:53:15 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 23 Nov 2013 17:53:15 -0500 Subject: [Python-ideas] CC0 for Python Documentation In-Reply-To: References: <528F57B6.5010302@egenix.com> <20131122160221.79690e99@fsol> <20131122234536.4f4b4112@fsol> Message-ID: On 11/23/2013 6:18 AM, anatoly techtonik wrote: > What I really would like to see is the discussion of pure idea of > Python documentation in CC0, is it good or bad. Since the docs are partially owned by 4 different entities, such discussion would be pointless. > Here I expected to see opinions from all people > why CC0 is bad or good. This list is for improving future Python versions. It is not a Creative Commons discussion group. [the current situation] > 1. it works, yes > so no reason to touch it, or spend time on or even > discuss alternatives and pointless for the reason given above. > 2. python doc is for python code, so it's ok that you should sign the > CLA to edit them Yes. Many patches affect both code and docs. Code includes docs in the form of docstrings. There is intentional duplication between docstrings in the code and entries in the docs. They are not separable. > As for the first, I still would like to discuss all points addressed - > that's why it is python-ideas, where there ordinary people can talk > about things they like and don't like. python-list is for discussing changes to Python the language and the CPython implementation and distribution thereof. Other things that people like or not are off-topic. 
-- Terry Jan Reedy
URL: From abarnert at yahoo.com Sun Nov 24 00:25:10 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sat, 23 Nov 2013 15:25:10 -0800 (PST) Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: <1385248521.86684.YahooMailNeo@web184705.mail.ne1.yahoo.com> References: <20131122021636.GG2085@ando> <1385248521.86684.YahooMailNeo@web184705.mail.ne1.yahoo.com> Message-ID: <1385249110.42014.YahooMailNeo@web184704.mail.ne1.yahoo.com> Sorry for the double-post. Yahoo said there was an error sending the message and forced me to log in again? and then I saw it twice in my "sent" box, so I assume it also went twice to the list. >________________________________ > From: Andrew Barnert >To: Pere??ni Peter ; Bruce Leban >Cc: Python-Ideas >Sent: Saturday, November 23, 2013 3:15 PM >Subject: Re: [Python-ideas] Dart-like method cascading operator in Python > > > >The biggest problem with most of the variants of this idea suggested so far is that they refer to things that don't exist--in particular, they treat the assignment statement, the assignment target, and/or the suite as something with a value. None of them have values, or could have values, without some radical change like a way to treat a suite of statements as an expression. > > >So, in the interests of having something to debate, here's an attempt at a concrete, implementable syntax and semantics without any of these problems that I think offers everything that people want that's actually possible: > > >? ? compound_expression_stmt ::= expression_list ":" suite > > >? ? compound_assignment_stmt ::= (target_list "=")+ (expression_list | yield_expression) ":" suite > > >The only difference to simple expression statements and assignment statements, as defined in 7.1 and 7.2, is that the suite is executed immediately after evaluating the expression list (or yield expression), before assignment of the value (of the expression list or yield expression, not some other value) to the target lists or writing of the repr to the interactive output. > > >? ? dot_attribute_ref ::= ".." identifier > > >A dot attribute reference in the suite of a compound expression or assignment statement is evaluated exactly like a simple attribute reference, as defined in 6.3.1 and in 7.2, except that the value of the expression list or yield expression is used as the value of the primary. > > >I believe a dot attribute reference outside of could raise a SyntaxError at parse time. If not, it raises a ValueError at runtime, because the dot-value of the current suite is None. > > >I believe this handles all of the examples given so far. It can be nested in the obvious way, without needing any new semantics. Assignment statements with expression lists of more than one expression are useless (you can't mutate a tuple), but not illegal. It could be extended to allow dot-subscripting and dot-slicing trivially, if those are desirable. And it's a very simple change to the syntax and semantics of the language, that doesn't have any radical undesirable consequences. > > >I still don't really like the idea, but it's nice to know that it is actually doable. I'm maybe -.5 instead of -1 if it doesn't require statements having values and magic guessing of which statements in a suite replace a value and which don't and so on. 
> > > >>________________________________ >> From: Pere??ni Peter >>To: Bruce Leban >>Cc: Python-Ideas >>Sent: Friday, November 22, 2013 1:32 AM >>Subject: Re: [Python-ideas] Dart-like method cascading operator in Python >> >> >> >> >> >> >> >> >>On Fri, Nov 22, 2013 at 10:16 AM, Bruce Leban wrote: >> >> >>>Steven D'Aprano: >>> >>>>> Perhaps, but that's not relevant to the proposed syntax. The proposed >>>syntax has more to do with "then" rather than "there", that is, "the >>>last object used", not "the object at location 1234". >>>>> For starters, you should explain precisely how the compiler will determine what is the "last object used" in practice. Your examples so far suggest it is based on the last line of source code, but that's tricky. >>>Pere??ni Peter: >>> >>>> I think the reasonable way to define it would be that the cascade operator will act only if the last thing (on the previous line) was an expression. In particular, all statements like "del y, for, while, if-else" followed by a cascade should be a parse error with the exception of the assignment which is a statement but we want to allow it. I am not sure about yield -- it can be a bit difficult to agree on the exact semantic there (e.g. should I yield first, then continue with the cascade or should I execute the whole cascade and yield as a final step?) >>>I don't think that's well-defined at all. Furthermore last object on *previous* line is less useful than object on line that introduces the suite. Consider: >>>obj:: >>>??? ..x = b >>>??? # is last object b? That's not useful. obj.x? That may not be an object >>>??? ..y = 3 >>>??? # even more clearly not useful >>>??? ..sort() >>>??? # this always returns None >>>In: >>>:: >>>??? >>Yes, exactly. As usual I did not write my thoughts clearly. By the last line I really meant "last object on the preceding line ending with :: having the smaller indent than the current line"? >>I know exactly what that does without reading each intervening line which could change the object being operated on. I can use this with objects that *don't* support chaining which is what makes it useful. I can reorder lines worrying only about method execution order not breaking chaining. >>>The traceback complaint is a red herring as it can easily inject the line that has the expression being operated on. >>>In summary I would only allow the first line to contain an expression or assignment which by definition has a single explicit value. That values is what all statements reference: >>>obj:: >>>??? ..x = 3 >>>??? ..y()? # operates on obj not obj.x or 3 >>>??? ..z(). # ditto >>>a[index] = foo(...) ::? # don't need to repeat a[index] below or use a temporary name >>>??? ..x = 3 >>>??? ..y() >>>??? ..z() >>>? >>>>I think the reasonable way to define it would be that the cascade operator will act only if the last thing (on the previous line) was an expression. In particular, all statements like "del y, for, while, if-else" followed by a cascade should be a parse error with the exception of the assignment which is a statement but we want to allow it. I am not sure about yield -- it can be a bit difficult to agree on the exact semantic there (e.g. should I yield first, then continue with the cascade or should I execute the whole cascade and yield as a final step?) >>>>? >>Actually, thinking about my own ideas -- the bigger problem than yield is the return. I am not sure if things like >> >> >>return list(range(10)) >>? ? ..reverse() >>? ? 
..pop(0) >> >> >>should be allowed or not - having something being executed after the return statement line might be a bit confusing >>_______________________________________________ >>Python-ideas mailing list >>Python-ideas at python.org >>https://mail.python.org/mailman/listinfo/python-ideas >> >> >> >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >https://mail.python.org/mailman/listinfo/python-ideas > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Sun Nov 24 01:37:20 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 24 Nov 2013 11:37:20 +1100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> Message-ID: <20131124003720.GU2085@ando> On Sun, Nov 24, 2013 at 09:04:59AM +1100, Chris Angelico wrote: > Just a really crazy idea... Does Python let you go "one level outside" > and tinker with the code that imports __main__? I haven't looked into > all that mechanism, but I know quite a bit of it is now implemented in > Python, so it's theoretically possible... could you, in effect, add a > line of code *after* that import that effectively calls > __main__.__main__(sys.argv) ? That would do most of what you want. It sounds like you're describing an import hook, although such things are completely opaque to me. I know they exist, but I've got no idea how they work or what they can do. -- Steven From steve at pearwood.info Sun Nov 24 01:40:37 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 24 Nov 2013 11:40:37 +1100 Subject: [Python-ideas] Dart-like method cascading operator in Python In-Reply-To: <1385245136.56367.YahooMailNeo@web184705.mail.ne1.yahoo.com> References: <20131122021636.GG2085@ando> <1385245136.56367.YahooMailNeo@web184705.mail.ne1.yahoo.com> Message-ID: <20131124004037.GV2085@ando> On Sat, Nov 23, 2013 at 02:18:56PM -0800, Andrew Barnert wrote: > So, in the interests of having something to debate, here's an attempt > at a concrete, implementable syntax and semantics Thank you! -- Steven From haoyi.sg at gmail.com Sun Nov 24 04:07:43 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Sat, 23 Nov 2013 19:07:43 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131124003720.GU2085@ando> References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> Message-ID: Import hooks don't work with __main__ =( I spent a while trying to get it to work when working on MacroPy, to no avail. On Sat, Nov 23, 2013 at 4:37 PM, Steven D'Aprano wrote: > On Sun, Nov 24, 2013 at 09:04:59AM +1100, Chris Angelico wrote: > > > Just a really crazy idea... Does Python let you go "one level outside" > > and tinker with the code that imports __main__? I haven't looked into > > all that mechanism, but I know quite a bit of it is now implemented in > > Python, so it's theoretically possible... could you, in effect, add a > > line of code *after* that import that effectively calls > > __main__.__main__(sys.argv) ? That would do most of what you want. > > It sounds like you're describing an import hook, although such things > are completely opaque to me. I know they exist, but I've got no idea how > they work or what they can do. 
> > > -- > Steven > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sun Nov 24 04:31:30 2013 From: rosuav at gmail.com (Chris Angelico) Date: Sun, 24 Nov 2013 14:31:30 +1100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131124003720.GU2085@ando> References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> Message-ID: On Sun, Nov 24, 2013 at 11:37 AM, Steven D'Aprano wrote: > On Sun, Nov 24, 2013 at 09:04:59AM +1100, Chris Angelico wrote: > >> Just a really crazy idea... Does Python let you go "one level outside" >> and tinker with the code that imports __main__? I haven't looked into >> all that mechanism, but I know quite a bit of it is now implemented in >> Python, so it's theoretically possible... could you, in effect, add a >> line of code *after* that import that effectively calls >> __main__.__main__(sys.argv) ? That would do most of what you want. > > It sounds like you're describing an import hook, although such things > are completely opaque to me. I know they exist, but I've got no idea how > they work or what they can do. I was more trying to get to the surrounding code. When a binary is executed, you fork a new process and exec it (on Unix). When a Python script is executed, something somewhere effectively goes: sys.argv = argv[1:] # trim off the 'python' and keep the rest import argv[1] as __main__ sys.exit() If that code exists in Python somewhere, or if you can in some way tinker with it, it would be possible to insert a call: import argv[1] as __main__ try: main = __main__.__main__ except NameError: main = lambda: None main() sys.exit() or similar. I could easily spin up a little C program that embeds Python, and then tinker with the exact startup sequence; is there a convenient way to do that in Python itself? Of course, it'd always be possible to rig something that gets invoked as: $ python3 x.py my_script_file.py but I'd rather replace the current startup code rather than augment it with another layer of indirection (and another file of code in the current directory). This would be a convenient way to experiment with theories like this, without fundamentally changing anything. What I'm thinking of here is how Pike (yeah, borrowing ideas again!) allows you to choose a different master; the default master does certain setups and then calls on your code, and by changing the master you can tinker with that. Normally you wouldn't need to, but it does make certain types of experimentation very convenient. ChrisA From greg.ewing at canterbury.ac.nz Sun Nov 24 04:35:20 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 24 Nov 2013 16:35:20 +1300 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> Message-ID: <529173F8.4070006@canterbury.ac.nz> Haoyi Li wrote: > Import hooks don't work with __main__ =( I spent a while trying to get > it to work when working on MacroPy, to no avail. It's hard to see how they could. How would you *install* an import hook before __main__ got imported? 
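A rough sketch of the "python3 x.py my_script_file.py" wrapper idea
mentioned above, using runpy (the file name x.py and the __main__()
convention are assumptions from this thread, not an existing API):

    # x.py -- run a script the way the interpreter would, then call the
    # __main__() function it defined, if any, with the remaining arguments.
    import runpy
    import sys

    target = sys.argv[1]
    sys.argv = sys.argv[1:]          # hide the wrapper from the script
    namespace = runpy.run_path(target, run_name="__main__")
    main = namespace.get("__main__")
    if callable(main):
        sys.exit(main(sys.argv))
    sys.exit(0)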
-- Greg From haoyi.sg at gmail.com Sun Nov 24 04:44:41 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Sat, 23 Nov 2013 19:44:41 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <529173F8.4070006@canterbury.ac.nz> References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> Message-ID: > It's hard to see how they could. How would you *install* an import hook before __main__ got imported? By doing crazy things like trying to reload __main__, or by deleting __main__ from the system modules and trying to re-import it, or by using introspection to load the file containing __main__ and stuffing it into sys.modules, or doing other nasty things. It's hard to see how any of these would work. I just wanted to confirm "they don't". On Sat, Nov 23, 2013 at 7:35 PM, Greg Ewing wrote: > Haoyi Li wrote: > >> Import hooks don't work with __main__ =( I spent a while trying to get it >> to work when working on MacroPy, to no avail. >> > > It's hard to see how they could. How would you *install* > an import hook before __main__ got imported? > > -- > Greg > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg at krypto.org Sun Nov 24 07:25:40 2013 From: greg at krypto.org (Gregory P. Smith) Date: Sat, 23 Nov 2013 22:25:40 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <529173F8.4070006@canterbury.ac.nz> References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> Message-ID: On Sat, Nov 23, 2013 at 7:35 PM, Greg Ewing wrote: > Haoyi Li wrote: > >> Import hooks don't work with __main__ =( I spent a while trying to get it >> to work when working on MacroPy, to no avail. >> > > It's hard to see how they could. How would you *install* > an import hook before __main__ got imported? *Don't do this*... but you can abuse sitecustomize: http://docs.python.org/3.4/library/site.html -gps -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Sun Nov 24 08:24:20 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 24 Nov 2013 17:24:20 +1000 Subject: [Python-ideas] making a module callable In-Reply-To: References: <528D7287.7040305@stoneleaf.us> <528FEA4B.8050504@canterbury.ac.nz> <20131123040729.GO2085@ando> Message-ID: On 24 November 2013 02:53, Haoyi Li wrote: >> Are you serious that replacing __main__ is a bad idea? If so, can you >> explain why? > > It also doesn't work, last I tried =( Strange things start breaking when you > replace __main__ The main thing that makes __main__ special is that it's a builtin module, but we then use its namespace to run Python code. Various parts of the interpreter assume that __main__ will always be the same module that was initialized during interpreter startup, so they don't have to keep re-initializing it (or checking if it has been replaced). 
It's not quite as intertwined with the interpreter internals as sys, since there's no direct reference to it from the interpreter state, but the case can certainly made that there *should* be such a reference if we're going to assume consistency over the the lifetime of the process. However, while I can't vouch for earlier versions, replacing __main__ in 3.3+ shouldn't cause any major issues, although it does mean certain things may not behave as expected (such as the -i switch and the PYTHONINSPECT option). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ben+python at benfinney.id.au Sun Nov 24 09:07:13 2013 From: ben+python at benfinney.id.au (Ben Finney) Date: Sun, 24 Nov 2013 19:07:13 +1100 Subject: [Python-ideas] making a module callable References: <528D7287.7040305@stoneleaf.us> <528FE705.10506@canterbury.ac.nz> <20131123042100.GR2085@ando> Message-ID: <7wiovite8e.fsf@benfinney.id.au> Steven D'Aprano writes: > I'm serious, by the way. It's a nice Unix trick to have a single > executable do different things depending on what name it is called by. > Inspecting argv[0] lets you do that. Even if the executable only does one thing, it's still good to be able to *rename* the program and not have to change the usage and error messages:: import os import sys progname = os.path.basename(__file__) # ? sys.stdout.write( "{progname}: Couldn't frobnicate the spangule.\n".format( progname=progname)) So, definitely ?sys.argv? needs to continue having all command-line arguments, including the command name used to invoke the program. -- \ ?We must respect the other fellow's religion, but only in the | `\ sense and to the extent that we respect his theory that his | _o__) wife is beautiful and his children smart.? ?Henry L. Mencken | Ben Finney From greg.ewing at canterbury.ac.nz Sun Nov 24 10:19:54 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 24 Nov 2013 22:19:54 +1300 Subject: [Python-ideas] making a module callable In-Reply-To: <7wiovite8e.fsf@benfinney.id.au> References: <528D7287.7040305@stoneleaf.us> <528FE705.10506@canterbury.ac.nz> <20131123042100.GR2085@ando> <7wiovite8e.fsf@benfinney.id.au> Message-ID: <5291C4BA.2020606@canterbury.ac.nz> Ben Finney wrote: > So, definitely ?sys.argv? needs to continue having all command-line > arguments, including the command name used to invoke the program. That doesn't necessarily mean it has to be passed along with the arguments to a __main__() function, though. You can always extract it from sys.argv if you need it. Arguably it's more convenient to get it from there, since you're most often to want it for things like formatting error messages, which are probably not happening right inside the main function. -- Greg From flying-sheep at web.de Sun Nov 24 17:26:54 2013 From: flying-sheep at web.de (Philipp A.) Date: Sun, 24 Nov 2013 17:26:54 +0100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> Message-ID: i?m all for a special method name or decorator, because of the namespace issue. once you do more on your main function than print(?Hello World?), say define variables, you tend to do: def main(): ... if __name__ == '__main__': main() in order not to pollute the namespace of the module. if __main__: ... looks nice, but will result in the same definition as the old behavior for that reason. 
so i'd propose one of the following API-wise: def __main__(): ... @mainfunction def main(): ... and implementation-wise, both should result in the function being called once the module is loaded, no matter where it is in the code. and both would AFAIK need a change to the module loader as AFAIK you can't create an after-load hook for a module (if you can, the decorator would work without a change to the loader) 2013/11/24 Gregory P. Smith > > On Sat, Nov 23, 2013 at 7:35 PM, Greg Ewing wrote: > >> Haoyi Li wrote: >> >>> Import hooks don't work with __main__ =( I spent a while trying to get >>> it to work when working on MacroPy, to no avail. >>> >> >> It's hard to see how they could. How would you *install* >> an import hook before __main__ got imported? > > > *Don't do this*... but you can abuse sitecustomize: > http://docs.python.org/3.4/library/site.html > > -gps > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sun Nov 24 17:31:46 2013 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 25 Nov 2013 03:31:46 +1100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> Message-ID: On Mon, Nov 25, 2013 at 3:26 AM, Philipp A. wrote: > so i'd propose one of the following API-wise: > > def __main__(): > ... > > @mainfunction > def main(): > ... The decorator minorly worries me; what happens if you use it on two functions? Presumably both would have to be called, in the order they're in the file (or rather, the order the decorators are called), but it'd be extremely confusing to try to read that code. ChrisA From flying-sheep at web.de Sun Nov 24 17:46:23 2013 From: flying-sheep at web.de (Philipp A.) Date: Sun, 24 Nov 2013 17:46:23 +0100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> Message-ID: 2013/11/24 Chris Angelico The decorator minorly worries me; what happens if you use it on two > functions? Presumably both would have to be called, in the order > they're in the file (or rather, the order the decorators are called), > but it'd be extremely confusing to try to read that code. > > ChrisA > people can also do more than one if __name__ == '__main__' blocks ATM? -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sun Nov 24 17:55:38 2013 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 25 Nov 2013 03:55:38 +1100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> Message-ID: On Mon, Nov 25, 2013 at 3:46 AM, Philipp A. wrote: > 2013/11/24 Chris Angelico >> >> The decorator minorly worries me; what happens if you use it on two >> functions?
Presumably both would have to be called, in the order >> they're in the file (or rather, the order the decorators are called), >> but it'd be extremely confusing to try to read that code. >> >> ChrisA > > people can also do more than one if __name__ == '__main__' blocks ATM? True, but the concept of a "main function" is going to look far more like there should be only one. And it'd be extremely weird if: @mainfunction def main(): print("First function!") @mainfunction def main(): print("Second function!") called both, even though the second one shadows the first; and it'd be just as weird if: @mainfunction def main(): print("First function!") def main(): print("Second function!") called the second, even though it wasn't decorated as a main function. ChrisA From g.brandl at gmx.net Sun Nov 24 18:09:28 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 24 Nov 2013 18:09:28 +0100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> Message-ID: On 11/24/2013 05:26 PM, Philipp A. wrote: > i?m all for a special method name or decorator, because of the namespace issue. > > once you do more on your main function than print(?Hello World?), say define > variables, you tend to do: > > | > def main(): > ... > > if __name__ == '__main__': > main()| > > in order not to pollute the namespace of the module. I don't get that argument: when the module is executed as the main script, you shouldn't need to care about the "module namespace" since it isn't really one. Georg From guido at python.org Sun Nov 24 20:10:43 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 24 Nov 2013 11:10:43 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> Message-ID: Haven't followed all of this, but perhaps the simplest thing would be to define a new builtin function that returns True in the namespace of the main module and false everywhere else. It could be implemented by pulling '__name__' out of the caller's local namespace and comparing it to '__main__'. We could name this function __main__(), or perhaps less dramatic, is_main(). Then you could write if is_main(): For people who want to use this idiom on older platforms too they can easily implement it themselves using sys._getframe(). This is less magical than introducing a @mainfunction decorator, not really more typing, and easily understood by people who already know the old "if __name__ == '__main__'" idiom. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
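(A minimal sketch of what such an is_main() helper could look like if written in pure Python via sys._getframe(), as suggested above; the exact spelling is illustrative, not a finished builtin:)

    import sys

    def is_main():
        # Check the *caller's* globals, not this function's own module.
        return sys._getframe(1).f_globals.get("__name__") == "__main__"

    # usage, in place of the traditional guard:
    if is_main():
        print("running as the main module")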
URL: From tjreedy at udel.edu Sun Nov 24 22:21:12 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 24 Nov 2013 16:21:12 -0500 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> Message-ID: On 11/24/2013 2:10 PM, Guido van Rossum wrote: > Haven't followed all of this, but perhaps the simplest thing would be to > define a new builtin function that returns True in the namespace of the > main module and false everywhere else. It could be implemented by > pulling '__name__' out of the caller's local namespace and comparing it > to '__main__'. We could name this function __main__(), or perhaps less > dramatic, is_main(). Then you could write > > if is_main(): > Writing 'is_main' (or anything else) instead of '__name__=='__main__' seems silly if it would be the only thing breaking back compatibility. > For people who want to use this idiom on older platforms too they can > easily implement it themselves using sys._getframe. ...if one know about sys._getframe and how to use it. The leading underscore implies that it is subject to change in each version. Writing an import (from somewhere dependable) for such a function will be as many keystrokes as writing the idiom. For my main project, I have a template file that includes 'if __name__.. ' along with other project boilerplate. -- Terry Jan Reedy From solipsis at pitrou.net Sun Nov 24 22:26:07 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 24 Nov 2013 22:26:07 +0100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> Message-ID: <20131124222607.271fdc46@fsol> On Sun, 24 Nov 2013 11:10:43 -0800 Guido van Rossum wrote: > Haven't followed all of this, but perhaps the simplest thing would be to > define a new builtin function that returns True in the namespace of the > main module and false everywhere else. It could be implemented by pulling > '__name__' out of the caller's local namespace and comparing it to > '__main__'. We could name this function __main__(), or perhaps less > dramatic, is_main(). Then you could write > > if is_main(): > Why not make it so that a module function named __main__, if it exists, gets executed when the module is run as a script? (this would also mimick the __main__.py convention for packages) Regards Antoine. From ben+python at benfinney.id.au Sun Nov 24 22:26:07 2013 From: ben+python at benfinney.id.au (Ben Finney) Date: Mon, 25 Nov 2013 08:26:07 +1100 Subject: [Python-ideas] making a module callable References: <528D7287.7040305@stoneleaf.us> <528FE705.10506@canterbury.ac.nz> <20131123042100.GR2085@ando> <7wiovite8e.fsf@benfinney.id.au> <5291C4BA.2020606@canterbury.ac.nz> Message-ID: <7weh65trtc.fsf@benfinney.id.au> Greg Ewing writes: > Ben Finney wrote: > > So, definitely ?sys.argv? needs to continue having all command-line > > arguments, including the command name used to invoke the program. > > That doesn't necessarily mean it has to be passed along with the > arguments to a __main__() function, though. You can always extract it > from sys.argv if you need it. Yet the ?__main__? 
function needs to get the arguments as a parameter:: def __main__(argv): (or at least, that's how I've seen it done most commonly, and I agree that it makes a good intreface for ?__main__? functions). Now you're saying there is one command-line parameter which chould not come through that interface? Why the special case? > Arguably it's more convenient to get it from there, since you're most > often to want it for things like formatting error messages, which are > probably not happening right inside the main function. It's more convenient to look in ?the sequence of command-line parameters? for all the command-line parameters, without special-casing the command name. -- \ ?If [a technology company] has confidence in their future | `\ ability to innovate, the importance they place on protecting | _o__) their past innovations really should decline.? ?Gary Barnett | Ben Finney From ncoghlan at gmail.com Sun Nov 24 23:21:26 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 25 Nov 2013 08:21:26 +1000 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131124222607.271fdc46@fsol> References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <20131124222607.271fdc46@fsol> Message-ID: On 25 Nov 2013 07:26, "Antoine Pitrou" wrote: > > On Sun, 24 Nov 2013 11:10:43 -0800 > Guido van Rossum wrote: > > Haven't followed all of this, but perhaps the simplest thing would be to > > define a new builtin function that returns True in the namespace of the > > main module and false everywhere else. It could be implemented by pulling > > '__name__' out of the caller's local namespace and comparing it to > > '__main__'. We could name this function __main__(), or perhaps less > > dramatic, is_main(). Then you could write > > > > if is_main(): > > > > Why not make it so that a module function named __main__, if it exists, > gets executed when the module is run as a script? I consider the fact that the semantics of __main__ execution are largely the same as those of any other module import to be a feature rather than a bug. Keep in mind that we *can't* stop the current idiom from working (since we have to run the top level code to build the module in the first place), and that "run this script from disk" is just one way of executing __main__. For example, the REPL loop is a statement-by-statement interactive rendition of __main__, while __main__.py files in zipfiles, directories and packages don't bother with the "if __name__ == '__main__'" guard at all. Any "define a function with this special name" idiom would require making a decision on what it means in the REPL (implicit atexit() function? Automatically called when declared?), and changes not only to pythonrun.c, but also to runpy, IDLE, and various other IDE's. pdb, profile, coverage tools, etc would also all need to change (unless this was made an implicit feature of exec() and execfile(), which really doesn't sound like a good idea). Whether or not runpy's module API should trigger __main__ function execution becomes a tricky question, as does the fact that many of the PyRun_* helpers execute code in the __main__ namespace. Should they trigger execution of special __main__ functions as well? I don't have good answers to many of those questions, which is why I think the whole idea of introducing "main functions" to Python as anything more than a conventional idiom isn't worth the hassle. 
I consider the desire for such a feature just a relic of people's experience with languages like C and Java where the top level module code is just a declaration of program structure to the compiler rather than a full fledged execution environment. (Another point of confusion: C etc will complain if multiple main function declarations are linked into the same program, while Python would silently ignore any that weren't in the main module). > (this would also mimick the __main__.py convention for packages) Not really - that's still just normal script execution, and it has a couple of very clear triggers analogous to the "if __name__ == '__main__'" idiom for ordinary scripts (the import system indicating the specified name refers to a package rather than a simple module for module execution, or to a valid sys.path entry for direct execution). Since those modules don't include the idiomatic guard, running them directly in IDLE (or elsewhere) will still typically do the right thing. I don't mind Guido's idea of an "is_main()" builtin for 3.5, though. It should be less confusing for beginners, and for those that subsequently want to understand the details, it can be explained in terms of the existing, more explicit idiom. But trying to level shift from "the main module is special" to "the already special main module may optionally declare a special main function"? That's *way* more complicated than many folks seem to realise. Cheers, Nick. > > Regards > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Sun Nov 24 23:27:35 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 25 Nov 2013 11:27:35 +1300 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> Message-ID: <52927D57.1070708@canterbury.ac.nz> Guido van Rossum wrote: > Haven't followed all of this, but perhaps the simplest thing would be to > define a new builtin function that returns True in the namespace of the > main module and false everywhere else. Can someone remind me why there's so much resistance to simply blessing a __main__() function? It would be straightforward and intuitive and in line with what just about every other language does. -- Greg From ncoghlan at gmail.com Sun Nov 24 23:37:21 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 25 Nov 2013 08:37:21 +1000 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <52927D57.1070708@canterbury.ac.nz> References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <52927D57.1070708@canterbury.ac.nz> Message-ID: On 25 Nov 2013 08:28, "Greg Ewing" wrote: > > Guido van Rossum wrote: >> >> Haven't followed all of this, but perhaps the simplest thing would be to define a new builtin function that returns True in the namespace of the main module and false everywhere else. > > > Can someone remind me why there's so much resistance to > simply blessing a __main__() function? It would be > straightforward and intuitive and in line with what > just about every other language does. 
See my other email. Implementing it is not straightforward at all, and it's only intuitive to people that have been trained to think that way by C, Java, et al. Cheers, Nick. > > -- > Greg > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Sun Nov 24 23:35:55 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sun, 24 Nov 2013 14:35:55 -0800 (PST) Subject: [Python-ideas] making a module callable In-Reply-To: <7weh65trtc.fsf@benfinney.id.au> References: <528D7287.7040305@stoneleaf.us> <528FE705.10506@canterbury.ac.nz> <20131123042100.GR2085@ando> <7wiovite8e.fsf@benfinney.id.au> <5291C4BA.2020606@canterbury.ac.nz> <7weh65trtc.fsf@benfinney.id.au> Message-ID: <1385332555.48756.YahooMailNeo@web184704.mail.ne1.yahoo.com> From: Ben Finney >Greg Ewing writes: > >> Ben Finney wrote: >> > So, definitely ?sys.argv? needs to continue having all command-line >> > arguments, including the command name used to invoke the program. >> >> That doesn't necessarily mean it has to be passed along with the >> arguments to a __main__() function, though. You can always extract it >> from sys.argv if you need it. > >Yet the ?__main__? function needs to get the arguments as a parameter:: > >? ? def __main__(argv): > >(or at least, that's how I've seen it done most commonly, and I agree >that it makes a good intreface for ?__main__? functions). Or to get the arguments as separate parameters: ? ? def __main__(*argv): ? or, more realistically: ? ? def __main__(inpath, outpath): Then: ? ? if __name__ == '__main__': ? ? ? ? __main__(*sys.argv[1:]) The benefit is that the names document what the arguments mean, and also give you better error messages if the script is called wrong.?Obviously any serious script is going to have a real usage error, etc., but then any serious script is going to use argparse anyway. For quick & dirty scripts, the first error below is obviously nicer than the second, and no more work. ? ? $ ./script.py ? ??TypeError: main() missing 2 required positional arguments: 'inpath' and 'outpath' ? ? $ ./script.py ? ? IndexError: list index out of range There's no reason you _couldn't_ write the idiom with main(argv0, inpath, outpath); I just haven't seen it that way. Scripts that explode argv always seem to do *argv[1:], while those that use it as a list usually seem to do all of argv. From greg.ewing at canterbury.ac.nz Sun Nov 24 23:37:39 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 25 Nov 2013 11:37:39 +1300 Subject: [Python-ideas] making a module callable In-Reply-To: <7weh65trtc.fsf@benfinney.id.au> References: <528D7287.7040305@stoneleaf.us> <528FE705.10506@canterbury.ac.nz> <20131123042100.GR2085@ando> <7wiovite8e.fsf@benfinney.id.au> <5291C4BA.2020606@canterbury.ac.nz> <7weh65trtc.fsf@benfinney.id.au> Message-ID: <52927FB3.9030703@canterbury.ac.nz> Ben Finney wrote: > It's more convenient to look in ?the sequence of command-line > parameters? for all the command-line parameters, without special-casing > the command name. But you hardly ever want to process argv[0] the same way as the rest of the arguments, so you end up treating it as a special case anyway. It seems to me we only think of it as a command line argument because C traditionally presents it that way. I don't think it's something that would naturally come to mind otherwise. 
I know I found it quite surprising when I first encountered it. -- Greg From solipsis at pitrou.net Sun Nov 24 23:48:22 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 24 Nov 2013 23:48:22 +0100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <20131124222607.271fdc46@fsol> Message-ID: <20131124234822.674874dc@fsol> On Mon, 25 Nov 2013 08:21:26 +1000 Nick Coghlan wrote: > On 25 Nov 2013 07:26, "Antoine Pitrou" wrote: > > > > On Sun, 24 Nov 2013 11:10:43 -0800 > > Guido van Rossum wrote: > > > Haven't followed all of this, but perhaps the simplest thing would be to > > > define a new builtin function that returns True in the namespace of the > > > main module and false everywhere else. It could be implemented by > pulling > > > '__name__' out of the caller's local namespace and comparing it to > > > '__main__'. We could name this function __main__(), or perhaps less > > > dramatic, is_main(). Then you could write > > > > > > if is_main(): > > > > > > > Why not make it so that a module function named __main__, if it exists, > > gets executed when the module is run as a script? > > I consider the fact that the semantics of __main__ execution are largely > the same as those of any other module import to be a feature rather than a > bug. Yes, I'm quite happy with the current idiom myself. I was merely suggesting this in case many people start opposing the status quo and start demanding another idiom :-) Regards Antoine. From abarnert at yahoo.com Mon Nov 25 00:02:18 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sun, 24 Nov 2013 15:02:18 -0800 (PST) Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <52927D57.1070708@canterbury.ac.nz> References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <52927D57.1070708@canterbury.ac.nz> Message-ID: <1385334138.73758.YahooMailNeo@web184705.mail.ne1.yahoo.com> From: Greg Ewing >Guido van Rossum wrote: >> Haven't followed all of this, but perhaps the simplest thing would be > to define a new builtin function that returns True in the namespace of the main > module and false everywhere else. > > Can someone remind me why there's so much resistance to > simply blessing a __main__() function? It would be > straightforward and intuitive and in line with what > just about every other language does. Which languages, other than C and its direct descendants? The other major scripting languages - Ruby, Perl, JavaScript, PHP, Tcl, etc. - have no such thing; they all just execute all the top-level code, and provide some variable you can check to see if you're the "main module", just like Python. The conventional idioms in those languages are mostly borrowed from Python's. For example: # Ruby if __FILE__ == $0 exit(main(ARGV)) # or exit(MainClass.new(ARGV).run) end # Node.js if (require.main === module) { main(process.argv) } Most functional languages don't have any special main function or main environment. Other procedural languages like Pascal, Fortran, etc.
that have a special main environment do it with different syntax?e.g., the code goes in BEGIN PROGRAM instead of in a function/procedure.?The only one I can think of that uses a special main function like C is Visual BASIC (which borrowed it from C). From guido at python.org Mon Nov 25 00:11:04 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 24 Nov 2013 15:11:04 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <20131124222607.271fdc46@fsol> Message-ID: On Sun, Nov 24, 2013 at 2:21 PM, Nick Coghlan wrote: > > On 25 Nov 2013 07:26, "Antoine Pitrou" wrote: > > > > On Sun, 24 Nov 2013 11:10:43 -0800 > > Guido van Rossum wrote: > > > Haven't followed all of this, but perhaps the simplest thing would be > to > > > define a new builtin function that returns True in the namespace of the > > > main module and false everywhere else. It could be implemented by > pulling > > > '__name__' out of the caller's local namespace and comparing it to > > > '__main__'. We could name this function __main__(), or perhaps less > > > dramatic, is_main(). Then you could write > > > > > > if is_main(): > > > > > > > Why not make it so that a module function named __main__, if it exists, > > gets executed when the module is run as a script? > > I consider the fact that the semantics of __main__ execution are largely > the same as those of any other module import to be a feature rather than a > bug. > Right! > Keep in mind that we *can't* stop the current idiom from working (since we > have to run the top level code to build the module in the first place), and > that "run this script from disk" is just one way of executing __main__. For > example, the REPL loop is a statement-by-statement interactive rendition of > __main__, while __main__.py files in zipfiles, directories and packages > don't bother with the "if __name__ == '__main__'" guard at all. > > Any "define a function with this special name" idiom would require making > a decision on what it means in the REPL (implicit atexit() function? > Automatically called when declared?), and changes not only to pythonrun.c, > but also to runpy, IDLE, and various other IDE's. pdb, profile, coverage > tools, etc would also all need to change (unless this was made an implicit > feature of exec() and execfile(), which really doesn't sound like a good > idea). > > Whether or not runpy's module API should trigger __main__ function > execution becomes a tricky question, as does the fact that many of the > PyRun_* helpers execute code in the __main__ namespace. Should they trigger > execution of special __main__ functions as well? > I'm not sure that the REPL behavior w.r.t. such a special function should be the deciding factor. > I don't have good answers to many of those questions, which is why I think > the whole idea of introducing "main functions" to Python as anything more > than a conventional idiom isn't worth the hassle. I consider the desire for > such a feature just a relic of people's experience with languages like C > and Java where the top level module code is just a declaration of program > structure to the compiler rather than a full fledged execution environment. 
> > (Another point of confusion: C etc will complain if multiple main function > declarations are linked into the same program, while Python would silently > ignore any that weren't in the main module). > I think that explicit is better than implicit, and that's where a special function name loses. Also, there's no reasonable way to backport a special function. (Whereas backporting is_main() is trivial, since the behavior of sys._getframe() is past releases is stable.) A decorator is perhaps a little better, but it immediately brings up the question of when the decorated function should be called, if this *is* the main module. There are two options: after the import is complete (matching how a special function name would work) or at the point where the decorated function occurs. If we want people to be able to backport the decorator it would have to be the latter. But the decorator is more verbose than even the current idiom: @mainfunction def main(): vs. if __name__ == '__main__': whereas my proposal is shorter *and* more readable: if is_main(): > > (this would also mimick the __main__.py convention for packages) > > Not really - that's still just normal script execution, and it has a > couple of very clear triggers analogous to the "if __name__ == '__main__'" > idiom for ordinary scripts (the import system indicating the specified name > refers to a package rather than a simple module for module execution, or to > a valid sys.path entry for direct execution). > > Since those modules don't include the idiomatic guard, running them > directly in IDLE (or elsewhere) will still typically do the right thing. > > I don't mind Guido's idea of an "is_main()" builtin for 3.5, though. It > should be less confusing for beginners, and for those that subsequently > want to understand the details, it can be explained in terms of the > existing, more explicit idiom. But trying to level shift from "the main > module is special" to "the already special main module may optionally > declare a special main function"? That's *way* more complicated than many > folks seem to realise. > Yup. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruce at leapyear.org Mon Nov 25 00:55:48 2013 From: bruce at leapyear.org (Bruce Leban) Date: Sun, 24 Nov 2013 15:55:48 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <52927D57.1070708@canterbury.ac.nz> References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <52927D57.1070708@canterbury.ac.nz> Message-ID: On Sun, Nov 24, 2013 at 2:27 PM, Greg Ewing wrote: > Can someone remind me why there's so much resistance to > simply blessing a __main__() function? It would be > straightforward and intuitive and in line with what > just about every other language does. > What's not intuitive is that __main__ would be called sometimes and not other times. Not knowing the idiom, seeing def __main__, why would I assume it's going to get executed automatically and that this is only true of one of the files being executed? The first time I saw __name__ == '__main__', I thought "Huh?" and went and looked it up and learned how it worked. If I had seen def __main__, I probably would have thought it worked just like C's main function which can be in any file. 
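(For concreteness, the backportable decorator variant Guido compares above - the one called at the point where the decorated function occurs - might look roughly like this; "mainfunction" is an assumed name for illustration, not an existing builtin:)

    import sys

    def mainfunction(func):
        # Sketch only: run the decorated function immediately, but only when
        # the module defining it is being executed as __main__.
        if func.__globals__.get("__name__") == "__main__":
            sys.exit(func(sys.argv[1:]))
        return func

    @mainfunction
    def main(args):
        print("got arguments:", args)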
--- Bruce I'm hiring: http://www.cadencemd.com/info/jobs Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Mon Nov 25 04:08:04 2013 From: songofacandy at gmail.com (INADA Naoki) Date: Mon, 25 Nov 2013 12:08:04 +0900 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <20131124222607.271fdc46@fsol> Message-ID: I want to have callable main function to allow script can be executed directly or from other function. def main(): if is_main(): main() is not shorter than @mainfunction def main(): But I like: def __main__(): On Mon, Nov 25, 2013 at 8:11 AM, Guido van Rossum wrote: > On Sun, Nov 24, 2013 at 2:21 PM, Nick Coghlan wrote: > >> >> On 25 Nov 2013 07:26, "Antoine Pitrou" wrote: >> > >> > On Sun, 24 Nov 2013 11:10:43 -0800 >> > Guido van Rossum wrote: >> > > Haven't followed all of this, but perhaps the simplest thing would be >> to >> > > define a new builtin function that returns True in the namespace of >> the >> > > main module and false everywhere else. It could be implemented by >> pulling >> > > '__name__' out of the caller's local namespace and comparing it to >> > > '__main__'. We could name this function __main__(), or perhaps less >> > > dramatic, is_main(). Then you could write >> > > >> > > if is_main(): >> > > >> > >> > Why not make it so that a module function named __main__, if it exists, >> > gets executed when the module is run as a script? >> >> I consider the fact that the semantics of __main__ execution are largely >> the same as those of any other module import to be a feature rather than a >> bug. >> > Right! > >> Keep in mind that we *can't* stop the current idiom from working (since >> we have to run the top level code to build the module in the first place), >> and that "run this script from disk" is just one way of executing __main__. >> For example, the REPL loop is a statement-by-statement interactive >> rendition of __main__, while __main__.py files in zipfiles, directories and >> packages don't bother with the "if __name__ == '__main__'" guard at all. >> >> Any "define a function with this special name" idiom would require making >> a decision on what it means in the REPL (implicit atexit() function? >> Automatically called when declared?), and changes not only to pythonrun.c, >> but also to runpy, IDLE, and various other IDE's. pdb, profile, coverage >> tools, etc would also all need to change (unless this was made an implicit >> feature of exec() and execfile(), which really doesn't sound like a good >> idea). >> >> Whether or not runpy's module API should trigger __main__ function >> execution becomes a tricky question, as does the fact that many of the >> PyRun_* helpers execute code in the __main__ namespace. Should they trigger >> execution of special __main__ functions as well? >> > > I'm not sure that the REPL behavior w.r.t. such a special function should > be the deciding factor. > >> I don't have good answers to many of those questions, which is why I >> think the whole idea of introducing "main functions" to Python as anything >> more than a conventional idiom isn't worth the hassle. 
I consider the >> desire for such a feature just a relic of people's experience with >> languages like C and Java where the top level module code is just a >> declaration of program structure to the compiler rather than a full fledged >> execution environment. >> >> (Another point of confusion: C etc will complain if multiple main >> function declarations are linked into the same program, while Python would >> silently ignore any that weren't in the main module). >> > I think that explicit is better than implicit, and that's where a special > function name loses. Also, there's no reasonable way to backport a special > function. (Whereas backporting is_main() is trivial, since the behavior of > sys._getframe() is past releases is stable.) > > A decorator is perhaps a little better, but it immediately brings up the > question of when the decorated function should be called, if this *is* the > main module. There are two options: after the import is complete (matching > how a special function name would work) or at the point where the decorated > function occurs. If we want people to be able to backport the decorator it > would have to be the latter. > > But the decorator is more verbose than even the current idiom: > > @mainfunction > def main(): > > > vs. > > if __name__ == '__main__': > > > whereas my proposal is shorter *and* more readable: > > if is_main(): > > >> > (this would also mimick the __main__.py convention for packages) >> >> Not really - that's still just normal script execution, and it has a >> couple of very clear triggers analogous to the "if __name__ == '__main__'" >> idiom for ordinary scripts (the import system indicating the specified name >> refers to a package rather than a simple module for module execution, or to >> a valid sys.path entry for direct execution). >> >> Since those modules don't include the idiomatic guard, running them >> directly in IDLE (or elsewhere) will still typically do the right thing. >> >> I don't mind Guido's idea of an "is_main()" builtin for 3.5, though. It >> should be less confusing for beginners, and for those that subsequently >> want to understand the details, it can be explained in terms of the >> existing, more explicit idiom. But trying to level shift from "the main >> module is special" to "the already special main module may optionally >> declare a special main function"? That's *way* more complicated than many >> folks seem to realise. >> > Yup. > > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > -- INADA Naoki -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Nov 25 04:16:32 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 24 Nov 2013 19:16:32 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <20131124222607.271fdc46@fsol> Message-ID: On Sun, Nov 24, 2013 at 7:08 PM, INADA Naoki wrote: > I want to have callable main function to allow script can be executed > directly or from other function. > > def main(): > > > if is_main(): > main() > > is not shorter than > > @mainfunction > def main(): > > > But I like: > > def __main__(): > > I had a few other arguments also against the decorator. 
(Like less magic. :-) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben+python at benfinney.id.au Mon Nov 25 04:58:11 2013 From: ben+python at benfinney.id.au (Ben Finney) Date: Mon, 25 Nov 2013 14:58:11 +1100 Subject: [Python-ideas] making a module callable References: <528D7287.7040305@stoneleaf.us> <528FE705.10506@canterbury.ac.nz> <20131123042100.GR2085@ando> <7wiovite8e.fsf@benfinney.id.au> <5291C4BA.2020606@canterbury.ac.nz> <7weh65trtc.fsf@benfinney.id.au> <52927FB3.9030703@canterbury.ac.nz> Message-ID: <7w38mlt9nw.fsf@benfinney.id.au> Greg Ewing writes: > Ben Finney wrote: > > It's more convenient to look in ?the sequence of command-line > > parameters? for all the command-line parameters, without > > special-casing the command name. > > But you hardly ever want to process argv[0] the same way as the rest > of the arguments, so you end up treating it as a special case anyway. This isn't about *processing* that argument; it's about *receiving* it in the first place to the function. Having it omitted by default means there's a special case just to *get at* the first command-line argument:: def __main__(argv_without_first_arg=None): if argv is None: argv = sys.argv[1:] first_arg = sys.argv[0] Which still sucks, because how do I then pass a different command name to ?__main__? since it now expects to get it from elsewhere? Much better to have the interface just accept the *whole* sequence of command line arguments: def __main__(argv=None): if argv is None: argv = sys.argv Now it's completely up to the caller what the command-line looks like, which means the ?__main__? code needs no special cases for using the module as a library or for unit tests etc. You just construct the command-line as you need it to look, and pass it in to the function. > It seems to me we only think of it as a command line argument because > C traditionally presents it that way. What C does isn't relevant here. I think of the whole command line as a sequence of arguments because that's how the program receives the command line from the Python interpreter. Mangling it further just makes a common use case more difficult for no good reason. > I don't think it's something that would naturally come to mind > otherwise. I know I found it quite surprising when I first encountered > it. Many useful things are surprising when one first encounters them :-) -- \ ?I don't accept the currently fashionable assertion that any | `\ view is automatically as worthy of respect as any equal and | _o__) opposite view.? ?Douglas Adams | Ben Finney From ron3200 at gmail.com Mon Nov 25 05:14:31 2013 From: ron3200 at gmail.com (Ron Adam) Date: Sun, 24 Nov 2013 22:14:31 -0600 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <20131124222607.271fdc46@fsol> Message-ID: On 11/24/2013 05:11 PM, Guido van Rossum wrote: > > Why not make it so that a module function named __main__, if it exists, > > gets executed when the module is run as a script? > > I consider the fact that the semantics of __main__ execution are > largely the same as those of any other module import to be a feature > rather than a bug. > > Right! 
> > Keep in mind that we *can't* stop the current idiom from working (since > we have to run the top level code to build the module in the first > place), and that "run this script from disk" is just one way of > executing __main__. For example, the REPL loop is a > statement-by-statement interactive rendition of __main__, while > __main__.py files in zipfiles, directories and packages don't bother > with the "if __name__ == '__main__'" guard at all. Right, it prevents that section from running in the case where it's not the main module. But spelling that explicitly isn't as nice. if not __name__ != "__main__": # Don't do this if this module isn't named "__main__". Regarding the is_main()... seems like it should take an argument. Most is_something() functions do. Is there a way to make something like the following work? def is_main(name=None): if name == None: name = get_attr_from_caller() # dynamic lookup return (name == "__main__") And how about just a global name __main__ that is always set to "__main__"? if __name__ is __main__: ... Not a big change, but it reads nice and maybe users will like it well enough not to keep suggesting changing it. ;-) Down the road (python6) you could change both of those to the real main modules name and they would still work. Just a few thoughts, Ron From rosuav at gmail.com Mon Nov 25 05:21:47 2013 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 25 Nov 2013 15:21:47 +1100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <20131124222607.271fdc46@fsol> Message-ID: On Mon, Nov 25, 2013 at 3:14 PM, Ron Adam wrote: > And how about just a global name __main__ that is always set to "__main__"? > > if __name__ is __main__: > ... > > Not a big change, but it reads nice and maybe users will like it well enough > not to keep suggesting changing it. ;-) But then you have to explain why you're using 'is' to compare strings, which shouldn't normally be done. Why not just use == as per current? ChrisA From cs at zip.com.au Mon Nov 25 08:19:32 2013 From: cs at zip.com.au (Cameron Simpson) Date: Mon, 25 Nov 2013 18:19:32 +1100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: Message-ID: <20131125071932.GA65531@cskk.homeip.net> On 24Nov2013 17:26, Philipp A. wrote: > i?m all for a special method name or decorator, because of the namespace > issue. > > once you do more on your main function than print(?Hello World?), say > define variables, you tend to do: > > def main(): > ... > if __name__ == '__main__': > main() > > in order not to pollute the namespace of the module. [...] This is what I do, almost word for word. When I do this, the main() function is the first function in the module and the code at the bottom goes: if __name__ == '__main__': import sys sys.exit(main(sys.argv)) This make the main() function obvious when looking at the code, and makes things work intuitively on the command line, with a meaningful exit status. Modules without a main() run unit tests and I have an aspirational goal to fold a selftest mode into the module with a main(). A magic name? It seems a little overkill if the code is written in a fashion like the above: the main() is obvious. Cheers, -- Cameron Simpson SCCS, the source motel! Programs check in and never check out! 
- Ken Thompson From ncoghlan at gmail.com Mon Nov 25 11:20:23 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 25 Nov 2013 20:20:23 +1000 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131125071932.GA65531@cskk.homeip.net> References: <20131125071932.GA65531@cskk.homeip.net> Message-ID: On 25 November 2013 17:19, Cameron Simpson wrote: > On 24Nov2013 17:26, Philipp A. wrote: >> i?m all for a special method name or decorator, because of the namespace >> issue. >> >> once you do more on your main function than print(?Hello World?), say >> define variables, you tend to do: >> >> def main(): >> ... >> if __name__ == '__main__': >> main() >> >> in order not to pollute the namespace of the module. > [...] > > This is what I do, almost word for word. When I do this, the main() > function is the first function in the module and the code at the > bottom goes: > > if __name__ == '__main__': > import sys > sys.exit(main(sys.argv)) > > This make the main() function obvious when looking at the code, and makes > things work intuitively on the command line, with a meaningful exit status. > > Modules without a main() run unit tests and I have an aspirational goal to > fold a selftest mode into the module with a main(). > > A magic name? It seems a little overkill if the code is written in a fashion > like the above: the main() is obvious. So, rather than the low level "is_main", perhaps a higher level builtin would be appropriate? >>> def run_if_main(f): ... import sys ... if sys._getframe(-1).f_globals.get("__name__") == "__main__": ... sys.exit(f(sys.argv)) ... Starting to get a little too magical for my taste at that point, although with a small tweak (to return "f" in the "not main" case) it *does* allow the idiom: @run_if_main def main(argv): ... Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From fuzzyman at gmail.com Mon Nov 25 13:26:40 2013 From: fuzzyman at gmail.com (Michael Foord) Date: Mon, 25 Nov 2013 12:26:40 +0000 Subject: [Python-ideas] making a module callable In-Reply-To: <7w38mlt9nw.fsf@benfinney.id.au> References: <528D7287.7040305@stoneleaf.us> <528FE705.10506@canterbury.ac.nz> <20131123042100.GR2085@ando> <7wiovite8e.fsf@benfinney.id.au> <5291C4BA.2020606@canterbury.ac.nz> <7weh65trtc.fsf@benfinney.id.au> <52927FB3.9030703@canterbury.ac.nz> <7w38mlt9nw.fsf@benfinney.id.au> Message-ID: On 25 November 2013 03:58, Ben Finney wrote: > Greg Ewing writes: > > > Ben Finney wrote: > > > It's more convenient to look in ?the sequence of command-line > > > parameters? for all the command-line parameters, without > > > special-casing the command name. > > > > But you hardly ever want to process argv[0] the same way as the rest > > of the arguments, so you end up treating it as a special case anyway. > > This isn't about *processing* that argument; it's about *receiving* it > in the first place to the function. Having it omitted by default means > there's a special case just to *get at* the first command-line > argument:: > > The name of the script is not an argument *to* the script. Having it there in the first place is the special case, not removing it. It's only an old C convention (and now an old Python convention) that makes you think it is. Michael Foord > def __main__(argv_without_first_arg=None): > if argv is None: > argv = sys.argv[1:] > first_arg = sys.argv[0] > > Which still sucks, because how do I then pass a different command name > to ?__main__? 
since it now expects to get it from elsewhere? > > Much better to have the interface just accept the *whole* sequence of > command line arguments: > > def __main__(argv=None): > if argv is None: > argv = sys.argv > > Now it's completely up to the caller what the command-line looks like, > which means the ?__main__? code needs no special cases for using the > module as a library or for unit tests etc. You just construct the > command-line as you need it to look, and pass it in to the function. > > > It seems to me we only think of it as a command line argument because > > C traditionally presents it that way. > > What C does isn't relevant here. I think of the whole command line as a > sequence of arguments because that's how the program receives the > command line from the Python interpreter. Mangling it further just makes > a common use case more difficult for no good reason. > > > I don't think it's something that would naturally come to mind > > otherwise. I know I found it quite surprising when I first encountered > > it. > > Many useful things are surprising when one first encounters them :-) > > -- > \ ?I don't accept the currently fashionable assertion that any | > `\ view is automatically as worthy of respect as any equal and | > _o__) opposite view.? ?Douglas Adams | > Ben Finney > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Mon Nov 25 15:12:21 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 26 Nov 2013 01:12:21 +1100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> Message-ID: <20131125141220.GE2085@ando> On Mon, Nov 25, 2013 at 08:20:23PM +1000, Nick Coghlan wrote: > So, rather than the low level "is_main", perhaps a higher level > builtin would be appropriate? > > >>> def run_if_main(f): > ... import sys > ... if sys._getframe(-1).f_globals.get("__name__") == "__main__": > ... sys.exit(f(sys.argv)) > ... > > Starting to get a little too magical for my taste at that point, > although with a small tweak (to return "f" in the "not main" case) it > *does* allow the idiom: > > @run_if_main > def main(argv): > ... I love decorators, I really do, but I strongly feel that a decorator is completely the wrong API for this functionality. I can't quite put it into words, it's an aesthetic thing, but there is one concrete objection I can give. To understand the current "if __name__" idiom, the user only needs to understand the if statement and the rule that Python defines a special global variable __name__. To understand this run_if_main idiom, the user has to understand decorators. It's hard to believe, but I've come across people -- even quite experienced Python developers -- who avoid decorators because they don't quite get how they work. And of course beginners have no idea what a decorator is. These means explaining about higher-order functions. 
For something as basic as running a main function when called as a script, one shouldn't have to understand functional programming, higher order functions and decorators. Here are the alternatives, as I see them: (1) Keep the status quo. I don't actually dislike the status quo, even though after 15 years I still write "if __name__ is '__main__'" and have to go back and correct it :-) This is a bit magical, so although it's a perfectly acceptable solution, perhaps we can make something a bit less magical? (Or at least, stick the magic in the implementation, rather than in the user's code.) This has suited Python well for 20-odd years, and there's nothing really wrong with it. +0.5 on this. (2) Have a special function, called "main" or more likely "__main__", which is automagically called by Python if and only if the module is being run as the main module. I still have a soft-spot for this, but I'm now satisfied that it is the wrong solution. Arguments against: - Explicit is better than implicit. - Because it needs support from the compiler, you can't back-port this to older versions of Python. - People will be confused by the fact that __main__ is sometimes automatically called and sometimes not. -1 on this one. (3) Add an is_main() function to simplify the idiom to: if is_main(): ... Pros: - the magic of deciding whether we're running in the main module is hidden behind an abstraction layer; - even more easily understood than the current "if __name__" idiom; - easily backported. Cons: - one more built-in. I give this one a +1. (4) Like #3 above, but make it a (read-only?) global variable, like __debug__. Possibly spelled "__main__". The idiom becomes: if is_main: ... if __main__: ... Pros: - Some people might feel that "this is the main module" feels more like a global variable than a function call. Cons: - If read-only, that requires some magic behind the scenes. - If not read-only, then people will mess about with it and get confused. +0.5 if read-only, -0.5 if not. (5) A decorator-based solution. Pros: - More explicit than automagically calling a function based on its name. Cons: - Feels wrong to me. I realise that's entirely subjective. - To me, "run the main function" seems far too basic an operation to justify requiring the user learn about decorators first. - Have to import the decorator first. - Unless it's a built-in, in which case, yet another built-in. - If we go with this solution, the bike-shedding. Oh the bike-shedding! * what should it pass to the decorated function? ~ sys.argv ~ a copy of sys.argv ~ sys.argv[1:] ~ nothing at all * should you be allowed to decorate more than one function? * should the function(s) be called immediately, or queued up and then run after the module is fully loaded? * call sys.exit? I'm -1 on this. There are too many slightly different flavours of this solution, and none of them are really hard to solve once you know whether or not you're in the main module.
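For what it's worth, option (3) can be sketched today as an ordinary helper function. This is only an illustration of the idea (a real builtin would not need the frame trick, which is exactly the sort of hack some will object to):

    import sys

    def is_main():
        # Look at the caller's global namespace and apply the usual test.
        return sys._getframe(1).f_globals.get("__name__") == "__main__"

    if is_main():
        print("running as a script")
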
-- Steven From ncoghlan at gmail.com Mon Nov 25 15:22:12 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 26 Nov 2013 00:22:12 +1000 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131125141220.GE2085@ando> References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> Message-ID: On 26 November 2013 00:12, Steven D'Aprano wrote: I'm wondering if we should add a link to Steven's post from http://www.python.org/dev/peps/pep-0299/ (and perhaps even update the PEP text itself) As the status of PEP 299 shows, Guido has rejected the idea of a special main function before, but I think Steven's post does a better job of spelling out "Why not?" than any of the previous discussions. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ron3200 at gmail.com Mon Nov 25 15:55:18 2013 From: ron3200 at gmail.com (Ron Adam) Date: Mon, 25 Nov 2013 08:55:18 -0600 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <20131124222607.271fdc46@fsol> Message-ID: On 11/24/2013 10:21 PM, Chris Angelico wrote: > But then you have to explain why you're using 'is' to compare strings, > which shouldn't normally be done. Why not just use == as per current? In the case of generally parsing strings, that is correct as there may be more than one of something that can match. But when there is only *one* of something, and you aren't manipulating the parts, then "is" should be ok. __name__ would be bound directly to __main__'s string object, so "is" or "==" should always work for that case. Cheers, Ron From ron3200 at gmail.com Mon Nov 25 16:06:47 2013 From: ron3200 at gmail.com (Ron Adam) Date: Mon, 25 Nov 2013 09:06:47 -0600 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <20131124222607.271fdc46@fsol> Message-ID: On 11/24/2013 10:14 PM, Ron Adam wrote: > Regarding the is_main()... seems like it should take an argument. Most > is_something() functions do. Is there a way to make something like the > following work? > > def is_main(name=None): > if name == None: > name = get_attr_from_caller() # dynamic lookup > return (name == "__main__") This should have been... def is_main(name=None): if name == None: name = get_attr_from_caller("__name__") # dynamic lookup return (name == "__main__") -Ron From rob.cliffe at btinternet.com Mon Nov 25 16:10:43 2013 From: rob.cliffe at btinternet.com (Rob Cliffe) Date: Mon, 25 Nov 2013 15:10:43 +0000 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131125141220.GE2085@ando> References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> Message-ID: <52936873.9000305@btinternet.com> On 25/11/2013 14:12, Steven D'Aprano wrote: > (4) Like #3 above, but make it a (read-only?) global variable, like > __debug__. Possibly spelled "__main__". The idiom becomes: > > > if is_main: > ... > > > if __main__: > ... > > > Pros: > > - Some people might feel that "this is the main module" feels > more like a global variable than a function call. 
> > Cons: > > - If read-only, that requires some magic behind the scenes. > > - If not read-only, then people will mess about with it and > get confused. > > +0.5 if read-only, -0.5 if not. > I like this idea (a global variable called __main__), provided that it can be implemented without slowing down access to other variables. Say __main__ did not exist as a real variable, so attempting to read it triggered a NameError behind the scenes. The error handling code could then check for a name of '__main__' as a special case. Does this make sense? Con: A variable which changes its value without being assigned to is more magical/unexpected than a function (is_main()) which returns different values at different times. But I think people would soon get used to it, and find it convenient. The behaviour could be overridden (if the run-time allowed it) by defining a "real" variable called __main__. Well, among consenting adults, why not? It might even be useful. Rob Cliffe From rosuav at gmail.com Mon Nov 25 16:15:36 2013 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 26 Nov 2013 02:15:36 +1100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <20131124222607.271fdc46@fsol> Message-ID: On Tue, Nov 26, 2013 at 1:55 AM, Ron Adam wrote: > But when there is only *one* of something, and you aren't manipulating the > parts, then "is" should be ok. __name__ would be bound directly to > __main__'s string object, so "is" or "==" should always work for that case. Really? Is this something that's guaranteed? I know the CPython compiler will optimize a lot of constants, but can you really be sure that this is safe? ChrisA From ron3200 at gmail.com Mon Nov 25 16:35:46 2013 From: ron3200 at gmail.com (Ron Adam) Date: Mon, 25 Nov 2013 09:35:46 -0600 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <20131124222607.271fdc46@fsol> Message-ID: On 11/25/2013 09:15 AM, Chris Angelico wrote: > On Tue, Nov 26, 2013 at 1:55 AM, Ron Adam wrote: >> But when there is only *one* of something, and you aren't manipulating the >> parts, then "is" should be ok. __name__ would be bound directly to >> __main__'s string object, so "is" or "==" should always work for that case. > > Really? Is this something that's guaranteed? I know the CPython > compiler will optimize a lot of constants, but can you really be sure > that this is safe? It seems to me, that you can't do this ... if __main__ is "__main__": That depends on the compiler to optimise that case so they are the same object. But in this case... __main__ = "__main__" __name__ = __main__ assert __main__ is __name__ That should always work. I would think it was a bug if it didn't. Cheers, Ron From guido at python.org Mon Nov 25 16:43:46 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 25 Nov 2013 07:43:46 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> Message-ID: I still don't like the decorator nor the magic function name. They don't offer the same functionality. 
-- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Mon Nov 25 16:56:22 2013 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 26 Nov 2013 02:56:22 +1100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <20131124222607.271fdc46@fsol> Message-ID: On Tue, Nov 26, 2013 at 2:35 AM, Ron Adam wrote: > But in this case... > > __main__ = "__main__" > __name__ = __main__ > assert __main__ is __name__ > > That should always work. I would think it was a bug if it didn't. Of course, that will work. But this isn't guaranteed to: __main__ = "__main__" __name__ = "__main__" assert __main__ is __name__ It quite possibly will, but it's not guaranteed by the language. And this is more what's being done here - a separate literal. ChrisA From solipsis at pitrou.net Mon Nov 25 17:05:18 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 25 Nov 2013 17:05:18 +0100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> Message-ID: <20131125170518.4700e664@fsol> On Tue, 26 Nov 2013 01:12:21 +1100 Steven D'Aprano wrote: > > (1) Keep the status quo. +1. The status quo has another benefit: it teaches beginners about __name__ and, with it, the fact that many Python objects have useful introspection data. > (3) Add an is_main() function to simplify the idiom to: > > if is_main(): > ... > > Pros: > > - the magic of deciding whether we're running in the main module > is hidden behind an abstraction layer; > > - even more easily understood than the current "if __name__" idiom; > > - easily backported. > > Cons: > > - one more built-in. Other con: the is_main() implementation would have to be hackish (use of sys._getframe() or a similar trick). I'm personally -0.5. > (5) A decorator-based solution. -1. Much too complicated for a simple need. Regards Antoine. From steve at pearwood.info Mon Nov 25 17:26:51 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 26 Nov 2013 03:26:51 +1100 Subject: [Python-ideas] OT p [was Re: Dart-like method cascading operator in Python] In-Reply-To: References: <0CEA25DD-DF71-45CD-B9B7-EB5A5BE0BE0D@masklinn.net> Message-ID: <20131125162651.GG2085@ando> On Thu, Nov 21, 2013 at 09:48:56AM -0800, Haoyi Li wrote: > - You also better make sure you didn't have a variable somewhere else > called `p` for Pressure in one of your equations which you just stomped > over, or `p` for Momentum, or Price, or Probability. It's not inconceivable > that you'd want to plot these things! 
I stumbled across this and thought it was relevent and amusing: http://www.johndcook.com/blog/2008/02/23/everything-begins-with-p/ -- Steven From abarnert at yahoo.com Mon Nov 25 18:14:31 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 25 Nov 2013 09:14:31 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131125141220.GE2085@ando> References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> Message-ID: On Nov 25, 2013, at 6:12, Steven D'Aprano wrote: > To understand the current "if __name__" idiom, the user only needs to > understand the if statement and the rule that Python defines a special > global variable __name__. To understand this run_if_main idiom, the user > has to understand decorators. It's hard to believe, but I've come across > people -- even quite experienced Python developers -- who avoid > decorators because they don't quite get how they work. And of course > beginners have no idea what a decorator is. These means explaining about > higher-order functions. No, it really doesn't. In order to write a decorator, or read it's code, you have to understand HOFs. But to use a decorator? I've seen many people successfully using @property, @classmethod, etc. without even knowing that they're functions. (Never mind that most of them have no need for those decorators, the point is that they have no problem figuring out how to use them.) I don't like the decorator idea either, but I think this is the wrong argument. Guido's points that it's too magical, and ends up providing less power with that magic than you already have without, seem better. But Steven's bikesheddability argument (right next to the "have to understand decorators" argument) is what really sells me that this is the wrong answer. All those options means that people will have to learn or look up which option Python chose--and, as usual, many novices won't even realize there was an option to choose from, assume wrong, and have to ask someone "Why does my code after the @is_main function never get run, while the same code after an old-style if __name__ block does?" because they don't realize that the main function's result is passed to exit because they never even realized such a thing was possible. From abarnert at yahoo.com Mon Nov 25 18:27:18 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 25 Nov 2013 09:27:18 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131125170518.4700e664@fsol> References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125170518.4700e664@fsol> Message-ID: On Nov 25, 2013, at 8:05, Antoine Pitrou wrote: > On Tue, 26 Nov 2013 01:12:21 +1100 > Steven D'Aprano wrote: >> >> (1) Keep the status quo. > > +1. > > The status quo has another benefit: it teaches beginners about __name__ > and, with it, the fact that many Python objects have useful > introspection data. This is a great point. In fact, I even took advantage of this fact a few days ago, explaining the __self__ attribute on (bound) methods by saying "Almost everything in Python has useful attributes, like the __name__ attribute on modules that you use every day", so I'm not sure why I didn't notice this benefit until you pointed it out... But I'm glad you did. >> (3) Add an is_main() function to simplify the idiom to: >> >> if is_main(): >> ... 
>> >> Pros: >> >> - the magic of deciding whether we're running in the main module >> is hidden behind an abstraction layer; >> >> - even more easily understood than the current "if __name__" idiom; >> >> - easily backported. >> >> Cons: >> >> - one more built-in. > > Other con: the is_main() implementation would have to be hackish (use > of sys._getframe() or a similar trick). Agreed. If Python had a (non-hacky) way to declare a variable as coming from the caller's namespace instead of the defining namespace this would be a great answer. But Python doesn't (and probably shouldn't) have any such feature, and a function that everyone learns as a novice that implies such a feature exists is probably a bad idea. > I'm personally -0.5. > >> (5) A decorator-based solution. > > -1. Much too complicated for a simple need. > > Regards > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas From barry at python.org Mon Nov 25 20:42:44 2013 From: barry at python.org (Barry Warsaw) Date: Mon, 25 Nov 2013 14:42:44 -0500 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> Message-ID: <20131125144244.1cb160f0@anarchist> On Nov 26, 2013, at 01:12 AM, Steven D'Aprano wrote: >(1) Keep the status quo. I'd have no problem with this. The current idiom doesn't seem broken to me, nor is it that hard to type. I also don't think it's very hard to discover given how common it is. >if __main__: If we *had* to make this easier to type, this would be my choice. It doesn't even have to be read-only, given that __name__ can be messed with, but usually isn't. Why then worry about __main__ getting messed with? -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From guido at python.org Mon Nov 25 20:50:15 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 25 Nov 2013 11:50:15 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131125144244.1cb160f0@anarchist> References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> Message-ID: On Mon, Nov 25, 2013 at 11:42 AM, Barry Warsaw wrote: > On Nov 26, 2013, at 01:12 AM, Steven D'Aprano wrote: > > >(1) Keep the status quo. > > I'd have no problem with this. The current idiom doesn't seem broken to > me, > nor is it that hard to type. I also don't think it's very hard to discover > given how common it is. > All agreed, yet it is not that easy to type either (underscores and quotes require the shift key). Perhaps more important, it causes everyone who sees it first to wonder why the idiom isn't simpler. > >if __main__: > > If we *had* to make this easier to type, this would be my choice. It > doesn't > even have to be read-only, given that __name__ can be messed with, but > usually > isn't. Why then worry about __main__ getting messed with? > The problem with this is, how would you implement this? You can either make __main__ a builtin object with a magic __bool__() method, or you can plunk a bool named __main__ in every module namespace. I don't much like either. 
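For concreteness, the first of those options could be sketched roughly like this; it is purely illustrative, nothing of the sort exists in builtins, and it leans on the same frame inspection already criticised as hackish:

    import sys

    class _MainFlag:
        # Hypothetical builtin ``__main__`` whose truth value depends on
        # where it is evaluated.
        def __bool__(self):
            return sys._getframe(1).f_globals.get("__name__") == "__main__"

    __main__ = _MainFlag()   # imagine this provided by the builtins module

    if __main__:
        print("running as a script")
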
-- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Mon Nov 25 22:10:27 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 26 Nov 2013 07:10:27 +1000 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> Message-ID: On 26 Nov 2013 05:51, "Guido van Rossum" wrote: > > On Mon, Nov 25, 2013 at 11:42 AM, Barry Warsaw wrote: >> >> On Nov 26, 2013, at 01:12 AM, Steven D'Aprano wrote: >> >> >(1) Keep the status quo. >> >> I'd have no problem with this. The current idiom doesn't seem broken to me, >> nor is it that hard to type. I also don't think it's very hard to discover >> given how common it is. > > > All agreed, yet it is not that easy to type either (underscores and quotes require the shift key). Perhaps more important, it causes everyone who sees it first to wonder why the idiom isn't simpler. > >> >> >if __main__: >> >> If we *had* to make this easier to type, this would be my choice. It doesn't >> even have to be read-only, given that __name__ can be messed with, but usually >> isn't. Why then worry about __main__ getting messed with? > > > The problem with this is, how would you implement this? You can either make __main__ a builtin object with a magic __bool__() method, or you can plunk a bool named __main__ in every module namespace. I don't much like either. Shadowing would allow this to work without magic and with changes to only two modules: you could have a builtin __main__ that was always False and then set a __main__=True attribute in the main module. This still teaches the lesson about runtime introspection *and* could be used to teach an important lesson about name shadowing. Like Barry, this is my favourite of the suggestions so far, but I'm still not sure it's worth the hassle: - any such code would automatically be 3.5+ only (failing with NameError on earlier versions) - anything that emulates __main__ execution in a namespace other than the interpreter provided one would also need to be updated to set the new attribute - it adds "Why is __name__ wrong in __main__?" to the list of historical relics that can only be explained in terms of "the modern alternative didn't always exist" Potentially still a net win, though - call it +0 from me. Cheers, Nick. > > -- > --Guido van Rossum (python.org/~guido) > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Nov 25 22:22:53 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 25 Nov 2013 13:22:53 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> Message-ID: On Mon, Nov 25, 2013 at 1:10 PM, Nick Coghlan wrote: > > On 26 Nov 2013 05:51, "Guido van Rossum" wrote: > > > > On Mon, Nov 25, 2013 at 11:42 AM, Barry Warsaw wrote: > >> > >> On Nov 26, 2013, at 01:12 AM, Steven D'Aprano wrote: > >> > >> >(1) Keep the status quo. > >> > >> I'd have no problem with this. 
The current idiom doesn't seem broken > to me, > >> nor is it that hard to type. I also don't think it's very hard to > discover > >> given how common it is. > > > > > > All agreed, yet it is not that easy to type either (underscores and > quotes require the shift key). Perhaps more important, it causes everyone > who sees it first to wonder why the idiom isn't simpler. > > > >> > >> >if __main__: > >> > >> If we *had* to make this easier to type, this would be my choice. It > doesn't > >> even have to be read-only, given that __name__ can be messed with, but > usually > >> isn't. Why then worry about __main__ getting messed with? > > > > > > The problem with this is, how would you implement this? You can either > make __main__ a builtin object with a magic __bool__() method, or you can > plunk a bool named __main__ in every module namespace. I don't much like > either. > > Shadowing would allow this to work without magic and with changes to only > two modules: you could have a builtin __main__ that was always False and > then set a __main__=True attribute in the main module. > > This still teaches the lesson about runtime introspection *and* could be > used to teach an important lesson about name shadowing. > > Like Barry, this is my favourite of the suggestions so far, but I'm still > not sure it's worth the hassle: > > - any such code would automatically be 3.5+ only (failing with NameError > on earlier versions) > - anything that emulates __main__ execution in a namespace other than the > interpreter provided one would also need to be updated to set the new > attribute > - it adds "Why is __name__ wrong in __main__?" to the list of historical > relics that can only be explained in terms of "the modern alternative > didn't always exist" > > Potentially still a net win, though - call it +0 from me. > Still only -0 from me, mostly because of the first two of your items, and because it just replaces one kind of magic with another. But mostly I don't see why it has to involve a __dunder__ name (other reminding the reader of the old idiom). The reason for using __dunder__ style for __name__ and '__main__' was clear: they impose on namespaces that are nominally the user's (__name__ is a global variable, the value '__main__' is a module name, we don't want to interfere with a user-defined global variable named 'name' nor with a user-defined module named 'main'). But we don't need the same kind of caution for builtins (if the use defines a variable is_builtin that variable wins). (I don't get the point against my is_main() proposal that it also uses magic. It's a builtin. *Of course* it is allowed to use magic.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Mon Nov 25 22:44:09 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 25 Nov 2013 22:44:09 +0100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> Message-ID: <20131125224409.31109184@fsol> On Mon, 25 Nov 2013 13:22:53 -0800 Guido van Rossum wrote: > > (I don't get the point against my is_main() proposal that it also uses > magic. It's a builtin. *Of course* it is allowed to use magic.) Just because it is allowed to use magic doesn't mean it's a good idea, though. 
I can imagine people struggling to understand how it works (and how they can replicate it), while the current idiom is very easy to understand. I still don't think the current idiom is problematic, so I'm -0.8 on the whole thing :-) Regards Antoine. From guido at python.org Mon Nov 25 22:57:57 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 25 Nov 2013 13:57:57 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131125224409.31109184@fsol> References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125224409.31109184@fsol> Message-ID: On Mon, Nov 25, 2013 at 1:44 PM, Antoine Pitrou wrote: > On Mon, 25 Nov 2013 13:22:53 -0800 > Guido van Rossum wrote: > > > > (I don't get the point against my is_main() proposal that it also uses > > magic. It's a builtin. *Of course* it is allowed to use magic.) > > Just because it is allowed to use magic doesn't mean it's a good idea, > though. Really? Many builtins do huge amounts of magic. > I can imagine people struggling to understand how it works > (and how they can replicate it), while the current idiom is very easy to > understand. > Hm. I think most people who wonder how it works would learn something from figuring it out (like I did as a child disassembling my mother's vacuum cleaner :-). But most everyone else wonders why this is such a strange idiom. > > I still don't think the current idiom is problematic, so I'm -0.8 on > the whole thing :-) > Check the length of this thread. (And the ones before it. :-) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ron3200 at gmail.com Mon Nov 25 23:12:10 2013 From: ron3200 at gmail.com (Ron Adam) Date: Mon, 25 Nov 2013 16:12:10 -0600 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <20131124222607.271fdc46@fsol> Message-ID: On 11/25/2013 09:56 AM, Chris Angelico wrote: > On Tue, Nov 26, 2013 at 2:35 AM, Ron Adam wrote: >> >But in this case... >> > >> > __main__ = "__main__" >> > __name__ = __main__ >> > assert __main__ is __name__ >> > >> >That should always work. I would think it was a bug if it didn't. > Of course, that will work. But this isn't guaranteed to: > > __main__ = "__main__" > __name__ = "__main__" > assert __main__ is __name__ > > It quite possibly will, but it's not guaranteed by the language. And > this is more what's being done here - a separate literal. Are thinking that the __main__ global would be defined after the main module is loaded? I was thinking that __main__ would be set to "__main__" first, then when the main module is loaded, it's __name__ attribute set to __main__ rather than "__main__". Exactly as the first example above. Or are you thinking there may be more than one main module? If you just feel strongly that using 'is' is a bad practice in this case, I'm fine with that. 
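For the record, the reason people are wary of 'is' here is that two equal string values are only guaranteed to be the same object when they really are bound to the same object; a string with the same value built at run time need not be:

    a = "__main__"
    b = "".join(["__", "main", "__"])   # equal value, built at run time
    print(a == b)    # True, always
    print(a is b)    # not guaranteed; typically False in CPython
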
Cheers, Ron From solipsis at pitrou.net Mon Nov 25 23:14:22 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 25 Nov 2013 23:14:22 +0100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125224409.31109184@fsol> Message-ID: <20131125231422.7dd551c5@fsol> On Mon, 25 Nov 2013 13:57:57 -0800 Guido van Rossum wrote: > > I still don't think the current idiom is problematic, so I'm -0.8 on > > the whole thing :-) > > Check the length of this thread. (And the ones before it. :-) Well, the #1 complaint seems be that it's not terribly pretty. That doesn't sound like a huge problem to me :-) Am I missing something? (I also understand that it relies on a more-or-less implementation detail, but the implementation detail will have to stay for compatibility reasons, anyway) Regards Antoine. From barry at python.org Mon Nov 25 23:16:03 2013 From: barry at python.org (Barry Warsaw) Date: Mon, 25 Nov 2013 17:16:03 -0500 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> Message-ID: <20131125171603.1812e9fd@anarchist> On Nov 25, 2013, at 01:22 PM, Guido van Rossum wrote: >Still only -0 from me, mostly because of the first two of your items, and >because it just replaces one kind of magic with another. Just to continue playing along (since I have no problem with the current idiom), I do like Nick's shadowing idea. I guess what I don't like about is_main() is that it's a function call, and is two words separated by an underscore. I have no technical arguments against it, just that to me it doesn't look as pretty. And also, I guess, a function call seems a little more magical than checking an attribute value. What does the function *do*? OTOH, I guess a shadowed builtin variable is a little magical too, but maybe a touch more transparent magic. ;) >But mostly I don't see why it has to involve a __dunder__ name (other >reminding the reader of the old idiom). The reason for using __dunder__ >style for __name__ and '__main__' was clear: they impose on namespaces that >are nominally the user's (__name__ is a global variable, the value >'__main__' is a module name, we don't want to interfere with a user-defined >global variable named 'name' nor with a user-defined module named 'main'). >But we don't need the same kind of caution for builtins (if the use defines >a variable is_builtin that variable wins). I don't think we could use a builtin called "main", because you'd get some confusions like this: def main(): # blah blah if main: main() OTOH, maybe that's actually kind of cute. It means a builtin `main` could be False and no implicit shadowing would be necessary. I guess some folks don't like to call their main function main() so maybe that wouldn't work for them, but it's not like the old idiom would go away. Except: you surely could have main() functions in other modules which aren't run as scripts, so that would break. Okay, never mind. But I do think that this means any magic builtin would have to be dundered. Even is_main() could be a legitimate name for a function in an existing module. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From elazarg at gmail.com Mon Nov 25 23:25:56 2013 From: elazarg at gmail.com (=?UTF-8?B?15DXnNei15bXqA==?=) Date: Tue, 26 Nov 2013 00:25:56 +0200 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> Message-ID: 2013/11/24 Guido van Rossum > > if is_main(): > > How about going the other way around? if imported(): break Most of the time, people put the main stuff at the end of the script, so this check can serve as a seperator, equivalent to what is sometimes marked with '***********************' - snip the script here. Of course, it is still possible to do if not imported(): main() I think it is at least as obvious as is_main(), and even more so for those without a C-like background. (Personally I'd prefer an `imported` magic variable but I guess that's out of question). Elazar -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Mon Nov 25 23:26:27 2013 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 26 Nov 2013 09:26:27 +1100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <20131124222607.271fdc46@fsol> Message-ID: On Tue, Nov 26, 2013 at 9:12 AM, Ron Adam wrote: > I was thinking that __main__ would be set to "__main__" first, then when the > main module is loaded, it's __name__ attribute set to __main__ rather than > "__main__". Exactly as the first example above. I thought you were describing assigning a particular string to __name__, which is where the problem would come from. If it's being set to "whatever's in __main__", then yes, your description is correct and it's safe. OTOH, what you now have is: # standard idiom, everyone knows this works if __name__ is __main__: # use of actual thing: it's a string print(__name__) # everywhere else, this is the wrong thing to do: if input() is "yes": do_dangerous_stuff() So now you have to explain why it's right to use 'is', but only with these things, which are magical. Possibly the easiest would be to guarantee that __main__ and __name__ are sys.intern()'d, which could then lead into an explanation of interning rather than an explanation of magic. ChrisA From p.f.moore at gmail.com Mon Nov 25 23:28:32 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Mon, 25 Nov 2013 22:28:32 +0000 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131125224409.31109184@fsol> References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125224409.31109184@fsol> Message-ID: On 25 November 2013 21:44, Antoine Pitrou wrote: > On Mon, 25 Nov 2013 13:22:53 -0800 > Guido van Rossum wrote: >> >> (I don't get the point against my is_main() proposal that it also uses >> magic. It's a builtin. *Of course* it is allowed to use magic.) > > Just because it is allowed to use magic doesn't mean it's a good idea, > though. I can imagine people struggling to understand how it works > (and how they can replicate it), while the current idiom is very easy to > understand. 
> > I still don't think the current idiom is problematic, so I'm -0.8 on > the whole thing :-) Also, the current idiom is in use in an immense amount of documentation and existing code. That's not going away, so people now have *two* ways of saying the same thing. And to be honest, I suspect many of the "old hands" will simply ignore the new idiom and continue using (and teaching others, by example if nothing else) the old one - so it's not even as if the old idiom is going to disappear. If we were starting from scratch, is_main() might be a reasonable proposal, but I don't see the value given that we're not. I'm definitely at least -0.5. And I can't see myself ever using the new style even if it's implemented. Paul. PS This is assuming that nobody is suggesting breaking "if __name__ == '__main__'". If they are, then that's a whole different debate. From guido at python.org Mon Nov 25 23:29:25 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 25 Nov 2013 14:29:25 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131125171603.1812e9fd@anarchist> References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125171603.1812e9fd@anarchist> Message-ID: On Mon, Nov 25, 2013 at 2:16 PM, Barry Warsaw wrote: > On Nov 25, 2013, at 01:22 PM, Guido van Rossum wrote: > > >Still only -0 from me, mostly because of the first two of your items, and > >because it just replaces one kind of magic with another. > > Just to continue playing along (since I have no problem with the current > idiom), I do like Nick's shadowing idea. > > I guess what I don't like about is_main() is that it's a function call, > and is > two words separated by an underscore. I have no technical arguments > against > it, just that to me it doesn't look as pretty. And also, I guess, a > function > call seems a little more magical than checking an attribute value. What > does > the function *do*? OTOH, I guess a shadowed builtin variable is a little > magical too, but maybe a touch more transparent magic. ;) > For all I care you can call it ismain(). But it should be a function so that it's clear that the same function can return a different value in different contexts. Variables that have different values in different contexts should be set by the user (technically they'd be different variables, even if they have the same name). For system stuff that varies by context, we use functions, since a function can dynamically look at the context. (For example, globals() and locals().) > > >But mostly I don't see why it has to involve a __dunder__ name (other > >reminding the reader of the old idiom). The reason for using __dunder__ > >style for __name__ and '__main__' was clear: they impose on namespaces > that > >are nominally the user's (__name__ is a global variable, the value > >'__main__' is a module name, we don't want to interfere with a > user-defined > >global variable named 'name' nor with a user-defined module named 'main'). > >But we don't need the same kind of caution for builtins (if the use > defines > >a variable is_builtin that variable wins). > > I don't think we could use a builtin called "main", because you'd get some > confusions like this: > > def main(): > # blah blah > > if main: > main() > > OTOH, maybe that's actually kind of cute. It means a builtin `main` could > be > False and no implicit shadowing would be necessary. 
I guess some folks > don't > like to call their main function main() so maybe that wouldn't work for > them, > but it's not like the old idiom would go away. > > Except: you surely could have main() functions in other modules which > aren't > run as scripts, so that would break. Okay, never mind. > > But I do think that this means any magic builtin would have to be dundered. > Even is_main() could be a legitimate name for a function in an existing > module. > That seems a logic mistake. By that reasoning we would have to be as careful with new builtins as we are with new keywords. But that's not the case. Adding is_main() to builtins is not going to break code that currently defines a global named 'is_main'. Sure, that code cannot also use the new builtin, but that's expected. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Nov 25 23:31:08 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 25 Nov 2013 14:31:08 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125224409.31109184@fsol> Message-ID: On Mon, Nov 25, 2013 at 2:28 PM, Paul Moore wrote: > PS This is assuming that nobody is suggesting breaking "if __name__ == > '__main__'". If they are, then that's a whole different debate. > My proposal is to define the builtin function as comparing __name__ to '__main__', so the old idiom will still work. It will just eventually start looking out of date. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ericsnowcurrently at gmail.com Mon Nov 25 23:31:33 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 25 Nov 2013 15:31:33 -0700 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131125231422.7dd551c5@fsol> References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125224409.31109184@fsol> <20131125231422.7dd551c5@fsol> Message-ID: On Mon, Nov 25, 2013 at 3:14 PM, Antoine Pitrou wrote: > (I also understand that it relies on a more-or-less implementation > detail, but the implementation detail will have to stay for > compatibility reasons, anyway) What implementation detail are you talking about? -eric From barry at python.org Mon Nov 25 23:39:03 2013 From: barry at python.org (Barry Warsaw) Date: Mon, 25 Nov 2013 17:39:03 -0500 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125171603.1812e9fd@anarchist> Message-ID: <20131125173903.4fa90223@anarchist> On Nov 25, 2013, at 02:29 PM, Guido van Rossum wrote: >For all I care you can call it ismain(). Okay, I think I'm going to officially not care now. :) None of these suggestions seem worth the effort to indoctrinate folks to some new idiom, regardless of how it's spelled. -Barry -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From ericsnowcurrently at gmail.com Mon Nov 25 23:50:25 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 25 Nov 2013 15:50:25 -0700 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131125173903.4fa90223@anarchist> References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125171603.1812e9fd@anarchist> <20131125173903.4fa90223@anarchist> Message-ID: On Mon, Nov 25, 2013 at 3:39 PM, Barry Warsaw wrote: > On Nov 25, 2013, at 02:29 PM, Guido van Rossum wrote: > >>For all I care you can call it ismain(). > > Okay, I think I'm going to officially not care now. :) None of these > suggestions seem worth the effort to indoctrinate folks to some new idiom, > regardless of how it's spelled. +1 -eric From guido at python.org Mon Nov 25 23:50:51 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 25 Nov 2013 14:50:51 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131125173903.4fa90223@anarchist> References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125171603.1812e9fd@anarchist> <20131125173903.4fa90223@anarchist> Message-ID: On Mon, Nov 25, 2013 at 2:39 PM, Barry Warsaw wrote: > On Nov 25, 2013, at 02:29 PM, Guido van Rossum wrote: > > >For all I care you can call it ismain(). > > Okay, I think I'm going to officially not care now. :) None of these > suggestions seem worth the effort to indoctrinate folks to some new idiom, > regardless of how it's spelled. > I mostly agree -- but if people insist on a better idiom, a builtin function is the only one I can live with. I don't particularly care what that builtin function is called, as long as it is not a __dunder__ name. And it must be a function. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Mon Nov 25 23:17:53 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 25 Nov 2013 14:17:53 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> <20131124222607.271fdc46@fsol> Message-ID: <5293CC91.8000005@stoneleaf.us> On 11/25/2013 02:12 PM, Ron Adam wrote: > On 11/25/2013 09:56 AM, Chris Angelico wrote: >> On Tue, Nov 26, 2013 at 2:35 AM, Ron Adam wrote: >>> But in this case... >>> >>> __main__ = "__main__" >>> __name__ = __main__ >>> assert __main__ is __name__ >>> >>> That should always work. I would think it was a bug if it didn't. >> >> Of course, that will work. But this isn't guaranteed to: >> >> __main__ = "__main__" >> __name__ = "__main__" >> assert __main__ is __name__ >> >> It quite possibly will, but it's not guaranteed by the language. And >> this is more what's being done here - a separate literal. > > Are thinking that the __main__ global would be defined after the main module is loaded? > > I was thinking that __main__ would be set to "__main__" first, then when the main module is loaded, it's __name__ > attribute set to __main__ rather than "__main__". Exactly as the first example above. > > Or are you thinking there may be more than one main module? 
> > If you just feel strongly that using 'is' is a bad practice in this case, I'm fine with that. Yes, it is bad practice to use `is` this way. The only time `is` should be used is when you need to know that two names/references are referring to the exact same object, which is definitely not the case here. -- ~Ethan~ From ben+python at benfinney.id.au Tue Nov 26 00:07:19 2013 From: ben+python at benfinney.id.au (Ben Finney) Date: Tue, 26 Nov 2013 10:07:19 +1100 Subject: [Python-ideas] making a module callable References: <528D7287.7040305@stoneleaf.us> <528FE705.10506@canterbury.ac.nz> <20131123042100.GR2085@ando> <7wiovite8e.fsf@benfinney.id.au> <5291C4BA.2020606@canterbury.ac.nz> <7weh65trtc.fsf@benfinney.id.au> <52927FB3.9030703@canterbury.ac.nz> <7w38mlt9nw.fsf@benfinney.id.au> Message-ID: <7wtxf0rsgo.fsf@benfinney.id.au> Michael Foord writes: > On 25 November 2013 03:58, Ben Finney wrote: > > > Greg Ewing writes: > > > > > Ben Finney wrote: > > > > It's more convenient to look in ?the sequence of command-line > > > > parameters? for all the command-line parameters, without > > > > special-casing the command name. > > > > > > But you hardly ever want to process argv[0] the same way as the rest > > > of the arguments, so you end up treating it as a special case anyway. > > > > This isn't about *processing* that argument; it's about *receiving* it > > in the first place to the function. Having it omitted by default means > > there's a special case just to *get at* the first command-line > > argument:: > > > > > The name of the script is not an argument *to* the script. Having it there > in the first place is the special case, not removing it. No, the command name is part of the command line arguments, as I pointed out earlier. That is, when any program is started with:: foo bar baz the command line arguments are ?foo?, ?bar?, ?baz?. That says nothing about what those arguments *mean* yet; they're all available to the program, for it to figure out their significance. > It's only an old C convention (and now an old Python convention) that > makes you think it is. Not at all. Giving the command-line arguments to the program as a sequence of strings is a standard cross-language interface; the operating system hands the command line to the running program without caring what language it was implemented in. This is all prior to talking about what those arguments *mean*; the discussion of ??foo? is a command, ?bar? and ?baz? are its parameters? all comes afterward and is irrelevant to the process of *getting at* the command line. If the program's main code wants to discard the first argument, as many programs do for good reasons, that's up to the programmer to decide explicitly. Many other programs make use of the whole command line, and ty should not need some different way to get at the contents of the command line. If we're going to make the command line sequence a parameter to the main code, there should be one interface, no special cases. As it stands, that conventional interface in Python code is:: def main(argv=None): if argv is None: argv = sys.argv and then it's up to the rest of the ?main? function (or however it's spelled) to process the full command line sequence that was received. So, while the name ?argv? is a C convention, the handling of the command line as a homogeneous sequence of strings is language-agnostic. 
Automatically discarding the first argument, on the assumption that the program doesn't care about it, is making a false assumption in many cases and makes a common use case needlessly difficult. -- \ "We must find our way to a time when faith, without evidence, | `\ disgraces anyone who would claim it." --Sam Harris, _The End of | _o__) Faith_, 2004 | Ben Finney From tjreedy at udel.edu Tue Nov 26 00:37:14 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Mon, 25 Nov 2013 18:37:14 -0500 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125224409.31109184@fsol> <20131125231422.7dd551c5@fsol> Message-ID: On 11/25/2013 5:31 PM, Eric Snow wrote: > On Mon, Nov 25, 2013 at 3:14 PM, Antoine Pitrou wrote: >> (I also understand that it relies on a more-or-less implementation >> detail, but the implementation detail will have to stay for >> compatibility reasons, anyway) > > What implementation detail are you talking about? The fact that .__name__ is set to "__main__" in the main module. -- Terry Jan Reedy From oscar.j.benjamin at gmail.com Tue Nov 26 00:37:21 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Mon, 25 Nov 2013 23:37:21 +0000 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125171603.1812e9fd@anarchist> Message-ID: On 25 November 2013 22:29, Guido van Rossum wrote: >> I guess what I don't like about is_main() is that it's a function call, >> and is >> two words separated by an underscore. I have no technical arguments >> against >> it, just that to me it doesn't look as pretty. And also, I guess, a >> function >> call seems a little more magical than checking an attribute value. What >> does >> the function *do*? OTOH, I guess a shadowed builtin variable is a little >> magical too, but maybe a touch more transparent magic. ;) > > > For all I care you can call it ismain(). > > But it should be a function so that it's clear that the same function can > return a different value in different contexts. How is that clear? That's precisely what functions don't normally do (in Python, maths, other programming languages...). There already seems to be confusion around the magical super() function precisely because it does different things in different contexts and this is unexpected of something that uses function call syntax. Oscar From greg.ewing at canterbury.ac.nz Tue Nov 26 00:43:25 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 26 Nov 2013 12:43:25 +1300 Subject: [Python-ideas] making a module callable In-Reply-To: <7wtxf0rsgo.fsf@benfinney.id.au> References: <528D7287.7040305@stoneleaf.us> <528FE705.10506@canterbury.ac.nz> <20131123042100.GR2085@ando> <7wiovite8e.fsf@benfinney.id.au> <5291C4BA.2020606@canterbury.ac.nz> <7weh65trtc.fsf@benfinney.id.au> <52927FB3.9030703@canterbury.ac.nz> <7w38mlt9nw.fsf@benfinney.id.au> <7wtxf0rsgo.fsf@benfinney.id.au> Message-ID: <5293E09D.9030808@canterbury.ac.nz> Ben Finney wrote: > Automatically discarding the first argument, on the assumption that the > program doesn't care about it, is making a false assumption in many > cases and makes a common use case needlessly difficult.
If you're talking about doing different things based on argv[0], I wouldn't call it a *common* use case. The last time I saw it done was on an early version of SunOS that didn't have shared libraries, so they linked all the gui tools into one big executable to reduce disk and memory usage. Now that we have shared libraries, there's much less need for that kind of trick. -- Greg From ericsnowcurrently at gmail.com Tue Nov 26 00:51:36 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 25 Nov 2013 16:51:36 -0700 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125224409.31109184@fsol> <20131125231422.7dd551c5@fsol> Message-ID: On Mon, Nov 25, 2013 at 4:37 PM, Terry Reedy wrote: > On 11/25/2013 5:31 PM, Eric Snow wrote: >> >> On Mon, Nov 25, 2013 at 3:14 PM, Antoine Pitrou >> wrote: >>> >>> (I also understand that it relies on a more-or-less implementation >>> detail, but the implementation detail will have to stay for >>> compatibility reasons, anyway) >> >> >> What implementation detail are you talking about? > > > The fact that .__name__ is set to "__main__" in the main module. I don't see why that's an implementation detail. The language reference specifies that scripts are run in the namespace of the main module and that it's named "__main__". [1][2] -eric [1] http://docs.python.org/3.4/reference/executionmodel.html#naming-and-binding [2] http://docs.python.org/3.4/reference/toplevel_components.html#complete-python-programs From rosuav at gmail.com Tue Nov 26 00:54:28 2013 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 26 Nov 2013 10:54:28 +1100 Subject: [Python-ideas] making a module callable In-Reply-To: <5293E09D.9030808@canterbury.ac.nz> References: <528D7287.7040305@stoneleaf.us> <528FE705.10506@canterbury.ac.nz> <20131123042100.GR2085@ando> <7wiovite8e.fsf@benfinney.id.au> <5291C4BA.2020606@canterbury.ac.nz> <7weh65trtc.fsf@benfinney.id.au> <52927FB3.9030703@canterbury.ac.nz> <7w38mlt9nw.fsf@benfinney.id.au> <7wtxf0rsgo.fsf@benfinney.id.au> <5293E09D.9030808@canterbury.ac.nz> Message-ID: On Tue, Nov 26, 2013 at 10:43 AM, Greg Ewing wrote: > Ben Finney wrote: >> >> Automatically discarding the first argument, on the assumption that the >> program doesn't care about it, is making a false assumption in many >> cases and makes a common use case needlessly difficult. > > > If you're talking about doing different things based on > argv[0], I wouldn't call it a *common* use case. The last > time I saw it done was on an early version of SunOS that > didn't have shared libraries, so they linked all the gui > tools into one big executable to reduce disk and memory > usage. Upstart has a set of symlinks to initctl called "start", "stop", "reload", etc. They're shortcuts for "initctl start", "initctl stop", etc. Also, I've often fetched up argv[0] as part of a usage message, which isn't strictly "doing different things", but it does mean that renaming the program won't leave an old name inside its help message. ChrisA From mal at egenix.com Tue Nov 26 00:55:34 2013 From: mal at egenix.com (M.-A. 
Lemburg) Date: Tue, 26 Nov 2013 00:55:34 +0100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <20131125173903.4fa90223@anarchist> References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125171603.1812e9fd@anarchist> <20131125173903.4fa90223@anarchist> Message-ID: <5293E376.20606@egenix.com> On 25.11.2013 23:39, Barry Warsaw wrote: > On Nov 25, 2013, at 02:29 PM, Guido van Rossum wrote: > >> For all I care you can call it ismain(). > > Okay, I think I'm going to officially not care now. :) None of these suggestions seem worth > the effort to indoctrinate folks to some new idiom, regardless of how it's spelled. +1 As long as "if __name__ == '__main__': ..." doesn't stop working, I don't care either :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 26 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, >>> mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From dreamingforward at gmail.com Tue Nov 26 00:59:20 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Mon, 25 Nov 2013 15:59:20 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125171603.1812e9fd@anarchist> Message-ID: The only other possibility not mentioned thus far is to have a main.py file and force python programs to start from it. markj From ethan at stoneleaf.us Tue Nov 26 00:46:54 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 25 Nov 2013 15:46:54 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125171603.1812e9fd@anarchist> Message-ID: <5293E16E.5050909@stoneleaf.us> On 11/25/2013 03:37 PM, Oscar Benjamin wrote: > On 25 November 2013 22:29, Guido van Rossum wrote: >>> I guess what I don't like about is_main() is that it's a function call, >>> and is >>> two words separated by an underscore. I have no technical arguments >>> against >>> it, just that to me it doesn't look as pretty. And also, I guess, a >>> function >>> call seems a little more magical than checking an attribute value. What >>> does >>> the function *do*? OTOH, I guess a shadowed builtin variable is a little >>> magical too, but maybe a touch more transparent magic. ;) >> >> >> For all I care you can call it ismain(). >> >> But it should be a function so that it's clear that the same function can >> return a different value in different contexts. > > How is that clear? That's precisely what functions don't normally do > (in Python, maths, other programming languages...). 
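As a rough sketch of the kind of context-dependent helper being debated (a hypothetical function, not an existing builtin; essentially what Jan spells out further down the thread), the same call answers differently depending on which module runs it:

import sys

def ismain(module_frame=None):
    # Hypothetical helper: does the *calling* module look like the main one?
    if module_frame is None:
        module_frame = sys._getframe(1)
    return module_frame.f_globals.get('__name__') == '__main__'

if ismain():
    print("running as a script")
else:
    print("imported as a module")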
Any function which deals with the outside world returns different values based on context (aka the real world): - date - time - disk - geo-location - locals() - globals() - vars() etcetera, etcetera, and so-forth. -- ~Ethan~ From ben+python at benfinney.id.au Tue Nov 26 01:46:08 2013 From: ben+python at benfinney.id.au (Ben Finney) Date: Tue, 26 Nov 2013 11:46:08 +1100 Subject: [Python-ideas] making a module callable References: <528D7287.7040305@stoneleaf.us> <528FE705.10506@canterbury.ac.nz> <20131123042100.GR2085@ando> <7wiovite8e.fsf@benfinney.id.au> <5291C4BA.2020606@canterbury.ac.nz> <7weh65trtc.fsf@benfinney.id.au> <52927FB3.9030703@canterbury.ac.nz> <7w38mlt9nw.fsf@benfinney.id.au> <7wtxf0rsgo.fsf@benfinney.id.au> <5293E09D.9030808@canterbury.ac.nz> Message-ID: <7wbo18rnvz.fsf@benfinney.id.au> Greg Ewing writes: > Ben Finney wrote: > > Automatically discarding the first argument, on the assumption that > > the program doesn't care about it, is making a false assumption in > > many cases and makes a common use case needlessly difficult. > > If you're talking about doing different things based on argv[0], I > wouldn't call it a *common* use case. The common use case I'm referring to is to use the program name in output messages (e.g. help, errors) without needing to change the code when the program file is renamed, or when a different command line is constructed by the caller. Doing different things based on how the program is invoke is another common use case, yes. Both of these use cases argue for retaining the full command line sequence (or whatever replacement command line sequence the caller chooses to construct) as input to the main code, and allow the main code to decide which parts are important. -- \ ?I am too firm in my consciousness of the marvelous to be ever | `\ fascinated by the mere supernatural ?? ?Joseph Conrad, _The | _o__) Shadow-Line_ | Ben Finney From zuo at chopin.edu.pl Tue Nov 26 01:50:25 2013 From: zuo at chopin.edu.pl (Jan Kaliszewski) Date: Tue, 26 Nov 2013 01:50:25 +0100 Subject: [Python-ideas] =?utf-8?b?UmVwbGFjaW5nIHRoZSBpZiBfX25hbWVfXyA9PSAi?= =?utf-8?q?=5F=5Fmain=5F=5F=22_idiom_=28was_Re=3A_making_a_module_callable?= =?utf-8?q?=29?= In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125171603.1812e9fd@anarchist> <20131125173903.4fa90223@anarchist> Message-ID: <1b618a340b3a9c8a426897bacb85a7e3@chopin.edu.pl> >> On Nov 25, 2013, at 02:29 PM, Guido van Rossum wrote: >> >> >For all I care you can call it ismain(). IMHO the ismain() function is (at least so far) the best option -- because: * it separates the "is it the main module" check from its implementation details (__name__ == '__main__') so in a far future (Py 4.0?) it may become be possible to change those details (e.g. 
if they do not fit well with the import machinery logic); * beautiful is better than ugly (and -- IMHO -- the current idiom is too ugly + inconvenient to type, especially as a common boilerplate...); * after allowing to pass in an optional stack frame explicitly, it could make possible (or at least far easier) to test script ("run-as-the-main=module") behaviour, e.g.: def test_my_module_as_script(self): _orig_ismain = builtins.ismain def _mocked_ismain(module_frame=None): if module_frame is None: module_frame = sys._getframe(1) name = module_frame.f_globals['__name__'] if name == 'my_module': return True return _orig_ismain(module_frame) with unittest.mock.patch('bultins.ismain', _mocked_ismain): sys.modules.pop('my_module') import my_module * the feature could be easily backported to older Python versions == as a 3rd party module or just not so complex boilerplate code: # import_me_anywhere_before_using_the_ismain_function.py: import builtins, sys def ismain(module_frame=None): if module_frame is None: module_frame = sys._getframe(1) return module_frame.f_globals['__name__'] == '__main__' builtins.ismain = ismain Cheers. *j From python at mrabarnett.plus.com Tue Nov 26 01:59:40 2013 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 26 Nov 2013 00:59:40 +0000 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <528FED2D.6090507@btinternet.com> <20131123040608.GN2085@ando> <20131124003720.GU2085@ando> <529173F8.4070006@canterbury.ac.nz> Message-ID: <5293F27C.7010809@mrabarnett.plus.com> On 25/11/2013 22:25, ????? wrote: > > > > 2013/11/24 Guido van Rossum > > > > > if is_main(): > > > > > > How about going the other way around? > > if imported(): > break > > > > Most of the time, people put the main stuff at the end of the script, so > this check can serve as a seperator, equivalent to what is sometimes > marked with '***********************' - snip the script here. > > Of course, it is still possible to do > > if not imported(): > main() > > I think it is at least as obvious as is_main(), and even more so for > those without a C-like background. > > (Personally I'd prefer an `imported` magic variable but I guess that's > out of question). > Instead of "imported", how about "import"? That already exists as a reserved word and its use outside an import statement is currently illegal: if not import: main() From stephen at xemacs.org Tue Nov 26 02:32:29 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 26 Nov 2013 10:32:29 +0900 Subject: [Python-ideas] making a module callable In-Reply-To: <5293E09D.9030808@canterbury.ac.nz> References: <528D7287.7040305@stoneleaf.us> <528FE705.10506@canterbury.ac.nz> <20131123042100.GR2085@ando> <7wiovite8e.fsf@benfinney.id.au> <5291C4BA.2020606@canterbury.ac.nz> <7weh65trtc.fsf@benfinney.id.au> <52927FB3.9030703@canterbury.ac.nz> <7w38mlt9nw.fsf@benfinney.id.au> <7wtxf0rsgo.fsf@benfinney.id.au> <5293E09D.9030808@canterbury.ac.nz> Message-ID: <87fvqkdk2a.fsf@uwakimon.sk.tsukuba.ac.jp> Greg Ewing writes: > If you're talking about doing different things based on > argv[0], I wouldn't call it a *common* use case. The last > time I saw it done was on an early version of SunOS that > didn't have shared libraries, so they linked all the gui > tools into one big executable to reduce disk and memory > usage. 
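A rough sketch, not from the thread, of that multi-call trick in Python: one file installed under several names, dispatching on the name it was invoked as. The start/stop names echo the initctl symlinks mentioned earlier and are only illustrative.

import os
import sys

def do_start(args):
    print("starting", *args)

def do_stop(args):
    print("stopping", *args)

# One handler per name the script might be installed or symlinked as.
COMMANDS = {"start": do_start, "stop": do_stop}

def main(argv=None):
    if argv is None:
        argv = sys.argv
    handler = COMMANDS.get(os.path.basename(argv[0]))
    if handler is None:
        sys.stderr.write("unknown command name; link this script as one of: %s\n"
                         % ", ".join(sorted(COMMANDS)))
        return 2
    handler(argv[1:])
    return 0

if __name__ == "__main__":
    sys.exit(main())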
busybox is still standard in many Linux distros, though I don't really know why (embedded systems, "small" rescue media?), and surely you've seen fgrep/grep/egrep (hardlinked to the same file on Mac OS X as of "Snow Leopard"), even if you personally use POSIX-standard "grep -F" and "grep -E". Linking /bin/sh to bash or zsh is a common trick on GNU systems, and typically invokes strict POSIX conformance (well, as strictly as any GNU program ever conforms to another organization's standard :-( ). So I rather suspect you "see" it frequently, even today. You just don't recognize it when you see it. Could we do without such trickery? Sure. However, the point about usage messages still stands: it's useful to fetch the actual command line token used to invoke the program, because program names and invocation methods do change. From phd at phdru.name Tue Nov 26 02:37:42 2013 From: phd at phdru.name (Oleg Broytman) Date: Tue, 26 Nov 2013 02:37:42 +0100 Subject: [Python-ideas] making a module callable In-Reply-To: <5293E09D.9030808@canterbury.ac.nz> References: <528FE705.10506@canterbury.ac.nz> <20131123042100.GR2085@ando> <7wiovite8e.fsf@benfinney.id.au> <5291C4BA.2020606@canterbury.ac.nz> <7weh65trtc.fsf@benfinney.id.au> <52927FB3.9030703@canterbury.ac.nz> <7w38mlt9nw.fsf@benfinney.id.au> <7wtxf0rsgo.fsf@benfinney.id.au> <5293E09D.9030808@canterbury.ac.nz> Message-ID: <20131126013742.GA21808@phdru.name> Hi! On Tue, Nov 26, 2013 at 12:43:25PM +1300, Greg Ewing wrote: > Ben Finney wrote: > >Automatically discarding the first argument, on the assumption that the > >program doesn't care about it, is making a false assumption in many > >cases and makes a common use case needlessly difficult. > > If you're talking about doing different things based on > argv[0], I wouldn't call it a *common* use case. The last > time I saw it done was on an early version of SunOS that > didn't have shared libraries, so they linked all the gui > tools into one big executable to reduce disk and memory > usage. > > Now that we have shared libraries, there's much less > need for that kind of trick. sys.argv[0] is used for: 1) Setup sys.path; something like lib_dir = os.path.dirname(sys.argv[0]) + os.sep + 'lib') sys.path.append(lib_dir) 2) Setup relative path(s) (to start a helper script, e.g.); something like Subprocess("%s/Robots/sub.py" % os.path.dirname(sys.argv[0])) 3) Report usage: sys.stderr.write("Usage: %s [-o|--old]\n" % sys.argv[0]) 4) Change behaviour based on the script's name. I am -1 on removing sys.argv[0] from main(argv). Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From stephen at xemacs.org Tue Nov 26 02:45:33 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 26 Nov 2013 10:45:33 +0900 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125224409.31109184@fsol> Message-ID: <87eh64djgi.fsf@uwakimon.sk.tsukuba.ac.jp> Paul Moore writes: > And to be honest, I suspect many of the "old hands" will simply > ignore the new idiom and continue using (and teaching others, by > example if nothing else) the old one - so it's not even as if the > old idiom is going to disappear. Yup. 
In my XEmacs init, I have a Python-specific "script" skeleton that produces it (including a call to "main()" and other boilerplate) automatically, and if python-dev doesn't break it, I won't fix it. From ethan at stoneleaf.us Tue Nov 26 02:48:32 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 25 Nov 2013 17:48:32 -0800 Subject: [Python-ideas] making a module callable In-Reply-To: <20131126013742.GA21808@phdru.name> References: <528FE705.10506@canterbury.ac.nz> <20131123042100.GR2085@ando> <7wiovite8e.fsf@benfinney.id.au> <5291C4BA.2020606@canterbury.ac.nz> <7weh65trtc.fsf@benfinney.id.au> <52927FB3.9030703@canterbury.ac.nz> <7w38mlt9nw.fsf@benfinney.id.au> <7wtxf0rsgo.fsf@benfinney.id.au> <5293E09D.9030808@canterbury.ac.nz> <20131126013742.GA21808@phdru.name> Message-ID: <5293FDF0.3050409@stoneleaf.us> On 11/25/2013 05:37 PM, Oleg Broytman wrote: > > I am -1 on removing sys.argv[0] from main(argv). +1 on the -1 (so -2 ;) (take that, maths!) -- ~Ethan~ From ron3200 at gmail.com Tue Nov 26 03:37:49 2013 From: ron3200 at gmail.com (Ron Adam) Date: Mon, 25 Nov 2013 20:37:49 -0600 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125171603.1812e9fd@anarchist> <20131125173903.4fa90223@anarchist> Message-ID: On 11/25/2013 04:50 PM, Guido van Rossum wrote: > On Mon, Nov 25, 2013 at 2:39 PM, Barry Warsaw > > wrote: > > On Nov 25, 2013, at 02:29 PM, Guido van Rossum wrote: > > >For all I care you can call it ismain(). > > Okay, I think I'm going to officially not care now. :) None of these > suggestions seem worth the effort to indoctrinate folks to some new idiom, > regardless of how it's spelled. > > > I mostly agree -- but if people insist on a better idiom, a builtin > function is the only one I can live with. I don't particularly care what > that builtin function is called, as long as it is not a __dunder__ name. > And it must be a function. I'm +1 on the builtin function alternative if it compares the actual modules rather than just the names. Then it would have some benefits I think. It could be used within functions. Currently __name__ is masked by the function's __name__ attribute if you try that. It would still work if someone overwrites __name__. Not that I think it happens very often. And I don't care if it's spelled ismain or is_main. Well actually I like the underscores separating words in function names. But we have isattribute() which already doesn't follow that. So which ever is more consistent. Cheers, Ron From alexander.belopolsky at gmail.com Tue Nov 26 04:04:59 2013 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 25 Nov 2013 22:04:59 -0500 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125171603.1812e9fd@anarchist> <20131125173903.4fa90223@anarchist> Message-ID: On Mon, Nov 25, 2013 at 9:37 PM, Ron Adam wrote: > It could be used within functions. Currently __name__ is masked by the > function's __name__ attribute if you try that. > That's news to me. It was not so as of 3.3.3: Python 3.3.3 (default, Nov 25 2013, 15:41:46) >>> def f(): ... print(__name__) ... 
>>> f() __main__ If anything, it is a magic function relying on _getframe() run-time introspection that would have a problem working inside functions. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ron3200 at gmail.com Tue Nov 26 04:29:33 2013 From: ron3200 at gmail.com (Ron Adam) Date: Mon, 25 Nov 2013 21:29:33 -0600 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125171603.1812e9fd@anarchist> <20131125173903.4fa90223@anarchist> Message-ID: On 11/25/2013 09:04 PM, Alexander Belopolsky wrote: > That's news to me. It was not so as of 3.3.3: > > Python 3.3.3 (default, Nov 25 2013, 15:41:46) > >>> def f(): > ... print(__name__) > ... > >>> f() > __main__ > > If anything, it is a magic function relying on _getframe() run-time > introspection that would have a problem working inside functions. Woops.. yep. Hmmm, Wonder what I came across recently to make me think that. Oh well. I still think there would be some advantages from comparing the actual modules rather than the __name__ attribute. Jan pointed out some use cases. Cheers, Ron From skip at pobox.com Tue Nov 26 04:45:17 2013 From: skip at pobox.com (Skip Montanaro) Date: Mon, 25 Nov 2013 21:45:17 -0600 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: Message-ID: > I still don't think the current idiom is problematic, so I'm -0.8 on > the whole thing :-) I'm with Antoine. I don't find it particularly onerous to type (how often do you start a new program/script from scratch anyway?), and if it was a big enough barrier, I'd capture the keystrokes in an Emacs macro (including skeletal main and usage functions, and corresponding imports of sys and os modules). I trust vim users could do something similar. Skip -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Nov 26 11:01:32 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 26 Nov 2013 20:01:32 +1000 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125171603.1812e9fd@anarchist> Message-ID: On 26 Nov 2013 10:00, "Mark Janssen" wrote: > > The only other possibility not mentioned thus far is to have a main.py > file and force python programs to start from it. Looking for a __main__.py module (or submodule) is the way directory, zipfile and package execution work, so this style is already possible today for anyone that wants or needs it. Cheers, Nick. > > markj > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at slenders.be Tue Nov 26 14:54:32 2013 From: jonathan at slenders.be (Jonathan Slenders) Date: Tue, 26 Nov 2013 14:54:32 +0100 Subject: [Python-ideas] Using yield inside a comprehension. Message-ID: Where do I find the PEP that describes that the following statement assigns a generator object to `values`? 
values = [ (yield x) for x in range(10) ] I assume it's equivalent to the following: values = (x for x in range(10)) The reason for asking this is that I see no point in using the first syntax and that the first has another meaning in Python 2.7, if used inside a function. -------------- next part -------------- An HTML attachment was scrubbed... URL: From oscar.j.benjamin at gmail.com Tue Nov 26 15:32:30 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Tue, 26 Nov 2013 14:32:30 +0000 Subject: [Python-ideas] Using yield inside a comprehension. In-Reply-To: References: Message-ID: On 26 November 2013 13:54, Jonathan Slenders wrote: > > Where do I find the PEP that describes that the following statement assigns > a generator object to `values`? I don't think there was a PEP for this but it's a consequence of the change to binding in list comprehensions introduced in Python 3.x which is mentioned here: http://python-history.blogspot.co.uk/2010/06/from-list-comprehensions-to-generator.html Essentially this: > values = [ (yield x) for x in range(10) ] Translates to the following in Python 2.x: _tmp = [] for x in range(10): _tmp.append((yield x)) values = _tmp However in 3.x it translates to something like: def _tmpfunc(): _tmp = [] for x in range(10): _tmp.append((yield x)) return _tmp values = _tmpfunc() This change was made to prevent the comprehension variable from leaking to the enclosing scope, but as you say if the code is nested in a function then it affects which function contains the yield statement and the presence of a yield statement radically alters the behaviour of a function. So in 2.x the enclosing function must become a generator function. However in 3.x the function that is supposed to implement the list comprehension is changed into a generator function instead. $ python3 Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 32 bit (Intel)] on win32 Type "help", "copyright", "credits" or "license" for more information. >>> values = [(yield x) for x in range(10)] >>> values at 0x00E24508> >>> def _tmpfunc(): ... _tmp = [] ... for x in range(10): ... _tmp.append((yield x)) ... return _tmp ... >>> values = _tmpfunc() >>> values > I assume it's equivalent to the following: > values = (x for x in range(10)) It will yield the same values but it will also build a list of Nones and attach it to StopIteration: >>> values = [(yield x) for x in range(3)] >>> values at 0x00E1CCD8> >>> next(values) 0 >>> next(values) 1 >>> next(values) 2 >>> next(values) Traceback (most recent call last): File "", line 1, in StopIteration: [None, None, None] > The reason for asking this is that I see no point in using the first syntax > and that the first has another meaning in Python 2.7, if used inside a > function. This has been mentioned before and AFAICT it is an unintended artefact of the new definitions for comprehensions and yield and the fact that 'return _tmp' works inside a generator function. I haven't seen a use for this behaviour yet. Oscar From abarnert at yahoo.com Tue Nov 26 15:55:20 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 26 Nov 2013 06:55:20 -0800 Subject: [Python-ideas] Using yield inside a comprehension. In-Reply-To: References: Message-ID: On Nov 26, 2013, at 6:32, Oscar Benjamin wrote: > On 26 November 2013 13:54, Jonathan Slenders wrote: >> >> Where do I find the PEP that describes that the following statement assigns >> a generator object to `values`? 
> > I don't think there was a PEP for this but it's a consequence of the > change to binding in list comprehensions introduced in Python 3.x > which is mentioned here: > http://python-history.blogspot.co.uk/2010/06/from-list-comprehensions-to-generator.html > > Essentially this: > >> values = [ (yield x) for x in range(10) ] > > Translates to the following in Python 2.x: > > _tmp = [] > for x in range(10): > _tmp.append((yield x)) > values = _tmp > > However in 3.x it translates to something like: > > def _tmpfunc(): > _tmp = [] > for x in range(10): > _tmp.append((yield x)) > return _tmp > values = _tmpfunc() > > This change was made to prevent the comprehension variable from > leaking to the enclosing scope, but as you say if the code is nested > in a function then it affects which function contains the yield > statement and the presence of a yield statement radically alters the > behaviour of a function. So in 2.x the enclosing function must become > a generator function. However in 3.x the function that is supposed to > implement the list comprehension is changed into a generator function > instead. > > $ python3 > Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 > 32 bit (Intel)] on win32 > Type "help", "copyright", "credits" or "license" for more information. >>>> values = [(yield x) for x in range(10)] >>>> values > at 0x00E24508> >>>> def _tmpfunc(): > ... _tmp = [] > ... for x in range(10): > ... _tmp.append((yield x)) > ... return _tmp > ... >>>> values = _tmpfunc() >>>> values > > >> I assume it's equivalent to the following: >> values = (x for x in range(10)) > > It will yield the same values but it will also build a list of Nones > and attach it to StopIteration: Unless you call .send on it, in which case it'll build a list of the values you send it and attach _that_ to StopIteration, of course. So I suppose you could use it as a coroutine version of the list function. Except that the number of values it takes is specified on the wrong end. From jonathan at slenders.be Tue Nov 26 16:33:59 2013 From: jonathan at slenders.be (Jonathan Slenders) Date: Tue, 26 Nov 2013 16:33:59 +0100 Subject: [Python-ideas] Using yield inside a comprehension. In-Reply-To: References: Message-ID: Thanks Oscar, for the extensive explanation. The translation process is clear to me. But I'm not convinced that yield or yield from should be bound to this invisible function instead of the parent. Now that asyncio/tulip is coming, I see some use case. Suppose that you have a list of futures, and you want to process them. The following works, using the "gather" function. @coroutine def process(futures, func): values = yield from gather(futures) return [ func(v) for v in values ] What I tried to do, with my knowledge of Python 2 generators, is the following, but that doesn't work. @coroutine def process(futures, func): return [ func((yield from f)) for f in futures ] They are also not equivalent. Using gather(), we have to wait before all the futures are ready, while in the second example, we could start creating the list on the fly. (Note that in this case "yield" instead of "yield from" would also work.) Futher, there is a really weird constructs possible: list((yield x) for x in range(10)) [0, None, 1, None, 2, None, 3, None, 4, None, 5, None, 6, None, 7, None, 8, None, 9, None] It's a logical consequence from the translation, but I don't get the point. Can't we create a local namespace, without wrapping it in a function? 
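To make the .send() behaviour described above concrete, this is roughly what a 3.3-era interpreter does with such a comprehension (later Python releases reject yield inside a comprehension outright, so treat it as a historical sketch):

gen = [(yield x) for x in range(3)]   # a generator object in Python 3.3, not a list
print(next(gen))       # 0
print(gen.send("a"))   # 1
print(gen.send("b"))   # 2
try:
    gen.send("c")
except StopIteration as stop:
    print(stop.value)  # ['a', 'b', 'c'], the list the comprehension built from send()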
2013/11/26 Andrew Barnert > On Nov 26, 2013, at 6:32, Oscar Benjamin > wrote: > > > On 26 November 2013 13:54, Jonathan Slenders > wrote: > >> > >> Where do I find the PEP that describes that the following statement > assigns > >> a generator object to `values`? > > > > I don't think there was a PEP for this but it's a consequence of the > > change to binding in list comprehensions introduced in Python 3.x > > which is mentioned here: > > > http://python-history.blogspot.co.uk/2010/06/from-list-comprehensions-to-generator.html > > > > Essentially this: > > > >> values = [ (yield x) for x in range(10) ] > > > > Translates to the following in Python 2.x: > > > > _tmp = [] > > for x in range(10): > > _tmp.append((yield x)) > > values = _tmp > > > > However in 3.x it translates to something like: > > > > def _tmpfunc(): > > _tmp = [] > > for x in range(10): > > _tmp.append((yield x)) > > return _tmp > > values = _tmpfunc() > > > > This change was made to prevent the comprehension variable from > > leaking to the enclosing scope, but as you say if the code is nested > > in a function then it affects which function contains the yield > > statement and the presence of a yield statement radically alters the > > behaviour of a function. So in 2.x the enclosing function must become > > a generator function. However in 3.x the function that is supposed to > > implement the list comprehension is changed into a generator function > > instead. > > > > $ python3 > > Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 > > 32 bit (Intel)] on win32 > > Type "help", "copyright", "credits" or "license" for more information. > >>>> values = [(yield x) for x in range(10)] > >>>> values > > at 0x00E24508> > >>>> def _tmpfunc(): > > ... _tmp = [] > > ... for x in range(10): > > ... _tmp.append((yield x)) > > ... return _tmp > > ... > >>>> values = _tmpfunc() > >>>> values > > > > > >> I assume it's equivalent to the following: > >> values = (x for x in range(10)) > > > > It will yield the same values but it will also build a list of Nones > > and attach it to StopIteration: > > Unless you call .send on it, in which case it'll build a list of the > values you send it and attach _that_ to StopIteration, of course. > > So I suppose you could use it as a coroutine version of the list function. > Except that the number of values it takes is specified on the wrong end. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.cristh at gmail.com Tue Nov 26 17:34:28 2013 From: alan.cristh at gmail.com (Alan Cristhian Ruiz) Date: Tue, 26 Nov 2013 13:34:28 -0300 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125171603.1812e9fd@anarchist> Message-ID: <5294CD94.6050008@gmail.com> I think the need to change * if __ name__ == "__main__": * is capricious and obsessive. The current rule is better than anything that has been suggested so far. I never had any problems with the * if __ name__ == "__main__": *. Also in python there are other things much more difficult to learn and use, such as metaclasses. From abarnert at yahoo.com Tue Nov 26 18:31:26 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 26 Nov 2013 09:31:26 -0800 Subject: [Python-ideas] Using yield inside a comprehension. 
In-Reply-To: References: Message-ID: On Nov 26, 2013, at 7:33, Jonathan Slenders wrote: > What I tried to do, with my knowledge of Python 2 generators, is the following, but that doesn't work. > > @coroutine > def process(futures, func): > return [ func((yield from f)) for f in futures ] > > They are also not equivalent. Using gather(), we have to wait before all the futures are ready, while in the second example, we could start creating the list on the fly. (Note that in this case "yield" instead of "yield from" would also work.) What would be the benefit of starting to create the list on the fly? You can't actually do anything with it, or even see it from any code, until it's completed. From abarnert at yahoo.com Tue Nov 26 18:37:28 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 26 Nov 2013 09:37:28 -0800 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: <5294CD94.6050008@gmail.com> References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125171603.1812e9fd@anarchist> <5294CD94.6050008@gmail.com> Message-ID: On Nov 26, 2013, at 8:34, Alan Cristhian Ruiz wrote: > I think the need to change * if __ name__ == "__main__": * is capricious and obsessive. The current rule is better than anything that has been suggested so far. I never had any problems with the * if __ name__ == "__main__": *. Also in python there are other things much more difficult to learn and use, such as metaclasses. Although I agree with your main point, I don't think that's a very good argument. __main__ is something novices have to learn early and use in code regularly; metaclasses are something only experienced developers use, and not that often (and that's even if you count using stdlib metaclasses to, e.g., create ABCs, which doesn't really require you to understand how they work). It's perfectly reasonable for an "expert" feature to be more difficult to learn than a novice feature. Also, Python doesn't have a queue of improvements to be scheduled to a team of developers. Things get improved if someone is motivated enough to write the code and drive the idea to consensus and/or BDFL approval. So, improving this would have very little bearing on improving things you care about more. From jonathan at slenders.be Tue Nov 26 19:22:53 2013 From: jonathan at slenders.be (Jonathan Slenders) Date: Tue, 26 Nov 2013 19:22:53 +0100 Subject: [Python-ideas] Using yield inside a comprehension. In-Reply-To: References: Message-ID: You can see an early side effect in "func" here, which could also trigger exceptions early. 2013/11/26 Andrew Barnert > On Nov 26, 2013, at 7:33, Jonathan Slenders wrote: > > > What I tried to do, with my knowledge of Python 2 generators, is the > following, but that doesn't work. > > > > @coroutine > > def process(futures, func): > > return [ func((yield from f)) for f in futures ] > > > > They are also not equivalent. Using gather(), we have to wait before all > the futures are ready, while in the second example, we could start creating > the list on the fly. (Note that in this case "yield" instead of "yield > from" would also work.) > > What would be the benefit of starting to create the list on the fly? You > can't actually do anything with it, or even see it from any code, until > it's completed. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From abarnert at yahoo.com Tue Nov 26 19:43:51 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Tue, 26 Nov 2013 10:43:51 -0800 Subject: [Python-ideas] Using yield inside a comprehension. In-Reply-To: References: Message-ID: On Nov 26, 2013, at 10:22, Jonathan Slenders wrote: > You can see an early side effect in "func" here, which could also trigger exceptions early. Well, yes, you can see a side effect of calling func in func. But not a side effect of building the list. And, more importantly, you don't (or shouldn't) use list comprehensions for expressions that you're evaluating for their side effects. That's what for statements are for. > 2013/11/26 Andrew Barnert >> On Nov 26, 2013, at 7:33, Jonathan Slenders wrote: >> >> > What I tried to do, with my knowledge of Python 2 generators, is the following, but that doesn't work. >> > >> > @coroutine >> > def process(futures, func): >> > return [ func((yield from f)) for f in futures ] >> > >> > They are also not equivalent. Using gather(), we have to wait before all the futures are ready, while in the second example, we could start creating the list on the fly. (Note that in this case "yield" instead of "yield from" would also work.) >> >> What would be the benefit of starting to create the list on the fly? You can't actually do anything with it, or even see it from any code, until it's completed. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Nov 26 23:01:46 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 27 Nov 2013 08:01:46 +1000 Subject: [Python-ideas] Using yield inside a comprehension. In-Reply-To: References: Message-ID: On 27 Nov 2013 01:35, "Jonathan Slenders" wrote: > Futher, there is a really weird constructs possible: > > list((yield x) for x in range(10)) > [0, None, 1, None, 2, None, 3, None, 4, None, 5, None, 6, None, 7, None, 8, None, 9, None] > > > It's a logical consequence from the translation, but I don't get the point. > Can't we create a local namespace, without wrapping it in a function? That was the original implementation I tried, and it turned out to be inordinately difficult to get the semantics right for lambda expressions that referenced iteration variables from inside the comprehension. There are also some ugly edge cases involving the locals() builtin that would need to have their semantics defined. Switching to a full lexical scope instead turned out to be much easier to implement while still providing closure semantics that matched those of generator expressions, so that's what I ended up implementing. This approach also resolved the "How does locals() work in a 3.x comprehension?" question in favour of making it work the same way it does in a generator expression. As others have noted, this approach of using an implicit function definition also results in the behaviour of yield expressions inside comprehensions being entirely consistent with their behaviour inside def statements and lambda expressions - it turns an ordinary function into a generator function. Is this particularly useful? Not that I've seen (while you can do some kinda neat one-liner hacks with it, they're basically incomprehensible to the reader). It's just a natural consequence of making comprehension semantics more consistent with those of generator expressions (which were already consistent with nested def statements and lambda expressions), and there's no compelling justification to disallow it. 
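A short illustration of what that scoping choice means in practice (CPython 3.x behaviour; the variable names are only examples):

x = "outer"
squares = [x * x for x in range(3)]
print(squares)    # [0, 1, 4]
print(x)          # 'outer': the comprehension's x no longer leaks in Python 3

# Lambdas that reference the iteration variable close over the comprehension's
# own scope, matching generator expression behaviour.
funcs = [lambda: x for x in range(3)]
print([f() for f in funcs])   # [2, 2, 2]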
Regarding the PEP question, there's no dedicated PEP for the change, just a line item in PEP 3100 to make comprehensions more like generator expressions by hiding the iteration variable. There's probably a thread or two on the old python-3000 list about using a full function scope to do it, though (I seem to recall posting about it after my original pseudo-scope based approach failed to handle closures properly). Cheers, Nick. > > > > > 2013/11/26 Andrew Barnert >> >> On Nov 26, 2013, at 6:32, Oscar Benjamin wrote: >> >> > On 26 November 2013 13:54, Jonathan Slenders wrote: >> >> >> >> Where do I find the PEP that describes that the following statement assigns >> >> a generator object to `values`? >> > >> > I don't think there was a PEP for this but it's a consequence of the >> > change to binding in list comprehensions introduced in Python 3.x >> > which is mentioned here: >> > http://python-history.blogspot.co.uk/2010/06/from-list-comprehensions-to-generator.html >> > >> > Essentially this: >> > >> >> values = [ (yield x) for x in range(10) ] >> > >> > Translates to the following in Python 2.x: >> > >> > _tmp = [] >> > for x in range(10): >> > _tmp.append((yield x)) >> > values = _tmp >> > >> > However in 3.x it translates to something like: >> > >> > def _tmpfunc(): >> > _tmp = [] >> > for x in range(10): >> > _tmp.append((yield x)) >> > return _tmp >> > values = _tmpfunc() >> > >> > This change was made to prevent the comprehension variable from >> > leaking to the enclosing scope, but as you say if the code is nested >> > in a function then it affects which function contains the yield >> > statement and the presence of a yield statement radically alters the >> > behaviour of a function. So in 2.x the enclosing function must become >> > a generator function. However in 3.x the function that is supposed to >> > implement the list comprehension is changed into a generator function >> > instead. >> > >> > $ python3 >> > Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 >> > 32 bit (Intel)] on win32 >> > Type "help", "copyright", "credits" or "license" for more information. >> >>>> values = [(yield x) for x in range(10)] >> >>>> values >> > at 0x00E24508> >> >>>> def _tmpfunc(): >> > ... _tmp = [] >> > ... for x in range(10): >> > ... _tmp.append((yield x)) >> > ... return _tmp >> > ... >> >>>> values = _tmpfunc() >> >>>> values >> > >> > >> >> I assume it's equivalent to the following: >> >> values = (x for x in range(10)) >> > >> > It will yield the same values but it will also build a list of Nones >> > and attach it to StopIteration: >> >> Unless you call .send on it, in which case it'll build a list of the values you send it and attach _that_ to StopIteration, of course. >> >> So I suppose you could use it as a coroutine version of the list function. Except that the number of values it takes is specified on the wrong end. > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From seewalker.120 at gmail.com Tue Nov 26 23:31:09 2013 From: seewalker.120 at gmail.com (Alex Seewald) Date: Tue, 26 Nov 2013 17:31:09 -0500 Subject: [Python-ideas] Improving Clarity of re Module Message-ID: For a match object, m, m.group(0) is the semantics for accessing the entire span of the match. 
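For concreteness, the existing behaviour under discussion (plain stdlib re; the sample pattern is the same one Ned uses further down the thread):

import re

m = re.search(r"[ab]", "xay")
print(m.group(0))   # 'a': the text matched by the whole pattern
print(m.group())    # 'a' as well; group 0 is the default
print(m.span())     # (1, 2): start and end indexes of the match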
For newcomers to regular expressions who are not familiar with the concept of a 'group', the name group(0) is counter-intuitive. A more natural-language-esque alias to group(0), perhaps 'matchSpan', could reduce the time novices spend from idea to working code. Of course, this convenience would introduce a bit of complexity to the codebase, so it may or may not be worth it to add an alias to group(0). What do people think? -- Alex Seewald -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Tue Nov 26 23:46:24 2013 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 26 Nov 2013 22:46:24 +0000 Subject: [Python-ideas] Improving Clarity of re Module In-Reply-To: References: Message-ID: <529524C0.3080309@mrabarnett.plus.com> On 26/11/2013 22:31, Alex Seewald wrote: > For a match object, m, m.group(0) is the semantics for accessing the > entire span of the match. For newcomers to regular expressions who are > not familiar with the concept of a 'group', the name group(0) is > counter-intuitive. A more natural-language-esque alias to group(0), > perhaps 'matchSpan', could reduce the time novices spend from idea to > working code. Of course, this convenience would introduce a bit of > complexity to the codebase, so it may or may not be worth it to add an > alias to group(0). What do people think? > Well, including 'span' in the name would be confusing because it already has a .span method which returns the start and end indexes. I think that for newcomers to regexes, the concept of capture groups is one of the easiest things to understand! From denis.spir at gmail.com Tue Nov 26 23:52:42 2013 From: denis.spir at gmail.com (spir) Date: Tue, 26 Nov 2013 23:52:42 +0100 Subject: [Python-ideas] Improving Clarity of re Module In-Reply-To: References: Message-ID: <5295263A.4000909@gmail.com> On 11/26/2013 11:31 PM, Alex Seewald wrote: > For a match object, m, m.group(0) is the semantics for accessing the entire > span of the match. For newcomers to regular expressions who are not > familiar with the concept of a 'group', the name group(0) is > counter-intuitive. A more natural-language-esque alias to group(0), perhaps > 'matchSpan', could reduce the time novices spend from idea to working code. I do agree and support such a change. Actually, I remember it took me some time to find that expression, precisely. (However, isn't it group() alone, without 0? Haven't used re for a while...) But "m.matchspan" is for the least redondant (since m is a match result). "m.span" or "m.snippet" would nicely do the job, wouldn't it? > Of course, this convenience would introduce a bit of complexity to the > codebase, so it may or may not be worth it to add an alias to group(0). > What do people think? At first sight, does not seem that complicated (also, the code exist for group()). How clear is the existing implementation? Denis From steve at pearwood.info Wed Nov 27 00:09:34 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Wed, 27 Nov 2013 10:09:34 +1100 Subject: [Python-ideas] Improving Clarity of re Module In-Reply-To: References: Message-ID: <20131126230933.GO2085@ando> On Tue, Nov 26, 2013 at 05:31:09PM -0500, Alex Seewald wrote: > For a match object, m, m.group(0) is the semantics for accessing the entire > span of the match. For newcomers to regular expressions who are not > familiar with the concept of a 'group', the name group(0) is > counter-intuitive. 
A more natural-language-esque alias to group(0), perhaps > 'matchSpan', could reduce the time novices spend from idea to working code. > Of course, this convenience would introduce a bit of complexity to the > codebase, so it may or may not be worth it to add an alias to group(0). > What do people think? As a beginner to regexes, it is not very long ago that I was a novice to regexes, and I can tell you that in my experience, the difference between group(0) and matchSpan is entirely inconsequential. I was not familiar with either the concept of "group" nor "span", in fact I had never come across the concept of "span" in regards to regexes until I read your email just now. Either way, the name would be jargon I need to learn. -- Steven From seewalker.120 at gmail.com Wed Nov 27 00:30:15 2013 From: seewalker.120 at gmail.com (Alex Seewald) Date: Tue, 26 Nov 2013 18:30:15 -0500 Subject: [Python-ideas] Python-ideas Digest, Vol 84, Issue 94 In-Reply-To: References: Message-ID: As MRAB noticed, my suggested alias 'matchSpan' is flawed because the word span is already in use. As spir pointed out, 'snippet' would be a better name for the alias to group(). >However, isn't it group() alone, without 0? Both group() and group(0) have the same effect. >> Of course, this convenience would introduce a bit of complexity to the >> codebase, so it may or may not be worth it to add an alias to group(0). >> What do people think? >At first sight, does not seem that complicated (also, the code exist for >group()). How clear is the existing implementation? The existing implementation is clear enough and the alias would not add much complexity. I meant that any alias is a source of complexity and the tradeoff at hand is for convenience. On 26 November 2013 17:52, wrote: > Send Python-ideas mailing list submissions to > python-ideas at python.org > > To subscribe or unsubscribe via the World Wide Web, visit > https://mail.python.org/mailman/listinfo/python-ideas > or, via email, send a message with subject or body 'help' to > python-ideas-request at python.org > > You can reach the person managing the list at > python-ideas-owner at python.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of Python-ideas digest..." > > > Today's Topics: > > 1. Re: Using yield inside a comprehension. (Nick Coghlan) > 2. Improving Clarity of re Module (Alex Seewald) > 3. Re: Improving Clarity of re Module (MRAB) > 4. Re: Improving Clarity of re Module (spir) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Wed, 27 Nov 2013 08:01:46 +1000 > From: Nick Coghlan > To: Jonathan Slenders > Cc: python-ideas at python.org > Subject: Re: [Python-ideas] Using yield inside a comprehension. > Message-ID: > < > CADiSq7dh9HYQDPHnSBMkc-o-aY9KDnjd4QmNRGga4TJBy0TpCA at mail.gmail.com> > Content-Type: text/plain; charset="iso-8859-1" > > On 27 Nov 2013 01:35, "Jonathan Slenders" wrote: > > Futher, there is a really weird constructs possible: > > > > list((yield x) for x in range(10)) > > [0, None, 1, None, 2, None, 3, None, 4, None, 5, None, 6, None, 7, None, > 8, None, 9, None] > > > > > > It's a logical consequence from the translation, but I don't get the > point. > > Can't we create a local namespace, without wrapping it in a function? > > That was the original implementation I tried, and it turned out to be > inordinately difficult to get the semantics right for lambda expressions > that referenced iteration variables from inside the comprehension. 
> > > ------------------------------ > > Message: 4 > Date: Tue, 26 Nov 2013 23:52:42 +0100 > From: spir > To: python-ideas at python.org > Subject: Re: [Python-ideas] Improving Clarity of re Module > Message-ID: <5295263A.4000909 at gmail.com> > Content-Type: text/plain; charset=UTF-8; format=flowed > > On 11/26/2013 11:31 PM, Alex Seewald wrote: > > For a match object, m, m.group(0) is the semantics for accessing the > entire > > span of the match. For newcomers to regular expressions who are not > > familiar with the concept of a 'group', the name group(0) is > > counter-intuitive. A more natural-language-esque alias to group(0), > perhaps > > 'matchSpan', could reduce the time novices spend from idea to working > code. > > I do agree and support such a change. Actually, I remember it took me some > time > to find that expression, precisely. (However, isn't it group() alone, > without 0? > Haven't used re for a while...) But "m.matchspan" is for the least > redondant > (since m is a match result). "m.span" or "m.snippet" would nicely do the > job, > wouldn't it? > > > Of course, this convenience would introduce a bit of complexity to the > > codebase, so it may or may not be worth it to add an alias to group(0). > > What do people think? > > At first sight, does not seem that complicated (also, the code exist for > group()). How clear is the existing implementation? > > Denis > > > ------------------------------ > > Subject: Digest Footer > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > > > ------------------------------ > > End of Python-ideas Digest, Vol 84, Issue 94 > ******************************************** > -- Alex Seewald -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at slenders.be Wed Nov 27 09:14:57 2013 From: jonathan at slenders.be (Jonathan Slenders) Date: Wed, 27 Nov 2013 09:14:57 +0100 Subject: [Python-ideas] Using yield inside a comprehension. In-Reply-To: References: Message-ID: Thanks Nick, nice explanation! I was also not complaining, just wondering why. But this makes sense, I think we should probably avoid using "yield" inside a comprehension. 2013/11/26 Nick Coghlan > > On 27 Nov 2013 01:35, "Jonathan Slenders" wrote: > > Futher, there is a really weird constructs possible: > > > > list((yield x) for x in range(10)) > > [0, None, 1, None, 2, None, 3, None, 4, None, 5, None, 6, None, 7, None, > 8, None, 9, None] > > > > > > It's a logical consequence from the translation, but I don't get the > point. > > Can't we create a local namespace, without wrapping it in a function? > > That was the original implementation I tried, and it turned out to be > inordinately difficult to get the semantics right for lambda expressions > that referenced iteration variables from inside the comprehension. There > are also some ugly edge cases involving the locals() builtin that would > need to have their semantics defined. Switching to a full lexical scope > instead turned out to be much easier to implement while still providing > closure semantics that matched those of generator expressions, so that's > what I ended up implementing. This approach also resolved the "How does > locals() work in a 3.x comprehension?" question in favour of making it work > the same way it does in a generator expression. 
> > As others have noted, this approach of using an implicit function > definition also results in the behaviour of yield expressions inside > comprehensions being entirely consistent with their behaviour inside def > statements and lambda expressions - it turns an ordinary function into a > generator function. Is this particularly useful? Not that I've seen (while > you can do some kinda neat one-liner hacks with it, they're basically > incomprehensible to the reader). It's just a natural consequence of making > comprehension semantics more consistent with those of generator expressions > (which were already consistent with nested def statements and lambda > expressions), and there's no compelling justification to disallow it. > > Regarding the PEP question, there's no dedicated PEP for the change, just > a line item in PEP 3100 to make comprehensions more like generator > expressions by hiding the iteration variable. There's probably a thread or > two on the old python-3000 list about using a full function scope to do it, > though (I seem to recall posting about it after my original pseudo-scope > based approach failed to handle closures properly). > > Cheers, > Nick. > > > > > > > > > > > 2013/11/26 Andrew Barnert > >> > >> On Nov 26, 2013, at 6:32, Oscar Benjamin > wrote: > >> > >> > On 26 November 2013 13:54, Jonathan Slenders > wrote: > >> >> > >> >> Where do I find the PEP that describes that the following statement > assigns > >> >> a generator object to `values`? > >> > > >> > I don't think there was a PEP for this but it's a consequence of the > >> > change to binding in list comprehensions introduced in Python 3.x > >> > which is mentioned here: > >> > > http://python-history.blogspot.co.uk/2010/06/from-list-comprehensions-to-generator.html > >> > > >> > Essentially this: > >> > > >> >> values = [ (yield x) for x in range(10) ] > >> > > >> > Translates to the following in Python 2.x: > >> > > >> > _tmp = [] > >> > for x in range(10): > >> > _tmp.append((yield x)) > >> > values = _tmp > >> > > >> > However in 3.x it translates to something like: > >> > > >> > def _tmpfunc(): > >> > _tmp = [] > >> > for x in range(10): > >> > _tmp.append((yield x)) > >> > return _tmp > >> > values = _tmpfunc() > >> > > >> > This change was made to prevent the comprehension variable from > >> > leaking to the enclosing scope, but as you say if the code is nested > >> > in a function then it affects which function contains the yield > >> > statement and the presence of a yield statement radically alters the > >> > behaviour of a function. So in 2.x the enclosing function must become > >> > a generator function. However in 3.x the function that is supposed to > >> > implement the list comprehension is changed into a generator function > >> > instead. > >> > > >> > $ python3 > >> > Python 3.3.2 (v3.3.2:d047928ae3f6, May 16 2013, 00:03:43) [MSC v.1600 > >> > 32 bit (Intel)] on win32 > >> > Type "help", "copyright", "credits" or "license" for more information. > >> >>>> values = [(yield x) for x in range(10)] > >> >>>> values > >> > at 0x00E24508> > >> >>>> def _tmpfunc(): > >> > ... _tmp = [] > >> > ... for x in range(10): > >> > ... _tmp.append((yield x)) > >> > ... return _tmp > >> > ... 
> >> >>>> values = _tmpfunc() > >> >>>> values > >> > > >> > > >> >> I assume it's equivalent to the following: > >> >> values = (x for x in range(10)) > >> > > >> > It will yield the same values but it will also build a list of Nones > >> > and attach it to StopIteration: > >> > >> Unless you call .send on it, in which case it'll build a list of the > values you send it and attach _that_ to StopIteration, of course. > >> > >> So I suppose you could use it as a coroutine version of the list > function. Except that the number of values it takes is specified on the > wrong end. > > > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ned at nedbatchelder.com Wed Nov 27 11:06:47 2013 From: ned at nedbatchelder.com (Ned Batchelder) Date: Wed, 27 Nov 2013 05:06:47 -0500 Subject: [Python-ideas] Improving Clarity of re Module In-Reply-To: References: Message-ID: <5295C437.3020207@nedbatchelder.com> On 11/26/13 5:31 PM, Alex Seewald wrote: > For a match object, m, m.group(0) is the semantics for accessing the > entire span of the match. For newcomers to regular expressions who are > not familiar with the concept of a 'group', the name group(0) is > counter-intuitive. A more natural-language-esque alias to group(0), > perhaps 'matchSpan', could reduce the time novices spend from idea to > working code. Of course, this convenience would introduce a bit of > complexity to the codebase, so it may or may not be worth it to add an > alias to group(0). What do people think? > I like the idea of a better attribute for accessing the matched text. I would go for either "m.matched" or "m.text". There are convenience methods on match objects that I've almost never used: why do we need both .span() and .start()+.end(), for example? And yet, I use .group() all the time, and have to just accept that my pattern had no groups in it, and I say "group" when I mean "matched text". Yes, I understand about groups, and group 0, etc, but for such a common need, why not have a common name? While we're at it, how can it be that we haven't improved the __repr__ after all these years? >>> m = re.search("[ab]", "xay") >>> m <_sre.SRE_Match object at 0x10a2ce9f0> _sre? SRE_Match? huh? :) --Ned. > -- > Alex Seewald > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis.spir at gmail.com Wed Nov 27 11:24:16 2013 From: denis.spir at gmail.com (spir) Date: Wed, 27 Nov 2013 11:24:16 +0100 Subject: [Python-ideas] Improving Clarity of re Module In-Reply-To: <5295C437.3020207@nedbatchelder.com> References: <5295C437.3020207@nedbatchelder.com> Message-ID: <5295C850.6050707@gmail.com> On 11/27/2013 11:06 AM, Ned Batchelder wrote: > While we're at it, how can it be that we haven't improved the __repr__ after all > these years? > > >>> m = re.search("[ab]", "xay") > >>> m > <_sre.SRE_Match object at 0x10a2ce9f0> > > _sre? SRE_Match? huh? :) > > --Ned. looks like a joke ;-) Denis From vernondcole at gmail.com Wed Nov 27 15:03:52 2013 From: vernondcole at gmail.com (Vernon D. Cole) Date: Wed, 27 Nov 2013 15:03:52 +0100 Subject: [Python-ideas] Update the required C compiler for Windows to a supported version. 
Message-ID: Before anyone starts flaming me about "off topic" -- I already know that my suggestion is going to be different from what is usually discussed on this group. I may not write often, but I am a faithful reader (a.k.a. lurker). Please bear with me. I am not aware of any other place where what I am about to say is _on_ topic. ... Yesterday, I set out to try installing a medium-size django application on Windows. It did not go well. It is now 24 hours later -- and I have tried many things. The problem was that "pip" needed to compile a module. There was no C compiler on the machine. After 24 hours of web searching and trying many things, I have come to the following conclusion: I cannot compile a Python extension module with any Microsoft compiler I can obtain. My new Windows 8.1 computer is running Visual Studio 2013 -- which would be fine if I wanted to compile my own Python interpreter, I suppose. I spent most of the day trying to find a working Visual Studio 2008 or C++ 7.1 compiler. The only things I could find to download were patches for Visual Studio 2010. I tried several variations to install it on my old Windows 7 laptop -- all to no avail. May I humbly suggest that Python 3.4 be released using Microsoft Visual Studio Express 2013 for Windows Desktop, a (free) compiler which will actually run on the current version of Windows? That's the Python Idea that I am requesting. -------------- next part -------------- An HTML attachment was scrubbed... URL: From denis.spir at gmail.com Wed Nov 27 15:32:48 2013 From: denis.spir at gmail.com (spir) Date: Wed, 27 Nov 2013 15:32:48 +0100 Subject: [Python-ideas] string codes & substring equality Message-ID: <52960290.6090809@gmail.com> Hello, Coming back to python after a long time. My present project is about (yet another) top-down matching / parsing lib. There are 2 issues that, I guess, may be rather easily solved by simple string methods. The core point is that any scanning / parsing process ends up, a the lowest level, constantly comparing either single-char (rather single-code) substrings or constant (literal) substrings of the source string. This is the only operation which, when successful, actually advances in the source. Thus, it is certainly worth having it efficient, or at the minimum not having it needlessly inefficient. I suppose the same functionalities can be highly useful in various other use cases of text processing. Note again that I'm rediscovering Python (with some pleasure :-), thus may miss known solutions -- but I asked on the tutor mailing list. In both cases, I guess ordinary idiomatic Python code actually _creates_ a new string object, as a substring of length 1 or more, which is otherwise useless; for instance: if s[i] == char: # match ok -- object s[i] unneeded if s[i:j] == substr: # match ok -- object s[i:j] unneeded What is actually needed is just to check for equality (or another check about a code, see below). The case of single-code checking appears when (1) a substring happens to hold a single code (meaning it represents a simple or precomposed unicode char) (2) when matching a char from a given set, range, or more complex class (eg in regex [a-zA-Z0-9_-']). In all cases, what we want is tocheck the code: compare it to a constant value, check whether it belongs to a set of value, or lies inside a given range. We need the code --not a single-code string. 
Ideally, I'd like expressions like: c = s.code(i) # or s.ord(i) or s.ucode(i) [3] # and then one of: if c = code: # match ok if c in codes: # match ok if c >= code1 and c <= code2: # match ok The builtin function ord(char) does not do the job, since it only works for a single-char string. We would again need to create a new string, with ord(s[i]). The right solution apparently is a string method like code(self, i) giving the code at an arbitrary index. I guess this is trivial. I'm surprised it does not exist; maybe some may think this is a symptom there is no strong need for it; instead, I guess people routinely use a typical Python idiom without even noticing it creates a unneeded string object. [2] [3] What do you think? A second need is checking substring equality against constant substrings of arbitrary sizes. This is similar to startswith & endswith, except at any code index in the source string; a generalisation. In C implementation, it would probably delegate to memcomp, with a start pointer set to p_source+i. On the Python side, it may be a string method like sub_equals(self, substr, i). Choose you preferred name ;-). [1] [4] if s.sub_equals(substr, i): # match ok What do you think? (bis) Thank you, Denis [1] I am unsure whether an end index is useful, actually I don't really understand its usage for startswith & endswith neither. [2] Actually, the compiler, if smart enough, may eliminate this object construction and just check the code; does it? Anyway, I think it is not that easy in the cases of ranges & sets. [3] As a side-note, 'ord' is in my view a misnomer, since character codes are not ordinals, with significant order, but nominals, plain numerical codes which only need to be all distinct; they are kinds of id's. For unicode, I call them 'ucodes', an idea I stole somewhere. But I would be happy is the method is called 'ord' anyway, since the term is established in the Python community. [4] Would such a new method make startswith & endswith unneeded? From mal at egenix.com Wed Nov 27 15:33:10 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 27 Nov 2013 15:33:10 +0100 Subject: [Python-ideas] Update the required C compiler for Windows to a supported version. In-Reply-To: References: Message-ID: <529602A6.7030202@egenix.com> On 27.11.2013 15:03, Vernon D. Cole wrote: > Before anyone starts flaming me about "off topic" -- I already know that my > suggestion is going to be different from what is usually discussed on this > group. I may not write often, but I am a faithful reader (a.k.a. lurker). > Please bear with me. I am not aware of any other place where what I am > about to say is _on_ topic. ... > > Yesterday, I set out to try installing a medium-size django application on > Windows. It did not go well. It is now 24 hours later -- and I have tried > many things. The problem was that "pip" needed to compile a module. There > was no C compiler on the machine. After 24 hours of web searching and > trying many things, I have come to the following conclusion: I cannot > compile a Python extension module with any Microsoft compiler I can obtain. > > My new Windows 8.1 computer is running Visual Studio 2013 -- which would > be fine if I wanted to compile my own Python interpreter, I suppose. I > spent most of the day trying to find a working Visual Studio 2008 or C++ > 7.1 compiler. The only things I could find to download were patches for > Visual Studio 2010. I tried several variations to install it on my old > Windows 7 laptop -- all to no avail. 
Here's the ISO file for VS 2018 SP1 Express: http://download.microsoft.com/download/E/8/E/E8EEB394-7F42-4963-A2D8-29559B738298/VS2008ExpressWithSP1ENUX1504728.iso I cannot say whether it runs on Windows 7, but would be surprised if it didn't. > May I humbly suggest that Python 3.4 be released using Microsoft Visual > Studio Express 2013 for Windows > Desktop, > a (free) compiler which will actually run on the current version of > Windows? > > That's the Python Idea that I am requesting. The express versions don't provide some of the optimizations used for the Python binaries shipped on python.org and at least earlier versions also did not allow cross-compiling 64-bit versions (but that may have changed now that 64-bit is becoming the standard). Now instead of requiring that Python itself be compiled using the express versions, perhaps the slightly alternative option of requiring that Python (and C extensions) *can* be compiled using the express version would be a better way forward :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 27 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From rosuav at gmail.com Wed Nov 27 15:35:27 2013 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 28 Nov 2013 01:35:27 +1100 Subject: [Python-ideas] Update the required C compiler for Windows to a supported version. In-Reply-To: <529602A6.7030202@egenix.com> References: <529602A6.7030202@egenix.com> Message-ID: On Thu, Nov 28, 2013 at 1:33 AM, M.-A. Lemburg wrote: > Here's the ISO file for VS 2018 SP1 Express: > > http://download.microsoft.com/download/E/8/E/E8EEB394-7F42-4963-A2D8-29559B738298/VS2008ExpressWithSP1ENUX1504728.iso > > I cannot say whether it runs on Windows 7, but would be surprised > if it didn't. Looks like something that wants to be here: http://docs.python.org/devguide/setup.html#windows ChrisA From stephen at xemacs.org Wed Nov 27 15:43:00 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Wed, 27 Nov 2013 23:43:00 +0900 Subject: [Python-ideas] Improving Clarity of re Module In-Reply-To: <5295C437.3020207@nedbatchelder.com> References: <5295C437.3020207@nedbatchelder.com> Message-ID: <87txexdhxn.fsf@uwakimon.sk.tsukuba.ac.jp> Ned Batchelder writes: > On 11/26/13 5:31 PM, Alex Seewald wrote: >> For a match object, m, m.group(0) is the semantics for accessing the >> entire span of the match. For newcomers to regular expressions who >> are not familiar with the concept of a 'group', the name group(0) is >> counter-intuitive. A more natural-language-esque alias to group(0), >> perhaps 'matchSpan', could reduce the time novices spend from idea >> to working code. Of course, this convenience would introduce a bit of >> complexity to the codebase, so it may or may not be worth it to add >> an alias to group(0). What do people think? 
-1 on "matchSpan", which isn't intuitive to me (and my first guess would be (match.start, match.end) -- which *isn't* because of match.span, this is the first I've heard of it although my eyes may have just slid over it in reading the docs). -0.5 on the whole idea, the not very clueful students I occasionally have to lead by the nose through this stuff have no trouble with .group(0). Their big problem is getting peeved about the whole idea that regexps aren't globs, forgetting the period leads to failed matches that they often fail to diagnose for themselves. :-P > I like the idea of a better attribute for accessing the matched > text.? I would go for either "m.matched" or "m.text". Please, not "text"; I would expect that to be the target string, not a substring. > While we're at it, how can it be that we haven't improved the > __repr__ after all these years? Because there are multiple implementations of re? From p.f.moore at gmail.com Wed Nov 27 15:53:26 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 27 Nov 2013 14:53:26 +0000 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <52960290.6090809@gmail.com> References: <52960290.6090809@gmail.com> Message-ID: On 27 November 2013 14:32, spir wrote: > In both cases, I guess ordinary idiomatic Python code actually _creates_ a > new string object, as a substring of length 1 or more, which is otherwise > useless; for instance: > > if s[i] == char: > # match ok -- object s[i] unneeded > > if s[i:j] == substr: > # match ok -- object s[i:j] unneeded > > What is actually needed is just to check for equality (or another check > about a code, see below). I almost never index or slice strings, so this isn't an issue I've encountered. My first thought is that your approach may well be sub-optimal, and something that avoids indexing might be better - but without knowing the details of what you're trying to do it's hard to say for sure. Also, I'd do some profiling to check that this really is the performance bottleneck before worrying too much about optimising it. But assuming you've done that and there's a real issue here, you could probably use str.find: if s.find(char, i, i+1) != -1: # match ok if s.find(substr, i, j) != -1: # match ok It's a bit of a hack, as find in theory scans forward from the start point - but by making the slice length the same as the length of the search string it won't do that. And after all, low-level performance tweaks generally *are* somewhat hackish, sacrificing obviousness for speed, in my experience :-) Paul From p.f.moore at gmail.com Wed Nov 27 15:55:57 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 27 Nov 2013 14:55:57 +0000 Subject: [Python-ideas] Update the required C compiler for Windows to a supported version. In-Reply-To: <529602A6.7030202@egenix.com> References: <529602A6.7030202@egenix.com> Message-ID: On 27 November 2013 14:33, M.-A. Lemburg wrote: > Here's the ISO file for VS 2018 SP1 Express: Give Guido his time machine back :-) I presume you meant 2008, as VS 2018 compiles Python direct to machine code, and no longer supports C at all... Paul From mal at egenix.com Wed Nov 27 16:02:52 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 27 Nov 2013 16:02:52 +0100 Subject: [Python-ideas] Update the required C compiler for Windows to a supported version. In-Reply-To: References: <529602A6.7030202@egenix.com> Message-ID: <5296099C.6020008@egenix.com> On 27.11.2013 15:55, Paul Moore wrote: > On 27 November 2013 14:33, M.-A. 
Lemburg wrote: >> Here's the ISO file for VS 2018 SP1 Express: > > Give Guido his time machine back :-) > > I presume you meant 2008, as VS 2018 compiles Python direct to machine > code, and no longer supports C at all... :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 27 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From mal at egenix.com Wed Nov 27 16:04:07 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 27 Nov 2013 16:04:07 +0100 Subject: [Python-ideas] Update the required C compiler for Windows to a supported version. In-Reply-To: <529602A6.7030202@egenix.com> References: <529602A6.7030202@egenix.com> Message-ID: <529609E7.7080004@egenix.com> On 27.11.2013 15:33, M.-A. Lemburg wrote: > On 27.11.2013 15:03, Vernon D. Cole wrote: >> Before anyone starts flaming me about "off topic" -- I already know that my >> suggestion is going to be different from what is usually discussed on this >> group. I may not write often, but I am a faithful reader (a.k.a. lurker). >> Please bear with me. I am not aware of any other place where what I am >> about to say is _on_ topic. ... >> >> Yesterday, I set out to try installing a medium-size django application on >> Windows. It did not go well. It is now 24 hours later -- and I have tried >> many things. The problem was that "pip" needed to compile a module. There >> was no C compiler on the machine. After 24 hours of web searching and >> trying many things, I have come to the following conclusion: I cannot >> compile a Python extension module with any Microsoft compiler I can obtain. >> >> My new Windows 8.1 computer is running Visual Studio 2013 -- which would >> be fine if I wanted to compile my own Python interpreter, I suppose. I >> spent most of the day trying to find a working Visual Studio 2008 or C++ >> 7.1 compiler. The only things I could find to download were patches for >> Visual Studio 2010. I tried several variations to install it on my old >> Windows 7 laptop -- all to no avail. > > Here's the ISO file for VS 2008 SP1 Express: > > http://download.microsoft.com/download/E/8/E/E8EEB394-7F42-4963-A2D8-29559B738298/VS2008ExpressWithSP1ENUX1504728.iso > > I cannot say whether it runs on Windows 7, but would be surprised > if it didn't. > >> May I humbly suggest that Python 3.4 be released using Microsoft Visual >> Studio Express 2013 for Windows >> Desktop, >> a (free) compiler which will actually run on the current version of >> Windows? >> >> That's the Python Idea that I am requesting. > > The express versions don't provide some of the optimizations > used for the Python binaries shipped on python.org and at > least earlier versions also did not allow cross-compiling > 64-bit versions (but that may have changed now that 64-bit > is becoming the standard). 
Looks like the cross compiling is supported, but native the native x64 bit compiler is missing in the 2013 express versions: http://msdn.microsoft.com/en-us/library/hs24szh9%28v=vs.120%29.aspx > Now instead of requiring that Python itself be compiled > using the express versions, perhaps the slightly alternative > option of requiring that Python (and C extensions) > *can* be compiled using the express version would be a better > way forward :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Nov 27 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From stephen at xemacs.org Wed Nov 27 16:08:10 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 28 Nov 2013 00:08:10 +0900 Subject: [Python-ideas] Update the required C compiler for Windows to a supported version. In-Reply-To: References: Message-ID: <87r4a1dgrp.fsf@uwakimon.sk.tsukuba.ac.jp> Vernon D. Cole writes: > Before anyone starts flaming me about "off topic" It's not off topic, and in fact there's a recent thread on python-dev which touches on this: https://mail.python.org/pipermail/python-dev/2013-November/130421.html (Sorry, a lot of the subthreads are irrelevant. Look for the subthreads with Steve Dower and Martin van Loewis posting to start.) > I cannot compile a Python extension module with any Microsoft compiler > I can obtain. Your pain is understood, but it's not simple to address it. There have been many threads on this over the years. The basic problem is that the ABI changes. Therefore it's going to require a complete new set of *all* C extensions for Windows, and the duplication of download links for all those extensions from quite a few different vendors is likely to confuse a lot of users. From masklinn at masklinn.net Wed Nov 27 16:26:43 2013 From: masklinn at masklinn.net (Masklinn) Date: Wed, 27 Nov 2013 16:26:43 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <52960290.6090809@gmail.com> References: <52960290.6090809@gmail.com> Message-ID: <56D20EB4-F550-488B-842B-8333E07EBAF1@masklinn.net> On 2013-11-27, at 15:32 , spir wrote: > Coming back to python after a long time. > My present project is about (yet another) top-down matching / parsing lib. There are 2 issues that, I guess, may be rather easily solved by simple string methods. The core point is that any scanning / parsing process ends up, a the lowest level, constantly comparing either single-char (rather single-code) substrings or constant (literal) substrings of the source string. This is the only operation which, when successful, actually advances in the source. Thus, it is certainly worth having it efficient, or at the minimum not having it needlessly inefficient. I suppose the same functionalities can be highly useful in various other use cases of text processing. > Note again that I'm rediscovering Python (with some pleasure :-), thus may miss known solutions -- but I asked on the tutor mailing list. 
> > In both cases, I guess ordinary idiomatic Python code actually _creates_ a new string object, as a substring of length 1 or more, which is otherwise useless Have you tried using memoryviews? (or buffers for older python compat?) From breamoreboy at yahoo.co.uk Wed Nov 27 16:43:54 2013 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Wed, 27 Nov 2013 15:43:54 +0000 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <56D20EB4-F550-488B-842B-8333E07EBAF1@masklinn.net> References: <52960290.6090809@gmail.com> <56D20EB4-F550-488B-842B-8333E07EBAF1@masklinn.net> Message-ID: On 27/11/2013 15:26, Masklinn wrote: > On 2013-11-27, at 15:32 , spir wrote: >> Coming back to python after a long time. >> My present project is about (yet another) top-down matching / parsing lib. There are 2 issues that, I guess, may be rather easily solved by simple string methods. The core point is that any scanning / parsing process ends up, a the lowest level, constantly comparing either single-char (rather single-code) substrings or constant (literal) substrings of the source string. This is the only operation which, when successful, actually advances in the source. Thus, it is certainly worth having it efficient, or at the minimum not having it needlessly inefficient. I suppose the same functionalities can be highly useful in various other use cases of text processing. >> Note again that I'm rediscovering Python (with some pleasure :-), thus may miss known solutions -- but I asked on the tutor mailing list. >> >> In both cases, I guess ordinary idiomatic Python code actually _creates_ a new string object, as a substring of length 1 or more, which is otherwise useless > > Have you tried using memoryviews? (or buffers for older python compat?) > Python 3.3.3 on Windows 7 TypeError: memoryview: str object does not have the buffer interface. -- Python is the second best programming language in the world. But the best has yet to be invented. Christian Tismer Mark Lawrence From masklinn at masklinn.net Wed Nov 27 16:53:15 2013 From: masklinn at masklinn.net (Masklinn) Date: Wed, 27 Nov 2013 16:53:15 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: References: <52960290.6090809@gmail.com> <56D20EB4-F550-488B-842B-8333E07EBAF1@masklinn.net> Message-ID: <698347D4-41B5-418D-BC0E-96F8ACF0862C@masklinn.net> On 2013-11-27, at 16:43 , Mark Lawrence wrote: > Python 3.3.3 on Windows 7 TypeError: memory view: str object does not have the buffer interface. Yes, you need to operate on bytes to use memoryviews. From rosuav at gmail.com Wed Nov 27 17:00:19 2013 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 28 Nov 2013 03:00:19 +1100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <698347D4-41B5-418D-BC0E-96F8ACF0862C@masklinn.net> References: <52960290.6090809@gmail.com> <56D20EB4-F550-488B-842B-8333E07EBAF1@masklinn.net> <698347D4-41B5-418D-BC0E-96F8ACF0862C@masklinn.net> Message-ID: On Thu, Nov 28, 2013 at 2:53 AM, Masklinn wrote: > On 2013-11-27, at 16:43 , Mark Lawrence wrote: >> Python 3.3.3 on Windows 7 TypeError: memory view: str object does not have the buffer interface. > > Yes, you need to operate on bytes to use memoryviews. Meaning you can't use memoryviews to poke around with Unicode ordinals. 
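As a small sketch of the limitation being described here -- using only standard memoryview behaviour -- the buffer approach gives cheap, copy-free comparisons for bytes but simply rejects str:

    data = b"scan this buffer"
    view = memoryview(data)
    # Slicing a memoryview does not copy the underlying bytes, and
    # comparing a view slice against bytes compares content.
    print(view[5:9] == b"this")        # True

    try:
        memoryview("scan this text")   # str exposes no buffer interface
    except TypeError as exc:
        print(exc)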
ChrisA From python at mrabarnett.plus.com Wed Nov 27 18:59:01 2013 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 27 Nov 2013 17:59:01 +0000 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <52960290.6090809@gmail.com> References: <52960290.6090809@gmail.com> Message-ID: <529632E5.80907@mrabarnett.plus.com> On 27/11/2013 14:32, spir wrote: [snip] > A second need is checking substring equality against constant substrings of > arbitrary sizes. This is similar to startswith & endswith, except at any code > index in the source string; a generalisation. In C implementation, it would > probably delegate to memcomp, with a start pointer set to p_source+i. On the > Python side, it may be a string method like sub_equals(self, substr, i). Choose > you preferred name ;-). [1] [4] > > if s.sub_equals(substr, i): > # match ok > > What do you think? (bis) > >>> help(str.startswith) Help on method_descriptor: startswith(...) S.startswith(prefix[, start[, end]]) -> bool Return True if S starts with the specified prefix, False otherwise. With optional start, test S beginning at that position. With optional end, stop comparing S at that position. prefix can also be a tuple of strings to try. >>> help(str.endswith) Help on method_descriptor: endswith(...) S.endswith(suffix[, start[, end]]) -> bool Return True if S ends with the specified suffix, False otherwise. With optional start, test S beginning at that position. With optional end, stop comparing S at that position. suffix can also be a tuple of strings to try. From apieum at gmail.com Wed Nov 27 18:58:52 2013 From: apieum at gmail.com (Gregory Salvan) Date: Wed, 27 Nov 2013 18:58:52 +0100 Subject: [Python-ideas] Replacing the if __name__ == "__main__" idiom (was Re: making a module callable) In-Reply-To: References: <20131125071932.GA65531@cskk.homeip.net> <20131125141220.GE2085@ando> <20131125144244.1cb160f0@anarchist> <20131125171603.1812e9fd@anarchist> <5294CD94.6050008@gmail.com> Message-ID: my cent: I think it's most often a bad practice (whereas it's convenient) to mix "executable" and "importable" code. Providing an easier way to do it, may encourage a bad practice. I would prefer a solution which encourage separation of "executable" and "importable" code. 2013/11/26 Andrew Barnert > On Nov 26, 2013, at 8:34, Alan Cristhian Ruiz > wrote: > > > I think the need to change * if __ name__ == "__main__": * is capricious > and obsessive. The current rule is better than anything that has been > suggested so far. I never had any problems with the * if __ name__ == > "__main__": *. Also in python there are other things much more difficult to > learn and use, such as metaclasses. > > Although I agree with your main point, I don't think that's a very good > argument. > > __main__ is something novices have to learn early and use in code > regularly; metaclasses are something only experienced developers use, and > not that often (and that's even if you count using stdlib metaclasses to, > e.g., create ABCs, which doesn't really require you to understand how they > work). It's perfectly reasonable for an "expert" feature to be more > difficult to learn than a novice feature. > > Also, Python doesn't have a queue of improvements to be scheduled to a > team of developers. Things get improved if someone is motivated enough to > write the code and drive the idea to consensus and/or BDFL approval. So, > improving this would have very little bearing on improving things you care > about more. 
> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From yoavglazner at gmail.com Wed Nov 27 19:14:44 2013 From: yoavglazner at gmail.com (yoav glazner) Date: Wed, 27 Nov 2013 20:14:44 +0200 Subject: [Python-ideas] Update the required C compiler for Windows to a supported version. In-Reply-To: <87r4a1dgrp.fsf@uwakimon.sk.tsukuba.ac.jp> References: <87r4a1dgrp.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Wed, Nov 27, 2013 at 5:08 PM, Stephen J. Turnbull wrote: > Vernon D. Cole writes: > > > Before anyone starts flaming me about "off topic" > > It's not off topic, and in fact there's a recent thread on python-dev > which touches on this: > > https://mail.python.org/pipermail/python-dev/2013-November/130421.html > > (Sorry, a lot of the subthreads are irrelevant. Look for the subthreads > with Steve Dower and Martin van Loewis posting to start.) > > > I cannot compile a Python extension module with any Microsoft compiler > > I can obtain. > > Your pain is understood, but it's not simple to address it. There have > been many threads on this over the years. The basic problem is that the > ABI changes. Therefore it's going to require a complete new set of > *all* C extensions for Windows, and the duplication of download links > for all those extensions from quite a few different vendors is likely to > confuse a lot of users I compiled a few extensions on VS2010 to work with Python27(2008), with no issues*. Ok there were issues with some manifest stuff but there is a quick workaround for that I guess that for common extensions the ABI problems are not visiable -------------- next part -------------- An HTML attachment was scrubbed... URL: From Steve.Dower at microsoft.com Wed Nov 27 19:24:58 2013 From: Steve.Dower at microsoft.com (Steve Dower) Date: Wed, 27 Nov 2013 18:24:58 +0000 Subject: [Python-ideas] Update the required C compiler for Windows to a supported version. Message-ID: <3dd00aa18f174c4e93a26f6e806b8439@BLUPR03MB293.namprd03.prod.outlook.com> Stephen J. Turnbull wrote: > Vernon D. Cole writes: > >> I cannot compile a Python extension module with any Microsoft compiler >> I can obtain. > > Your pain is understood, but it's not simple to address it. FWIW, I'm working on making the compiler easily obtainable. The VS 2008 link that was posted is unofficial, and could theoretically disappear at any time (I'm not in control of that), but the Windows SDK for Windows 7 and .NET 3.5 SP1 (http://www.microsoft.com/en-us/download/details.aspx?id=3138) should be around for as long as Windows 7 is supported. The correct compiler (VC9) is included in this SDK, but unfortunately does not install the vcvarsall.bat file that distutils expects. (Though it's pretty simple to add one that will switch on %1 and call the correct vcvars(86|64|...).bat.) The SDK needed for Python 3.3 and 3.4 (VC10) is even worse - there are many files missing. I'm hoping we'll be able to set up some sort of downloadable package/tool that will fix this. While we'd obviously love to move CPython onto our latest compilers, it's simply not possible (for good reason). Python 3.4 is presumably locked to VC10, but hopefully 3.5 will be able to use whichever version is current when that decision is made. > The basic problem is that the ABI changes. 
Therefore it's going to require > a complete new set of *all* C extensions for Windows, and the duplication > of download links for all those extensions from quite a few different vendors > is likely to confuse a lot of users. Specifically, the CRT changes. The CRT is an interesting mess of data structures that are exposed in header files, which means while you can have multiple CRTs loaded, they cannot touch each other's data structures at all or things will go bad/crash, and there's no nice way to set it up to avoid this (my colleague who currently owns MSVCRT suggested a not-very-nice way to do it, but I don't think it's going to be reliable enough). Python's stable ABI helps, but does not solve this problem. The file APIs are the worst culprits. The layout of FILE* objects can and do change between CRT versions, and file descriptors are simply indices into an array of these objects that is exposed through macros rather than function calls. As a result, you cannot mix either FILE pointers or file descriptors between CRTs. The only safe option is to build with the matching CRT, and for MSVCRT, this means with the matching compiler. It's unfortunate, and the responsible teams are well aware of the limitation, but it's history at this point, so we have no choice but to work with it. Cheers, Steve From p.f.moore at gmail.com Wed Nov 27 21:01:11 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 27 Nov 2013 20:01:11 +0000 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <529632E5.80907@mrabarnett.plus.com> References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> Message-ID: On 27 November 2013 17:59, MRAB wrote: >>>> help(str.startswith) > Help on method_descriptor: > > startswith(...) > S.startswith(prefix[, start[, end]]) -> bool Wow! I never knew that startswith/endswith accepted start and end arguments. Something new I learned today - thanks :-) Paul From denis.spir at gmail.com Wed Nov 27 22:30:09 2013 From: denis.spir at gmail.com (spir) Date: Wed, 27 Nov 2013 22:30:09 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <56D20EB4-F550-488B-842B-8333E07EBAF1@masklinn.net> References: <52960290.6090809@gmail.com> <56D20EB4-F550-488B-842B-8333E07EBAF1@masklinn.net> Message-ID: <52966461.1000109@gmail.com> On 11/27/2013 04:26 PM, Masklinn wrote: >> >In both cases, I guess ordinary idiomatic Python code actually _creates_ a new string object, as a substring of length 1 or more, which is otherwise useless > Have you tried using memoryviews? (or buffers for older python compat?) No, but I'll keep this idea aside if ever I really need improved performance in general. For now, I just tried to avoid unneeded object creations, and this at the core points of a parsing process. (This is more or less equivalent to a inner loop, except that for a parser it's rather the core of all scanning loop, the lowest-level matching routines.) 
Thank you for the suggestion, Denis From denis.spir at gmail.com Wed Nov 27 22:32:11 2013 From: denis.spir at gmail.com (spir) Date: Wed, 27 Nov 2013 22:32:11 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <698347D4-41B5-418D-BC0E-96F8ACF0862C@masklinn.net> References: <52960290.6090809@gmail.com> <56D20EB4-F550-488B-842B-8333E07EBAF1@masklinn.net> <698347D4-41B5-418D-BC0E-96F8ACF0862C@masklinn.net> Message-ID: <529664DB.9010009@gmail.com> On 11/27/2013 04:53 PM, Masklinn wrote: > On 2013-11-27, at 16:43 , Mark Lawrence wrote: >> >Python 3.3.3 on Windows 7 TypeError: memory view: str object does not have the buffer interface. > Yes, you need to operate on bytes to use memoryviews. Awa! I chose Python for its builtin ucode-string objects! ;-) Denis From denis.spir at gmail.com Wed Nov 27 22:34:16 2013 From: denis.spir at gmail.com (spir) Date: Wed, 27 Nov 2013 22:34:16 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> Message-ID: <52966558.4030002@gmail.com> On 11/27/2013 09:01 PM, Paul Moore wrote: > On 27 November 2013 17:59, MRAB wrote: >>>>> help(str.startswith) >> Help on method_descriptor: >> >> startswith(...) >> S.startswith(prefix[, start[, end]]) -> bool > > Wow! I never knew that startswith/endswith accepted start and end > arguments. Something new I learned today - thanks :-) You're not the only one ;-) I learnt it yesterday. Probably the method name misleads (not only you and i?. Denis From greg.ewing at canterbury.ac.nz Wed Nov 27 22:41:11 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 28 Nov 2013 10:41:11 +1300 Subject: [Python-ideas] Improving Clarity of re Module In-Reply-To: <87txexdhxn.fsf@uwakimon.sk.tsukuba.ac.jp> References: <5295C437.3020207@nedbatchelder.com> <87txexdhxn.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <529666F7.1030600@canterbury.ac.nz> Stephen J. Turnbull wrote: > Please, not "text"; I would expect that to be the target string, not a > substring. I would have expected 'string' to be the matched text, but it's not. So the most obvious name is already taken. :-( How about 'all'? -- Greg From ethan at stoneleaf.us Wed Nov 27 23:28:00 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 27 Nov 2013 14:28:00 -0800 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <52966558.4030002@gmail.com> References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> Message-ID: <529671F0.5040200@stoneleaf.us> On 11/27/2013 01:34 PM, spir wrote: > On 11/27/2013 09:01 PM, Paul Moore wrote: >> On 27 November 2013 17:59, MRAB wrote: >>> >>> --> help(str.startswith) >>> Help on method_descriptor: >>> >>> startswith(...) >>> S.startswith(prefix[, start[, end]]) -> bool >> >> Wow! I never knew that startswith/endswith accepted start and end >> arguments. Something new I learned today - thanks :-) > > You're not the only one ;-) I learnt it yesterday. Probably the method name misleads (not only you and i?. What's misleading about it? And how hard is it to look it up in the docs (or help)? That's what I do when I'm going to use something I haven't before. 
-- ~Ethan~ From python at mrabarnett.plus.com Wed Nov 27 23:56:28 2013 From: python at mrabarnett.plus.com (MRAB) Date: Wed, 27 Nov 2013 22:56:28 +0000 Subject: [Python-ideas] Improving Clarity of re Module In-Reply-To: <529666F7.1030600@canterbury.ac.nz> References: <5295C437.3020207@nedbatchelder.com> <87txexdhxn.fsf@uwakimon.sk.tsukuba.ac.jp> <529666F7.1030600@canterbury.ac.nz> Message-ID: <5296789C.4080706@mrabarnett.plus.com> On 27/11/2013 21:41, Greg Ewing wrote: > Stephen J. Turnbull wrote: >> Please, not "text"; I would expect that to be the target string, not a >> substring. > > I would have expected 'string' to be the matched text, > but it's not. So the most obvious name is already > taken. :-( > > How about 'all'? > All what? -1 Reading the docs, it refers to the entire "matching string", so how about "matching_string"? Or "matched_part"? From rymg19 at gmail.com Thu Nov 28 00:15:33 2013 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Wed, 27 Nov 2013 17:15:33 -0600 Subject: [Python-ideas] Improving Clarity of re Module In-Reply-To: <5296789C.4080706@mrabarnett.plus.com> References: <5295C437.3020207@nedbatchelder.com> <87txexdhxn.fsf@uwakimon.sk.tsukuba.ac.jp> <529666F7.1030600@canterbury.ac.nz> <5296789C.4080706@mrabarnett.plus.com> Message-ID: Sounds nice! Here are my names: matchedText matched allGroups On Wed, Nov 27, 2013 at 4:56 PM, MRAB wrote: > On 27/11/2013 21:41, Greg Ewing wrote: > >> Stephen J. Turnbull wrote: >> >>> Please, not "text"; I would expect that to be the target string, not a >>> substring. >>> >> >> I would have expected 'string' to be the matched text, >> but it's not. So the most obvious name is already >> taken. :-( >> >> How about 'all'? >> >> All what? > > -1 > > Reading the docs, it refers to the entire "matching string", so how > about "matching_string"? Or "matched_part"? > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -- Ryan When your hammer is C++, everything begins to look like a thumb. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Nov 28 00:30:23 2013 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 28 Nov 2013 10:30:23 +1100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <529671F0.5040200@stoneleaf.us> References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> Message-ID: On Thu, Nov 28, 2013 at 9:28 AM, Ethan Furman wrote: > On 11/27/2013 01:34 PM, spir wrote: >> >> On 11/27/2013 09:01 PM, Paul Moore wrote: >>> >>> On 27 November 2013 17:59, MRAB wrote: >>>> >>>> >>>> --> help(str.startswith) >>>> >>>> Help on method_descriptor: >>>> >>>> startswith(...) >>>> S.startswith(prefix[, start[, end]]) -> bool >>> >>> >>> Wow! I never knew that startswith/endswith accepted start and end >>> arguments. Something new I learned today - thanks :-) >> >> >> You're not the only one ;-) I learnt it yesterday. Probably the method >> name misleads (not only you and i?. > > > What's misleading about it? And how hard is it to look it up in the docs > (or help)? That's what I do when I'm going to use something I haven't > before. If I want to know if this bit of this string matches that bit of that string, I'm not going to look at startswith, unless I already know that it has this functionality - which until today I didn't. When would I go to look it up in the docs? 
Am I going to check help("".upper) to see if it can also convert digits to superscript? ChrisA From tjreedy at udel.edu Thu Nov 28 00:42:34 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 27 Nov 2013 18:42:34 -0500 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <52966461.1000109@gmail.com> References: <52960290.6090809@gmail.com> <56D20EB4-F550-488B-842B-8333E07EBAF1@masklinn.net> <52966461.1000109@gmail.com> Message-ID: On 11/27/2013 4:30 PM, spir wrote: > No, but I'll keep this idea aside if ever I really need improved > performance in general. For now, I just tried to avoid unneeded object > creations, and this at the core points of a parsing process. It is well known that the simplicity of 'everything is an object' has a cost, including the otherwise useless boxing and unboxing of temporary values that are machine objects (int, floats, char arrays). Others have already pointed out that the core developers who added .start(end)swith added optional params to avoid this. The Psyco (Python specializing compiler) extension package speeds appropriate Python code by more generally avoiding unneeded boxing. https://pypi.python.org/pypi/psyco/1.6 (I think it is Py2 only.) This extension was the basis for the PyPy JIT If you have a few specialized functions you need to accelerate, you can make your own extension. Cython apparently makes this fairly easy. And there are other tools. -- Terry Jan Reedy From tjreedy at udel.edu Thu Nov 28 00:59:17 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 27 Nov 2013 18:59:17 -0500 Subject: [Python-ideas] string codes & substring equality In-Reply-To: References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> Message-ID: On 11/27/2013 6:30 PM, Chris Angelico wrote: > On Thu, Nov 28, 2013 at 9:28 AM, Ethan Furman wrote: >> On 11/27/2013 01:34 PM, spir wrote: >>> >>> On 11/27/2013 09:01 PM, Paul Moore wrote: >>>> >>>> On 27 November 2013 17:59, MRAB wrote: >>>>> >>>>> >>>>> --> help(str.startswith) >>>>> >>>>> Help on method_descriptor: >>>>> >>>>> startswith(...) >>>>> S.startswith(prefix[, start[, end]]) -> bool >>>> >>>> >>>> Wow! I never knew that startswith/endswith accepted start and end >>>> arguments. Something new I learned today - thanks :-) >>> >>> >>> You're not the only one ;-) I learnt it yesterday. Probably the method >>> name misleads (not only you and i?. >> >> >> What's misleading about it? And how hard is it to look it up in the docs >> (or help)? That's what I do when I'm going to use something I haven't >> before. > > If I want to know if this bit of this string matches that bit of that > string, I'm not going to look at startswith, unless I already know > that it has this functionality - which until today I didn't. When > would I go to look it up in the docs? Am I going to check > help("".upper) to see if it can also convert digits to superscript? I have occasionally recommended on Python list that beginners read through the beginning chapters of the Library Reference, on builtins, just to see what is available. After several years, a refresher read would be helpful. Presuming that I once read the .start(end)swith signature, I had forgotten about the two params also. 
-- Terry Jan Reedy From ben+python at benfinney.id.au Thu Nov 28 01:19:02 2013 From: ben+python at benfinney.id.au (Ben Finney) Date: Thu, 28 Nov 2013 11:19:02 +1100 Subject: [Python-ideas] Improving Clarity of re Module References: <5295C437.3020207@nedbatchelder.com> <87txexdhxn.fsf@uwakimon.sk.tsukuba.ac.jp> <529666F7.1030600@canterbury.ac.nz> <5296789C.4080706@mrabarnett.plus.com> Message-ID: <7wiovdqsy1.fsf@benfinney.id.au> Ryan Gonzalez writes: > Sounds nice! > > Here are my names: > > matchedText > matched > allGroups If those suggestions are for a library you hope to be widely used in Python, or even part of the standard library, please make them PEP 8 conformant (avoid camelCaseNames). -- \ ?Our task must be to free ourselves from our prison by widening | `\ our circle of compassion to embrace all humanity and the whole | _o__) of nature in its beauty.? ?Albert Einstein | Ben Finney From ethan at stoneleaf.us Thu Nov 28 00:52:00 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 27 Nov 2013 15:52:00 -0800 Subject: [Python-ideas] string codes & substring equality In-Reply-To: References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> Message-ID: <529685A0.3010305@stoneleaf.us> On 11/27/2013 03:30 PM, Chris Angelico wrote: > > Am I going to check help("".upper) to see if it can > also convert digits to superscript? Heh, that would be cool. I retract (most of) my comment. start(end)swith is certainly not an obvious name for something that can also do random substring comparisons. Personally, I do search the docs when looking for functionality I don't know about. As there is no obvious substring method I would check the docs for all the string methods to see if any of them had any thing close to, or exactly, what I needed. But, hey, maybe that's just me. :) -- ~Ethan~ From rymg19 at gmail.com Thu Nov 28 02:24:42 2013 From: rymg19 at gmail.com (Ryan Gonzalez) Date: Wed, 27 Nov 2013 19:24:42 -0600 Subject: [Python-ideas] Improving Clarity of re Module In-Reply-To: <7wiovdqsy1.fsf@benfinney.id.au> References: <5295C437.3020207@nedbatchelder.com> <87txexdhxn.fsf@uwakimon.sk.tsukuba.ac.jp> <529666F7.1030600@canterbury.ac.nz> <5296789C.4080706@mrabarnett.plus.com> <7wiovdqsy1.fsf@benfinney.id.au> Message-ID: Whoops... matched_text matched all_groups Bad habit from C++... On Wed, Nov 27, 2013 at 6:19 PM, Ben Finney wrote: > Ryan Gonzalez writes: > > > Sounds nice! > > > > Here are my names: > > > > matchedText > > matched > > allGroups > > If those suggestions are for a library you hope to be widely used in > Python, or even part of the standard library, please make them PEP 8 > conformant (avoid camelCaseNames). > > -- > \ ?Our task must be to free ourselves from our prison by widening | > `\ our circle of compassion to embrace all humanity and the whole | > _o__) of nature in its beauty.? ?Albert Einstein | > Ben Finney > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -- Ryan When your hammer is C++, everything begins to look like a thumb. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From python at mrabarnett.plus.com Thu Nov 28 02:40:45 2013 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 28 Nov 2013 01:40:45 +0000 Subject: [Python-ideas] Improving Clarity of re Module In-Reply-To: References: <5295C437.3020207@nedbatchelder.com> <87txexdhxn.fsf@uwakimon.sk.tsukuba.ac.jp> <529666F7.1030600@canterbury.ac.nz> <5296789C.4080706@mrabarnett.plus.com> <7wiovdqsy1.fsf@benfinney.id.au> Message-ID: <52969F1D.4080006@mrabarnett.plus.com> On 28/11/2013 01:24, Ryan Gonzalez wrote: > Whoops... > > matched_text The tendency is to refer to it as a "string" rather than "text". > matched > all_groups > The complaint was that .group isn't clear enough, so "all_groups" is right out! :-) And there's already .groups... > Bad habit from C++... > > > On Wed, Nov 27, 2013 at 6:19 PM, Ben Finney > wrote: > > Ryan Gonzalez > writes: > > > Sounds nice! > > > > Here are my names: > > > > matchedText > > matched > > allGroups > > If those suggestions are for a library you hope to be widely used in > Python, or even part of the standard library, please make them PEP 8 > conformant (avoid camelCaseNames). > [snip] From rosuav at gmail.com Thu Nov 28 05:02:00 2013 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 28 Nov 2013 15:02:00 +1100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <529685A0.3010305@stoneleaf.us> References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> <529685A0.3010305@stoneleaf.us> Message-ID: On Thu, Nov 28, 2013 at 10:52 AM, Ethan Furman wrote: > On 11/27/2013 03:30 PM, Chris Angelico wrote: >> >> >> Am I going to check help("".upper) to see if it can >> also convert digits to superscript? > > > Heh, that would be cool. > > I retract (most of) my comment. start(end)swith is certainly not an obvious > name for something that can also do random substring comparisons. > > Personally, I do search the docs when looking for functionality I don't know > about. As there is no obvious substring method I would check the docs for > all the string methods to see if any of them had any thing close to, or > exactly, what I needed. > > But, hey, maybe that's just me. :) Yeah, that would be one good way to do it. But do you need a "substring comparison" method, or do you simply extract a substring and compare it (which is done with slicing)? Sometimes you don't need a single named way to do exactly what you want[1], you should just build from primitives. ChrisA [1] http://php.net/manual/en/function.gzgetss.php - why does this exist? From ethan at stoneleaf.us Thu Nov 28 05:09:50 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 27 Nov 2013 20:09:50 -0800 Subject: [Python-ideas] string codes & substring equality In-Reply-To: References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> <529685A0.3010305@stoneleaf.us> Message-ID: <5296C20E.1080306@stoneleaf.us> On 11/27/2013 08:02 PM, Chris Angelico wrote: > On Thu, Nov 28, 2013 at 10:52 AM, Ethan Furman wrote: >> On 11/27/2013 03:30 PM, Chris Angelico wrote: >>> >>> >>> Am I going to check help("".upper) to see if it can >>> also convert digits to superscript? >> >> >> Heh, that would be cool. >> >> I retract (most of) my comment. start(end)swith is certainly not an obvious >> name for something that can also do random substring comparisons. >> >> Personally, I do search the docs when looking for functionality I don't know >> about. 
As there is no obvious substring method I would check the docs for >> all the string methods to see if any of them had any thing close to, or >> exactly, what I needed. >> >> But, hey, maybe that's just me. :) > > Yeah, that would be one good way to do it. But do you need a > "substring comparison" method, or do you simply extract a substring > and compare it (which is done with slicing)? Sometimes you don't need > a single named way to do exactly what you want[1], you should just > build from primitives. Which is what I would start with. I wouldn't worry about the rest unless I was curious, or had profiled and knew there was a performance issue. -- ~Ethan~ From abarnert at yahoo.com Thu Nov 28 06:41:30 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 27 Nov 2013 21:41:30 -0800 (PST) Subject: [Python-ideas] Improving Clarity of re Module In-Reply-To: References: <5295C437.3020207@nedbatchelder.com> <87txexdhxn.fsf@uwakimon.sk.tsukuba.ac.jp> <529666F7.1030600@canterbury.ac.nz> <5296789C.4080706@mrabarnett.plus.com> <7wiovdqsy1.fsf@benfinney.id.au> Message-ID: <1385617290.48082.YahooMailNeo@web181002.mail.ne1.yahoo.com> From: Ryan Gonzalez >Whoops... > >matched_text >matched >all_groups > >Bad habit from C++... When did C++ change from for_each, find_if, and replace_copy to forEach, findIf, and replaceCopy? The entire C++ standard library, the C standard library that it incorporates by references, Boost and many related third-party libraries,?the POSIX API that's available on almost every platform (even Windows), etc. are all lowercase_with_underscores. Did you mean "Windows C API" when you said C++? From abarnert at yahoo.com Thu Nov 28 06:55:47 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 27 Nov 2013 21:55:47 -0800 (PST) Subject: [Python-ideas] string codes & substring equality In-Reply-To: References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> <529685A0.3010305@stoneleaf.us> Message-ID: <1385618147.50484.YahooMailNeo@web181004.mail.ne1.yahoo.com> From: Chris Angelico > and compare it (which is done with slicing)? Sometimes you don't need > a single named way to do exactly what you want[1], you should just > build from primitives. ? > [1] http://php.net/manual/en/function.gzgetss.php - why does this exist? Because PHP.?In Python there's one obvious way to do it. In Perl every possible way you could do it works. In PHP, there are three ways you can almost do something like it. Getting back to the original topic, and more seriously, I don't see why this is a problem. Having boxed single-character objects wouldn't be significantly faster than using strings for single characters. Especially for Unicode, where a character isn't a byte, but an abstract code point that can be represented as at least three different variable-length sequences, taking up to 6 bytes. (Not to mention that with Unicode,?half the time you want to do things like locale-based collation or searches that treat NFC and NFD the same or searches that treat Cyriliic small Es and Latin small C the same, etc., and if you think you don't, you've probably got a bug in your code. So character objects would be more of an attractive nuisance than a useful thing anyway.) If your code is really spending a significant amount of time building these single-character strings out of your slices, then any code that iterates over characters in a Python loop is almost certainly going to be way too slow no matter how you optimize it. 
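A minimal sketch of the NFC/NFD point above, using only the stdlib unicodedata module (the variable names are illustrative, not from the thread): the same user-perceived character can be spelled as different code point sequences, so a naive equality test or substring search misses matches unless both sides are normalized first.

import unicodedata

composed = "\u00ef"       # 'ï' as one precomposed code point
decomposed = "i\u0308"    # 'i' followed by COMBINING DIAERESIS

print(composed == decomposed)                                # False
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
print(unicodedata.normalize("NFD", composed) == decomposed)  # True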
From rosuav at gmail.com Thu Nov 28 07:05:40 2013 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 28 Nov 2013 17:05:40 +1100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <1385618147.50484.YahooMailNeo@web181004.mail.ne1.yahoo.com> References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> <529685A0.3010305@stoneleaf.us> <1385618147.50484.YahooMailNeo@web181004.mail.ne1.yahoo.com> Message-ID: On Thu, Nov 28, 2013 at 4:55 PM, Andrew Barnert wrote: > From: Chris Angelico >> and compare it (which is done with slicing)? Sometimes you don't need >> a single named way to do exactly what you want[1], you should just >> build from primitives. >> [1] http://php.net/manual/en/function.gzgetss.php - why does this exist? > > Because PHP. In Python there's one obvious way to do it. In Perl every possible way you could do it works. In PHP, there are three ways you can almost do something like it. Sure, but the point is still there. I picked up an extreme example by pointing to PHP, but it's still the same thing: the startswith function, given more parameters, is effectively equivalent to slicing and comparing. What is gained by having the method do both jobs in one wrapper? In this case, the answer might be performance, or it might be readability, and both can be argued. But it's certainly not a glaring hole; if startswith could ONLY check the beginning of a string, the push to _add_ this feature would be quite weak. > Especially for Unicode, where a character isn't a byte, but an abstract code point that can be represented as at least three different variable-length sequences, taking up to 6 bytes. No, a character is simply an integer. How it's represented is immaterial. The easiest representation in Python is a straight int, the easiest in C is probably also an int (32-bit; if it's 64-bit, you waste 40-odd bits, but it's still easiest); the variable length byte representations are for transmission/storage, not for manipulation. ChrisA From abarnert at yahoo.com Thu Nov 28 07:52:31 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 27 Nov 2013 22:52:31 -0800 Subject: [Python-ideas] string codes & substring equality In-Reply-To: References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> <529685A0.3010305@stoneleaf.us> <1385618147.50484.YahooMailNeo@web181004.mail.ne1.yahoo.com> Message-ID: On Nov 27, 2013, at 22:05, Chris Angelico wrote: > On Thu, Nov 28, 2013 at 4:55 PM, Andrew Barnert >> Especially for Unicode, where a character isn't a byte, but an abstract code point that can be represented as at least three different variable-length sequences, taking up to 6 bytes. > > No, a character is simply an integer. How it's represented is > immaterial. The easiest representation in Python is a straight int, > the easiest in C is probably also an int (32-bit; if it's 64-bit, you > waste 40-odd bits, but it's still easiest); the variable length byte > representations are for transmission/storage, not for manipulation The easiest representation of a Unicode character is a Unicode string. It's certainly easiest for the person writing and debugging Python code, who can call string methods like isdigit, print out the character or it's repr, etc. It's no harder for the person writing the Python implementation. 
If you mean easiest for the CPU, do you really think creating and dealing with arbitrary-length integers wrapped in structs with PyObject headers is easier than dealing with strings of 1/2/4-byte characters wrapped in structs with PyObject headers? From stephen at xemacs.org Thu Nov 28 07:58:21 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 28 Nov 2013 15:58:21 +0900 Subject: [Python-ideas] string codes & substring equality In-Reply-To: References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> <529685A0.3010305@stoneleaf.us> <1385618147.50484.YahooMailNeo@web181004.mail.ne1.yahoo.com> Message-ID: <87haaxc8s2.fsf@uwakimon.sk.tsukuba.ac.jp> Chris Angelico writes: > No, a character is simply an integer. How it's represented is > immaterial. To be pedantic, in Python, not quite immaterial. If you have the Latin 1 subset, you could cast it to bytes and use bytes-oriented APIs (eg, memoryview, bytearray) which would be faster in some cases. From rosuav at gmail.com Thu Nov 28 08:06:18 2013 From: rosuav at gmail.com (Chris Angelico) Date: Thu, 28 Nov 2013 18:06:18 +1100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> <529685A0.3010305@stoneleaf.us> <1385618147.50484.YahooMailNeo@web181004.mail.ne1.yahoo.com> Message-ID: On Thu, Nov 28, 2013 at 5:52 PM, Andrew Barnert wrote: > The easiest representation of a Unicode character is a Unicode string. Well, okay. The easiest representation other than as a one-character string. If you want to distinguish characters from strings. ChrisA From g.brandl at gmx.net Thu Nov 28 08:43:47 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Thu, 28 Nov 2013 08:43:47 +0100 Subject: [Python-ideas] Improving Clarity of re Module In-Reply-To: <1385617290.48082.YahooMailNeo@web181002.mail.ne1.yahoo.com> References: <5295C437.3020207@nedbatchelder.com> <87txexdhxn.fsf@uwakimon.sk.tsukuba.ac.jp> <529666F7.1030600@canterbury.ac.nz> <5296789C.4080706@mrabarnett.plus.com> <7wiovdqsy1.fsf@benfinney.id.au> <1385617290.48082.YahooMailNeo@web181002.mail.ne1.yahoo.com> Message-ID: Am 28.11.2013 06:41, schrieb Andrew Barnert: > From: Ryan Gonzalez > > >> Whoops... >> >> matched_text matched all_groups >> >> Bad habit from C++... > > > When did C++ change from for_each, find_if, and replace_copy to forEach, > findIf, and replaceCopy? The entire C++ standard library, the C standard > library that it incorporates by references, Boost and many related > third-party libraries, the POSIX API that's available on almost every > platform (even Windows), etc. are all lowercase_with_underscores. > > Did you mean "Windows C API" when you said C++? I don't think C++ stdlib and Boost are the only ways to program in C++ -- take Qt for example, it has the camelCase convention (and don't I wish it hadn't). 
Georg From abarnert at yahoo.com Thu Nov 28 09:00:32 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Thu, 28 Nov 2013 00:00:32 -0800 Subject: [Python-ideas] Improving Clarity of re Module In-Reply-To: References: <5295C437.3020207@nedbatchelder.com> <87txexdhxn.fsf@uwakimon.sk.tsukuba.ac.jp> <529666F7.1030600@canterbury.ac.nz> <5296789C.4080706@mrabarnett.plus.com> <7wiovdqsy1.fsf@benfinney.id.au> <1385617290.48082.YahooMailNeo@web181002.mail.ne1.yahoo.com> Message-ID: On Nov 27, 2013, at 23:43, Georg Brandl wrote: > Am 28.11.2013 06:41, schrieb Andrew Barnert: >> From: Ryan Gonzalez >> >> >>> Whoops... >>> >>> matched_text matched all_groups >>> >>> Bad habit from C++... >> >> >> When did C++ change from for_each, find_if, and replace_copy to forEach, >> findIf, and replaceCopy? The entire C++ standard library, the C standard >> library that it incorporates by references, Boost and many related >> third-party libraries, the POSIX API that's available on almost every >> platform (even Windows), etc. are all lowercase_with_underscores. >> >> Did you mean "Windows C API" when you said C++? > > I don't think C++ stdlib and Boost are the only ways to program in C++ -- > take Qt for example, it has the camelCase convention (and don't I wish it > hadn't). Well, sure, and PyQt and PySide have the same camelCase conventions in Python. From pcmanticore at gmail.com Thu Nov 28 11:46:55 2013 From: pcmanticore at gmail.com (Popa Claudiu) Date: Thu, 28 Nov 2013 12:46:55 +0200 Subject: [Python-ideas] Improving Clarity of re Module Message-ID: On 11/27/2013 11:06 AM, Ned Batchelder wrote: >* While we're at it, how can it be that we haven't improved the __repr__ after all *>* these years? *>>* >>> m = re.search("[ab]", "xay") *>* >>> m *>* <_sre.SRE_Match object at 0x10a2ce9f0> *>>* _sre? SRE_Match? huh? :) *>> * --Ned.* Actually, it's fixed in Python 3.4, see http://bugs.python.org/issue17087. Python 3.4.0b1 (default:9c7ab3e68243, Nov 26 2013, 19:42:06) [GCC 4.2.1 20070831 patched [FreeBSD]] on freebsd8 Type "help", "copyright", "credits" or "license" for more information. >>> import re >>> re.search('aaa', 'aaab') <_sre.SRE_Match object; span=(0, 3), match='aaa'> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu Nov 28 11:53:09 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 28 Nov 2013 21:53:09 +1100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <529685A0.3010305@stoneleaf.us> References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> <529685A0.3010305@stoneleaf.us> Message-ID: <20131128105308.GW2085@ando> On Wed, Nov 27, 2013 at 03:52:00PM -0800, Ethan Furman wrote: > On 11/27/2013 03:30 PM, Chris Angelico wrote: > > > >Am I going to check help("".upper) to see if it can > > also convert digits to superscript? > > Heh, that would be cool. > > I retract (most of) my comment. start(end)swith is certainly not an > obvious name for something that can also do random substring comparisons. startswith and endswith are not suitable for arbitrary substring comparisons. That same suggestion was made on the tutor list. 
The OP (Denis) gave an example like this: mystring = "abcde" assert mystring[1:-1] == "bcd" At first glance, using startswith for substring comparisons works fine: assert mystring.startswith("bcd", 1, -1) but alas startswith does the wrong thing: astr = "abcdefghijklmnopqrstuvwxyz" astr.startswith("bcd", 1, -1) == (astr[1:-1] == "bcd") => returns False -- Steven From denis.spir at gmail.com Thu Nov 28 12:43:05 2013 From: denis.spir at gmail.com (spir) Date: Thu, 28 Nov 2013 12:43:05 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <52960290.6090809@gmail.com> References: <52960290.6090809@gmail.com> Message-ID: <52972C49.1040909@gmail.com> All right, thank you all for the exchange, the issue of substring comparison for equality is solved, with either .startswith(substr, i) or .find(substr, i, j). But there remains the problem of getting codes (Unicode code points) at arbitrary indexes in a string. Is it weird to consider a .code(i) string method? What would be its implementation cost? I would really have good usage for it, and certainly numerous other use cases exist. PS: I would be ready to implement it myself, with the help of some dev to guide me through the code base. But it's stupid, since the method is so tiny & simple it would take far less time for such a dev to implement it directly himself. denis From denis.spir at gmail.com Thu Nov 28 13:39:39 2013 From: denis.spir at gmail.com (spir) Date: Thu, 28 Nov 2013 13:39:39 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> <529685A0.3010305@stoneleaf.us> <1385618147.50484.YahooMailNeo@web181004.mail.ne1.yahoo.com> Message-ID: <5297398B.90806@gmail.com> On 11/28/2013 07:05 AM, Chris Angelico wrote: > Sure, but the point is still there. I picked up an extreme example by > pointing to PHP, but it's still the same thing: the startswith > function, given more parameters, is effectively equivalent to slicing > and comparing. Rather, it is equivalent to comparing an interval without slicing. That's the whole point. * This is a semantic (conceptual) difference, because the operation is about comparing, not slicing. * This is a performance difference, relevant in this case because such comparisons form the core of a scanning/parsing process; every higher-level matching pattern is a composition of such low level substring comps and a little set of operations on codes. For individual, isolated operations, I would not care. In other words, startswith with a start index just does the right thing; unfortunately, its name does not say it, instead it misleads (me, at least). > What is gained by having the method do both jobs in one > wrapper? In this case, the answer might be performance, or it might be > readability, and both can be argued. But it's certainly not a glaring > hole; if startswith could ONLY check the beginning of a string, the > push to _add_ this feature would be quite weak. Yes. > >Especially for Unicode, where a character isn't a byte, but an abstract code point that can be represented as at least three different variable-length sequences, taking up to 6 bytes. > No, a character is simply an integer. How it's represented is > immaterial.
The easiest representation in Python is a straight int, > the easiest in C is probably also an int (32-bit; if it's 64-bit, you > waste 40-odd bits, but it's still easiest); the variable length byte > representations are for transmission/storage, not for manipulation. Right. Except the representation of characters properly speaking (rather than in the weird and polysemic Unicode sense) is a far more complicated issue. As you certainly know. Else, many other languages would probably have a decoded representation for textual data as a code string, like Python has. But this representation, intermediate between byte string and character string, is only the starting point of solving issues of character representation. To have a string of chars, in both everyday and traditional computing senses, one then needs to group codes into character "piles", normalise (NFD to avoid losing info) them, then sort codes inside these code piles. At this cost, one has a bi-univoque string of char reprs. I did this once (for and in language D). It's possible to have it efficient (2-3 times the cost of decoding), but remains a big computing task. Some of the issues can be illustrated by: s1 = "\u0062\u0069\u0308\u0062\u0069\u0302" # primary, decomposed repr of "bïbî" s2 = "\u0062\u00EF\u0062\u00EE" # precomposed repr of "bïbî" print(s1, s2) # bïbî bïbî -- all right! assert(s1.find("i") == 1) # incorrect: # there is here no representation of the character "i", # but a base code (a base mark), part of an actual char representation assert(s1.find("ï") == -1) # incorrect: # "\u0069\u0308" is the primary Unicode representation of "ï" assert(s1.find("\u0069\u0308") == 1) # correct: # (no comment) assert(s1.find("\u00EF") == -1) # incorrect: # this is another, precomposed repr of "ï" assert(s2.find("i") == -1) # correct: # no character "i" here assert(s2.find("\u00EF") == 1) # correct: # (no comment) assert(s2.find("ï") == 1) # correct: # there is a precomposed repr of "ï" assert(s2.find("\u0069\u0308") == -1) # incorrect : # "\u0069\u0308" is the primary Unicode representation of "ï" assert(s2.find("\u0308") == -1) # problematic : # how do I know there is here a char with an umlaut 'ï'? # see also https://en.wikipedia.org/wiki/Unicode_equivalence Denis From denis.spir at gmail.com Thu Nov 28 13:50:45 2013 From: denis.spir at gmail.com (spir) Date: Thu, 28 Nov 2013 13:50:45 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <20131128105308.GW2085@ando> References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> <529685A0.3010305@stoneleaf.us> <20131128105308.GW2085@ando> Message-ID: <52973C25.4010506@gmail.com> On 11/28/2013 11:53 AM, Steven D'Aprano wrote: > but alas startswith does the wrong thing: > > astr = "abcdefghijklmnopqrstuvwxyz" > astr.startswith("bcd", 1, -1) == (astr[1:-1] == "bcd") > => returns False Sorry, it works fine (I still don't understand your reasoning, Steven). You are here using a wrong end-index (-1). And anyway there is no question of end-index in substr comparison for equality, it's just the substring size. If you really want to use this second index for some reason I find myself unable to guess, a right way would be, I think: ss, i, n = "bcd", 1, len("bcd") astr.startswith(ss, i, i+n) == (astr[i:i+n] == ss) spir at ospir:~$ python3 Python 3.3.1 (default, Sep 25 2013, 19:29:01) [GCC 4.7.3] on linux Type "help", "copyright", "credits" or "license" for more information.
>>> astr = "abcdefghijklmnopqrstuvwxyz" >>> ss, i, n = "bcd", 1, len("bcd") >>> astr.startswith(ss, i, i+n) == (astr[i:i+n] == ss) True Denis From p.f.moore at gmail.com Thu Nov 28 14:22:21 2013 From: p.f.moore at gmail.com (Paul Moore) Date: Thu, 28 Nov 2013 13:22:21 +0000 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <5297398B.90806@gmail.com> References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> <529685A0.3010305@stoneleaf.us> <1385618147.50484.YahooMailNeo@web181004.mail.ne1.yahoo.com> <5297398B.90806@gmail.com> Message-ID: On 28 November 2013 12:39, spir wrote: > Right. Except the representation of characters properly speaking (rather > than in the weird and polysemic Unicode sense) is a far more complicated > issue. As you certainly know. Else, many other languages would probably have > a decoded representation for textual data as a code string, like Python has. > But this representation, intermediate between byte string and character > string, is only the starting point of solving issues of character > representation. To have a string of chars, in both everyday and traditional > computing senses, one then needs to group codes into character "piles", > normalise (NFD to avoid losing info) them, then sort codes inside these code > piles. At this cost, one has a bi-univoque string of char reprs. > I did this once (for and in language D). It's possible to have it efficient > (2-3 time the cost of decoding), but remains a big computing task. > > Some of the issues can be illustrated by: > > s1 = "\u0062\u0069\u0308\u0062\u0069\u0302" # primary, decomposed repr of > "bi?bi?" > s2 = "\u0062\u00EF\u0062\u00EE" # precomposed repr of "bi?bi?" > print(s1, s2) # bi?bi? b?b? -- all right! > > > assert(s1.find("i") == 1) # incorrect: > # there is here no representation of the character "i", > # but a base code (a base mark), part of an actual char representation My eyes glaze over at this level of Unicode, but shouldn't you be looking at the stuff in the unicodedata module? And possibly even some external 3rd party Unicode handling modules (if they exist)? I didn't think that Python handled the fancier levels of Unicode normalisation, collation, etc, as part of the native string type. Or ever claimed to. Paul From ncoghlan at gmail.com Thu Nov 28 14:32:34 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 28 Nov 2013 23:32:34 +1000 Subject: [Python-ideas] string codes & substring equality In-Reply-To: References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> <529685A0.3010305@stoneleaf.us> <1385618147.50484.YahooMailNeo@web181004.mail.ne1.yahoo.com> <5297398B.90806@gmail.com> Message-ID: On 28 November 2013 23:22, Paul Moore wrote: > My eyes glaze over at this level of Unicode, but shouldn't you be > looking at the stuff in the unicodedata module? And possibly even some > external 3rd party Unicode handling modules (if they exist)? I didn't > think that Python handled the fancier levels of Unicode normalisation, > collation, etc, as part of the native string type. Or ever claimed to. We don't - even in Python 3, strings are still just sequences of code points, not characters, graphemes or glyphs. And let's not even get into the implications of bidirectional text :) Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rymg19 at gmail.com Thu Nov 28 17:52:51 2013 From: rymg19 at gmail.com (Ryan) Date: Thu, 28 Nov 2013 10:52:51 -0600 Subject: [Python-ideas] Improving Clarity of re Module In-Reply-To: <1385617290.48082.YahooMailNeo@web181002.mail.ne1.yahoo.com> References: <5295C437.3020207@nedbatchelder.com> <87txexdhxn.fsf@uwakimon.sk.tsukuba.ac.jp> <529666F7.1030600@canterbury.ac.nz> <5296789C.4080706@mrabarnett.plus.com> <7wiovdqsy1.fsf@benfinney.id.au> <1385617290.48082.YahooMailNeo@web181002.mail.ne1.yahoo.com> Message-ID: It was the book I used to learn the language that got me into that habit. In fact, I've only used the Windows C API once, and I gave up because I couldn't handle the lack of exception handling. *shudders* Andrew Barnert wrote: >From: Ryan Gonzalez > > >>Whoops... >> >>matched_text >>matched >>all_groups >> >>Bad habit from C++... > > >When did C++ change from for_each, find_if, and replace_copy to >forEach, findIf, and replaceCopy? The entire C++ standard library, the >C standard library that it incorporates by references, Boost and many >related third-party libraries,?the POSIX API that's available on almost >every platform (even Windows), etc. are all lowercase_with_underscores. > >Did you mean "Windows C API" when you said C++? -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Thu Nov 28 17:57:42 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 28 Nov 2013 19:57:42 +0300 Subject: [Python-ideas] string.Template v2 Message-ID: http://docs.python.org/2/library/string.html#template-strings ## Original Idea stdlib lacks the most popular basic variable extension syntax "{{ variable }}" that can be found in Django [1], Jinja2 [2] and other templating engines [3]. ## stdlib Analysis string.Template syntax is ancient (dates back to Python 2.4 from 9 years ago). I haven't seen a template like this for a long time. ## Scope of Enhancement st = 'Hello {{world}}.' world = 'is not enough' t = Template(string, style='brace') t.render(locals()) ## Links 1. https://docs.djangoproject.com/en/dev/topics/templates/#variables 2. http://jinja.pocoo.org/docs/templates/#variables 3. http://mustache.github.io/ ## Feature Creeping # Allow to override {{ }} symbols to make it more generic. # `foo.bar` attribute lookup for 2D (nested) structures. Questions is it has to be supported: `foo.bar` in Django does dictionary lookup first, then attribute lookup `foo.bar` in Jinja2 does attribute lookup first I am not sure which is better. I definitely don't want some method or property on a dict passed to render() method to hide dict value. -- anatoly t. From victor.stinner at gmail.com Thu Nov 28 18:53:07 2013 From: victor.stinner at gmail.com (Victor Stinner) Date: Thu, 28 Nov 2013 18:53:07 +0100 Subject: [Python-ideas] string.Template v2 In-Reply-To: References: Message-ID: 2013/11/28 anatoly techtonik : > ## Scope of Enhancement > > st = 'Hello {{world}}.' > world = 'is not enough' > t = Template(string, style='brace') > t.render(locals()) I'm not sure that I understood correctly your proposition. Do you suggest to add a new module to the Python module or to enhance the existing string.Template class? In my opinion, string.Template alone is almost useless (it's probably why it is not used). Templates engines are much more powerful: cache results, loops, apply functions on variables, escape HTML, etc. 
IMO it's better to continue to develop templates outside Python stdlib. When a module enters the Python stdlib, its API is almost frozen during many years and you must keep the backward compatibility. It can be a blocker point if you want to enhance your module. It looks like template engines are moving fast, at least faster than the CPython stdlib. Template engines are also specific to a domain, especially to the web. You may need a different template engines for a different domain. Tell me if I'm wrong. Well, that's just my opinion. Others may like to enhance string.Template class to support the "{{name}}" syntax. -- If you want to include a whole template engine into the Python stdlib, the proposition should come from the maintainer of the module. We don't want to have a split (different API, different behaviour) between the version available in CPython and the version maintained on PyPI. And the proposition must be a full PEP describing the API, explain why it's better to get the engine into the stdlib, rationale, compare with other engines, etc. Victor From rymg19 at gmail.com Thu Nov 28 19:31:50 2013 From: rymg19 at gmail.com (Ryan) Date: Thu, 28 Nov 2013 12:31:50 -0600 Subject: [Python-ideas] string.Template v2 In-Reply-To: References: Message-ID: <9027c102-17b5-4ae5-979c-3d9ac02c3e16@email.android.com> Why not use string's .format? anatoly techtonik wrote: >http://docs.python.org/2/library/string.html#template-strings > >## Original Idea > >stdlib lacks the most popular basic variable extension syntax >"{{ variable }}" that can be found in Django [1], Jinja2 [2] and >other templating engines [3]. > >## stdlib Analysis > >string.Template syntax is ancient (dates back to Python 2.4 >from 9 years ago). I haven't seen a template like this for a long time. > >## Scope of Enhancement > >st = 'Hello {{world}}.' >world = 'is not enough' >t = Template(string, style='brace') >t.render(locals()) > >## Links > >1. https://docs.djangoproject.com/en/dev/topics/templates/#variables >2. http://jinja.pocoo.org/docs/templates/#variables >3. http://mustache.github.io/ > >## Feature Creeping > ># Allow to override {{ }} symbols to make it more generic. > ># `foo.bar` attribute lookup for 2D (nested) structures. > >Questions is it has to be supported: >`foo.bar` in Django does dictionary lookup first, then attribute lookup > `foo.bar` in Jinja2 does attribute lookup first > >I am not sure which is better. I definitely don't want some method or >property on a dict passed to render() method to hide dict value. >-- >anatoly t. >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >https://mail.python.org/mailman/listinfo/python-ideas -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From rymg19 at gmail.com Thu Nov 28 19:31:51 2013 From: rymg19 at gmail.com (Ryan) Date: Thu, 28 Nov 2013 12:31:51 -0600 Subject: [Python-ideas] string.Template v2 In-Reply-To: References: Message-ID: Why not use string's .format? anatoly techtonik wrote: >http://docs.python.org/2/library/string.html#template-strings > >## Original Idea > >stdlib lacks the most popular basic variable extension syntax >"{{ variable }}" that can be found in Django [1], Jinja2 [2] and >other templating engines [3]. > >## stdlib Analysis > >string.Template syntax is ancient (dates back to Python 2.4 >from 9 years ago). I haven't seen a template like this for a long time. 
> >## Scope of Enhancement > >st = 'Hello {{world}}.' >world = 'is not enough' >t = Template(string, style='brace') >t.render(locals()) > >## Links > >1. https://docs.djangoproject.com/en/dev/topics/templates/#variables >2. http://jinja.pocoo.org/docs/templates/#variables >3. http://mustache.github.io/ > >## Feature Creeping > ># Allow to override {{ }} symbols to make it more generic. > ># `foo.bar` attribute lookup for 2D (nested) structures. > >Questions is it has to be supported: >`foo.bar` in Django does dictionary lookup first, then attribute lookup > `foo.bar` in Jinja2 does attribute lookup first > >I am not sure which is better. I definitely don't want some method or >property on a dict passed to render() method to hide dict value. >-- >anatoly t. >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >https://mail.python.org/mailman/listinfo/python-ideas -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: 
> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas From ethan at stoneleaf.us Thu Nov 28 21:50:50 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 28 Nov 2013 12:50:50 -0800 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <20131128105308.GW2085@ando> References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> <529685A0.3010305@stoneleaf.us> <20131128105308.GW2085@ando> Message-ID: <5297ACAA.2070803@stoneleaf.us> On 11/28/2013 02:53 AM, Steven D'Aprano wrote: > On Wed, Nov 27, 2013 at 03:52:00PM -0800, Ethan Furman wrote: >> On 11/27/2013 03:30 PM, Chris Angelico wrote: >>> >>> Am I going to check help("".upper) to see if it can >>> also convert digits to superscript? >> >> Heh, that would be cool. >> >> I retract (most of) my comment. start(end)swith is certainly not an >> obvious name for something that can also do random substring comparisons. > > startswith and endswith are not suitable for arbitrary substring > comparisons. Sure they are. Just do it right. :) > That same suggestion was made on the tutor list. The OP (Denis) gave an > example like this: > > mystring = "abcde" > assert mystring[1:-1] == "bcd" > > At first glance, using startswith for substring comparisons works fine: > > assert mystring.startswith("bcd", 1, -1) > > but alas startswith does the wrong thing: > > astr = "abcdefghijklmnopqrstuvwxyz" > astr.startswith("bcd", 1, -1) == (astr[1:-1] == "bcd") > => returns False Which simply shows an easy mistake to make. The proper way, if using startswith (or endswith) is to be careful of the length of both pieces. -- ~Ethan~ From joshua at landau.ws Thu Nov 28 23:05:51 2013 From: joshua at landau.ws (Joshua Landau) Date: Thu, 28 Nov 2013 22:05:51 +0000 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <1385618147.50484.YahooMailNeo@web181004.mail.ne1.yahoo.com> References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> <529685A0.3010305@stoneleaf.us> <1385618147.50484.YahooMailNeo@web181004.mail.ne1.yahoo.com> Message-ID: On 28 November 2013 05:55, Andrew Barnert wrote: > Getting back to the original topic, and more seriously, I don't see why this is a problem. Having boxed single-character objects wouldn't be significantly faster than using strings for single characters. Especially for Unicode, where a character isn't a byte, but an abstract code point that can be represented as at least three different variable-length sequences, taking up to 6 bytes. (Not to mention that with Unicode, half the time you want to do things like locale-based collation or searches that treat NFC and NFD the same or searches that treat Cyriliic small Es and Latin small C the same, etc., and if you think you don't, you've probably got a bug in your code. So character objects would be more of an attractive nuisance than a useful thing anyway.) > > If your code is really spending a significant amount of time building these single-character strings out of your slices, then any code that iterates over characters in a Python loop is almost certainly going to be way too slow no matter how you optimize it. Talking about substring comparisons, this isn't true. Building the string is O(n), comparing it is amortized O(1) for nonequal strings. 
From greg.ewing at canterbury.ac.nz Thu Nov 28 23:56:18 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Nov 2013 11:56:18 +1300 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <52972C49.1040909@gmail.com> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> Message-ID: <5297CA12.7050607@canterbury.ac.nz> spir wrote: > Is it weird to consider a .code(i) string method? Another approach would be to keep a cache of single-char strings, like we do for small integers. -- Greg From rosuav at gmail.com Thu Nov 28 23:59:36 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 29 Nov 2013 09:59:36 +1100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <5297CA12.7050607@canterbury.ac.nz> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <5297CA12.7050607@canterbury.ac.nz> Message-ID: On Fri, Nov 29, 2013 at 9:56 AM, Greg Ewing wrote: > spir wrote: >> >> Is it weird to consider a .code(i) string method? > > > Another approach would be to keep a cache of single-char > strings, like we do for small integers. Would you really keep a cache of 1114112 string objects around? Seems a bit of overkill. ChrisA From techtonik at gmail.com Fri Nov 29 00:16:37 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 29 Nov 2013 02:16:37 +0300 Subject: [Python-ideas] string.Template v2 In-Reply-To: <52979FB7.7040100@nedbatchelder.com> References: <52979FB7.7040100@nedbatchelder.com> Message-ID: On Thu, Nov 28, 2013 at 10:55 PM, Ned Batchelder wrote: > On 11/28/13 11:57 AM, anatoly techtonik wrote: >> >> http://docs.python.org/2/library/string.html#template-strings >> >> ## Original Idea >> >> stdlib lacks the most popular basic variable extension syntax >> "{{ variable }}" that can be found in Django [1], Jinja2 [2] and >> other templating engines [3]. > > Anatoly, you've been participating here long enough to know that an > important question to answer is: why does the stdlib need this > functionality? As you point out here, it is already available from popular > third-party packages. Why not just let them provide the functionality, and > leave it at that? stdlib doesn't need the functionality provided by current string,Template, and if stdlib comes with batteries included, it is better to ship Alkaline than NiCd. Support for {{ }} syntax and attribute lookup is a way to provide something simple from the start, that can be later enhanced by using dependencies, but without rewriting everything to a new syntax. -- anatoly t. From techtonik at gmail.com Fri Nov 29 00:25:26 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 29 Nov 2013 02:25:26 +0300 Subject: [Python-ideas] string.Template v2 In-Reply-To: <9027c102-17b5-4ae5-979c-3d9ac02c3e16@email.android.com> References: <9027c102-17b5-4ae5-979c-3d9ac02c3e16@email.android.com> Message-ID: On Thu, Nov 28, 2013 at 9:31 PM, Ryan wrote: > Why not use string's .format? Good question. I'd say format language is too complicated. It is the same cryptic printf-like char-micromanagement language syntax, where every byte counts even if unreadable. I don't know why it was introduced. Perhaps there was no other way, but it looks more complicated than common templating engine conventions. I'd say it is not the best syntax, and its API is not common. 
From greg.ewing at canterbury.ac.nz Fri Nov 29 00:29:29 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Nov 2013 12:29:29 +1300 Subject: [Python-ideas] string codes & substring equality In-Reply-To: References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <5297CA12.7050607@canterbury.ac.nz> Message-ID: <5297D1D9.9020404@canterbury.ac.nz> Chris Angelico wrote: > Would you really keep a cache of 1114112 string objects around? Well, not *exactly* like the int cache... the idea would be to remember ones that had been used, not pre-generate all possible 1-char strings. -- Greg From rosuav at gmail.com Fri Nov 29 00:35:38 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 29 Nov 2013 10:35:38 +1100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <5297D1D9.9020404@canterbury.ac.nz> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <5297CA12.7050607@canterbury.ac.nz> <5297D1D9.9020404@canterbury.ac.nz> Message-ID: On Fri, Nov 29, 2013 at 10:29 AM, Greg Ewing wrote: > Chris Angelico wrote: >> >> Would you really keep a cache of 1114112 string objects around? > > > Well, not *exactly* like the int cache... the idea > would be to remember ones that had been used, not > pre-generate all possible 1-char strings. That might have some value, but there'd still be cost to creating those small strings, so it gains little and can still potentially be a large cache. Personally, I wouldn't bother. Let 'em be like any other string. ChrisA From tjreedy at udel.edu Fri Nov 29 00:36:36 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 28 Nov 2013 18:36:36 -0500 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <52972C49.1040909@gmail.com> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> Message-ID: On 11/28/2013 6:43 AM, spir wrote: > All right, thank you all for the exchange, the issue of substring > comparison for equality is solved, with either .startswith(substr, i) or > .find(substr, i,j). But there remain the problem of getting codes > (unicodes code point) at arbitrary indexes in a string? Do you mean ord(code[i])? We already have that. > Is it weird to consider a .code(i) string method? No and yes. There are hundreds, thousands of simple compositions that different people might like baked into the language to speed a particular application. Some numerical users might like Python to have the C equivalent of def muladd(a,b,c): return a * b + c # or maybe def muladd(a,b,d): return a + b * c > What would be its implementation cost? What would be the implementation, maintenance, learning, and usability cost of adding thousands of such little methods? > I would really have good usage for it, I believe use of ord is rather rare, as builtins go. In 2.7, it works with both (byte) strings and unicode. In 3.3, it does not work with bytes as indexing directly returns ordinals (b'abc'[1] == 98). So if the text you are parsing is limited to ascii or and small ascii superset, such as latin-1, you might do better using the bytes encoding. If your text potentially includes and unicode char and if you have measurements that show the the extra cost of the intermediate single char is really a bottleneck, then add the composed function privately. Or perhaps you could use ctypes to access the innards of a string and see if that is faster. > certainly numerous other use cases exist. More that a hand wave is needed to demonstrate that. 
-- Terry Jan Reedy From steve at pearwood.info Fri Nov 29 01:12:48 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 29 Nov 2013 11:12:48 +1100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <5297ACAA.2070803@stoneleaf.us> References: <52960290.6090809@gmail.com> <529632E5.80907@mrabarnett.plus.com> <52966558.4030002@gmail.com> <529671F0.5040200@stoneleaf.us> <529685A0.3010305@stoneleaf.us> <20131128105308.GW2085@ando> <5297ACAA.2070803@stoneleaf.us> Message-ID: <20131129001248.GY2085@ando> On Thu, Nov 28, 2013 at 12:50:50PM -0800, Ethan Furman wrote: > >startswith and endswith are not suitable for arbitrary substring > >comparisons. > > Sure they are. Just do it right. :) At the point you "do it right" you've lost any advantage over slicing, except in the extreme case that the slice you are interested in is so enormous that the copying cost is extreme. And that is the point of the OP's post, he believes that slicing is inefficient because it creates a new string object. (I'm giving him the benefit of the doubt that he has profiled his application and that this actually is the case.) Having said that, I suppose it's up to me to prove what I say with some benchmarks... py> def match(s, substr, start=0, end=None): ... if end is None: end = len(s) ... if start < 0: start += len(s) ... if end < 0: end += len(s) ... if len(substr) != end - start: return False ... return s.startswith(substr, start, end) ... py> match("abcde", "bcd", 1, -1) True py> match("abcdefg", "bcd", 1, -1) False And some benchmarks: py> from timeit import Timer py> setup = "from __main__ import match" py> t1 = Timer("match('abcdef', 'cde', 2, -1)", setup) py> t2 = Timer("s[2:-1] == 'cde'", "s = 'abcdef'") py> min(t1.repeat(repeat=5)) 1.2987589836120605 py> min(t2.repeat(repeat=5)) 0.25656223297119141 Slicing is about three times faster. [...] > Which simply shows an easy mistake to make. The proper way, if using > startswith (or endswith) is to be careful of the length of both pieces. My point exactly. startswith does not test *substring equality*, which is what the OP is asking for, but *prefix equality*. Only in the case where the length of the substring equals the length of the prefix are they the same thing. -- Steven From steve at pearwood.info Fri Nov 29 01:45:27 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 29 Nov 2013 11:45:27 +1100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <52972C49.1040909@gmail.com> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> Message-ID: <20131129004527.GZ2085@ando> On Thu, Nov 28, 2013 at 12:43:05PM +0100, spir wrote: > All right, thank you all for the exchange, the issue of substring > comparison for equality is solved, with either .startswith(substr, i) or > .find(substr, i,j). But there remain the problem of getting codes (unicodes > code point) at arbitrary indexes in a string? ord(s[i]) is the accepted solution to that. > Is it weird to consider a .code(i) string method? Such a method should not be called "code", since "ordinal" or "ord" is the accepted term for it. Should strings have an ord() method? Disadvantages: - another piece of code to be written, debugged, maintained, documented; - another thing for users to learn; - cognitive load of having to decide whether to use the ord() method or the ord() function. Advantage: - you save the cost of extracting a one-character string before passing it to the ord() function. In this case, both the disadvantages and advantages are tiny. 
That being the case, I would expect that unless somebody else goes "Yes! That's exactly what I need too!" and is motiviated to write the patch for you, the only way this has *any* chance of happening is for you to write the patch yourself. That means: - write the code; - test that it doesn't break anything; - write tests for it; - write documentation for it; and most importantly: - write benchmarks that demonstrate that calling your str.ord(i) method really is faster than calling ord(s[i]). When you have to do all that work yourself, you will soon see that it's perhaps not as "tiny & simple" as when somebody else does the work. On balance, is the benefit greater than the cost? I think it is a close call, balanced on a knife-edge, but having benchmarked it in Python 3.3 I think that perhaps there could be some on balance a tiny nett benefit. Here is my benchmark: py> from timeit import Timer py> setup = "s = 'abcdef'" py> t1 = Timer("ord('c')") # establish a base-mark of calling ord py> t2 = Timer("ord(s[2])", setup) py> min(t1.repeat(repeat=5)) 0.13925810158252716 py> min(t2.repeat(repeat=5)) 0.2207092922180891 The difference is the cost of creating a single character string before taking the ordinal value of it. Still, that cost is tiny: less than 0.1 microseconds on my machine. On my PC, I could extract ten million such ordinals before the total cost exceeded one second. I find it difficult to see that this cost could be a bottleneck in any real-world application, but still, in Python 3.3 it seems to be a reasonable micro-optimization to have an ord method. But even if there is such a benefit, the benefit is so small that I have no interest in pushing for it. I have more important things to work on. +0 on a str.ord method. -- Steven From steve at pearwood.info Fri Nov 29 01:58:14 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 29 Nov 2013 11:58:14 +1100 Subject: [Python-ideas] string.Template v2 In-Reply-To: References: Message-ID: <20131129005814.GA2085@ando> On Thu, Nov 28, 2013 at 06:53:07PM +0100, Victor Stinner wrote: > In my opinion, string.Template alone is almost useless (it's probably > why it is not used). I've used it. In my opinion, string.Template is not aimed at the programmer, but at the end-user. As a programmer, string.Template is very underpowered and much less convenient than either % string interpolation or str.format. But as an end-user, sometimes all I need is basic string templating without all the complicated features of % and {} formatting. For that simple use-case $name templates are all I want and a much nicer look than Anatoly's suggested {{name}}. If Anatoly wishes to create a new templating engine using {{}}, I encourage him to do so, but he should leave string.Template alone. (I'm not opposed to enhancements to Template, but I am opposed to radically overhauling it into something completely different.) -- Steven From stephen at xemacs.org Fri Nov 29 02:13:04 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 29 Nov 2013 10:13:04 +0900 Subject: [Python-ideas] string.Template v2 In-Reply-To: References: <9027c102-17b5-4ae5-979c-3d9ac02c3e16@email.android.com> Message-ID: <87bo14c8nz.fsf@uwakimon.sk.tsukuba.ac.jp> anatoly techtonik writes: > Good question. I'd say format language is too complicated. It is the > same cryptic printf-like char-micromanagement language syntax, where > every byte counts even if unreadable. 
In fact, .format() is *not* complicated in its basic use: "I propose a fine of ${payment} for not reading the docs.".format(payment=100) (Note that this isn't Perl, the "$" is a literal. :-) But when you don't have a TABLE element available for formatting tables, width, precision, base, and the like are *necessary* information for nice output. Do you really think it's readable to write "I propose a fine of ${payment: type=float precision=2} for not reading the docs.".format(payment=100) vs. "I propose a fine of ${payment:.2f} for not reading the docs.".format(payment=100) .format() provides simple syntax for simple tasks, and more complex syntax for tasks requiring more precision of expression. It's true that the marshalling of substitution values (either as keyword arguments or in a **kw dictionary) is exposed here in the invocation of .format() itself, but you don't avoid marshalling in Django or other web frameworks, it's just in a different place with somewhat different requirements. Web framework templating languages do avoid "cryptic printf-like char-micromanagement", but they can do that because they delegate the styling to HTML and CSS. It's true that such template languages allow things like object attribute access, but there are severe restrictions, and deciding on the restrictions is a very subtle matter. It's generally agreed that separation of business logic from presentation is a Good Thing[tm], so allowing attribute access and filtering is *apparently* a step in the Wrong direction. Where "apparent" becomes "actual" is unclear, especially for the general-purpose facility that Python must provide. Although AFAIK attribute access and filter syntax are common features of popular templating languages, I don't think it's a good idea for Python to advocate a particular set of features in its format language when the set of commonly accepted features is quite restricted (eg, one level of object attribute access), so in practice templating languages are not going to go away. On the Python side, Python cannot assume that the output language will be a structured markup language with styling features; the low-level formatting syntax is still needed. -1 on adding more templating features to Python's stdlib; .format() already hits the sweet spot given current best practice IMO. From tjreedy at udel.edu Fri Nov 29 02:57:57 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 28 Nov 2013 20:57:57 -0500 Subject: [Python-ideas] string codes & substring equality In-Reply-To: References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <5297CA12.7050607@canterbury.ac.nz> Message-ID: On 11/28/2013 5:59 PM, Chris Angelico wrote: > On Fri, Nov 29, 2013 at 9:56 AM, Greg Ewing wrote: >> spir wrote: >>> >>> Is it weird to consider a .code(i) string method? >> >> >> Another approach would be to keep a cache of single-char >> strings, like we do for small integers. > > Would you really keep a cache of 1114112 string objects around? Seems > a bit of overkill. I believe py2 *does* have a cache of 256 single byte strings. There *might* be such a cache of the same size for the first 256 unicode single char strings. 
-- Terry Jan Reedy From tjreedy at udel.edu Fri Nov 29 03:08:20 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 28 Nov 2013 21:08:20 -0500 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <20131129004527.GZ2085@ando> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <20131129004527.GZ2085@ando> Message-ID: On 11/28/2013 7:45 PM, Steven D'Aprano wrote: > On Thu, Nov 28, 2013 at 12:43:05PM +0100, spir wrote: >> All right, thank you all for the exchange, the issue of substring >> comparison for equality is solved, with either .startswith(substr, i) or >> .find(substr, i,j). But there remain the problem of getting codes (unicodes >> code point) at arbitrary indexes in a string? > > ord(s[i]) is the accepted solution to that. > >> Is it weird to consider a .code(i) string method? > > Such a method should not be called "code", since "ordinal" or "ord" is > the accepted term for it. > > Should strings have an ord() method? > > Disadvantages: > > - another piece of code to be written, debugged, maintained, documented; > > - another thing for users to learn; > > - cognitive load of having to decide whether to use the ord() method or > the ord() function. > > > Advantage: > > - you save the cost of extracting a one-character string before passing > it to the ord() function. > > > In this case, both the disadvantages and advantages are tiny. That being > the case, I would expect that unless somebody else goes "Yes! That's > exactly what I need too!" and is motiviated to write the patch for you, > the only way this has *any* chance of happening is for you to write the > patch yourself. That means: > > - write the code; > - test that it doesn't break anything; > - write tests for it; > - write documentation for it; > > and most importantly: > > - write benchmarks that demonstrate that calling your str.ord(i) method > really is faster than calling ord(s[i]). > > When you have to do all that work yourself, you will soon see that it's > perhaps not as "tiny & simple" as when somebody else does the work. > > On balance, is the benefit greater than the cost? I think it is a close > call, balanced on a knife-edge, but having benchmarked it in Python 3.3 > I think that perhaps there could be some on balance a tiny nett benefit. > > Here is my benchmark: > > py> from timeit import Timer > py> setup = "s = 'abcdef'" > py> t1 = Timer("ord('c')") # establish a base-mark of calling ord > py> t2 = Timer("ord(s[2])", setup) > py> min(t1.repeat(repeat=5)) > 0.13925810158252716 > py> min(t2.repeat(repeat=5)) > 0.2207092922180891 Thanks for real data. > The difference is the cost of creating a single character string before > taking the ordinal value of it. > > Still, that cost is tiny: less than 0.1 microseconds on my machine. On > my PC, I could extract ten million such ordinals before the total cost > exceeded one second. I find it difficult to see that this cost could be > a bottleneck in any real-world application, but still, in Python 3.3 it > seems to be a reasonable micro-optimization to have an ord method. From my reading of developer discussions on the tracker (and pydev), I believe most would consider .1 microsecond too little gain for adding a new (duplicate) string method. > But even if there is such a benefit, the benefit is so small that I have > no interest in pushing for it. I have more important things to work on. > > +0 on a str.ord method. 
-- Terry Jan Reedy From ncoghlan at gmail.com Fri Nov 29 03:48:47 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 29 Nov 2013 12:48:47 +1000 Subject: [Python-ideas] string.Template v2 In-Reply-To: <87bo14c8nz.fsf@uwakimon.sk.tsukuba.ac.jp> References: <9027c102-17b5-4ae5-979c-3d9ac02c3e16@email.android.com> <87bo14c8nz.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On 29 November 2013 11:13, Stephen J. Turnbull wrote: > Although AFAIK attribute access and filter syntax are common features > of popular templating languages, I don't think it's a good idea for > Python to advocate a particular set of features in its format language > when the set of commonly accepted features is quite restricted (eg, > one level of object attribute access), so in practice templating > languages are not going to go away. On the Python side, Python cannot > assume that the output language will be a structured markup language > with styling features; the low-level formatting syntax is still > needed. > > -1 on adding more templating features to Python's stdlib; .format() > already hits the sweet spot given current best practice IMO. As is frequently the case, Anatoly has failed to do his research on what is already possible and the rationale for the status quo before proposing changes. This is in spite of repeated requests (over a number of years) that he stop wasting people's time on the core development lists. 1. The string.Template syntax is aimed at document translators and other non-developer string formatting use cases. It is not really intended for programmatic use. This is covered explicitly in PEP 292 (which added string.Template) 2. The str.format mini-language *is* designed for programmatic use, and already offers attribute and item access as part of element substitution (see PEP 3101 and the standard library documentation). There's essentially zero chance of a fourth approach to string formatting being added to the standard library - third party libraries remain free to do whatever they want, though. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ben+python at benfinney.id.au Fri Nov 29 04:15:38 2013 From: ben+python at benfinney.id.au (Ben Finney) Date: Fri, 29 Nov 2013 14:15:38 +1100 Subject: [Python-ideas] string.Template v2 References: <9027c102-17b5-4ae5-979c-3d9ac02c3e16@email.android.com> <87bo14c8nz.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <7wob53q4o5.fsf@benfinney.id.au> "Stephen J. Turnbull" writes: > But when you don't have a TABLE element available for formatting > tables, width, precision, base, and the like are *necessary* > information for nice output. Do you really think it's readable to > write > > "I propose a fine of ${payment: type=float precision=2} for not reading the docs.".format(payment=100) > > vs. > > "I propose a fine of ${payment:.2f} for not reading the docs.".format(payment=100) The first is far more readable, because it is explicit and doesn't require special knowledge to know what the parameters are. An uninformed guess by a casual reader is much more likely to be right in the first case; I think that's an important criterion for readability. Which is not to say I *prefer* the first one. I think readability can be sacrificed in some cases where the benefit of concision is high, and a template paramter format is one of those cases IMO. But we need to acknowledge that concision and readability are sometimes conflicting goals. I think it's clear that your examples illustrate that conflict. 
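For reference, only the concise spelling is real; the long form above is hypothetical. The spec after the colon is the same mini-language the format() builtin accepts, which is one way to unpack the terseness when it bothers you:

>>> "I propose a fine of ${payment:.2f} for not reading the docs.".format(payment=100)
'I propose a fine of $100.00 for not reading the docs.'
>>> format(100, ".2f")    # the part after the colon, on its own
'100.00'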
> -1 on adding more templating features to Python's stdlib; .format() > already hits the sweet spot given current best practice IMO. Definitely agreed. -- \ ?Always code as if the guy who ends up maintaining your code | `\ will be a violent psychopath who knows where you live.? ?John | _o__) F. Woods | Ben Finney From greg.ewing at canterbury.ac.nz Fri Nov 29 05:57:26 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Nov 2013 17:57:26 +1300 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <20131129004527.GZ2085@ando> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <20131129004527.GZ2085@ando> Message-ID: <52981EB6.50003@canterbury.ac.nz> Steven D'Aprano wrote: > On Thu, Nov 28, 2013 at 12:43:05PM +0100, spir wrote: > >>Is it weird to consider a .code(i) string method? > > Such a method should not be called "code", since "ordinal" or "ord" is > the accepted term for it. How about giving ord() an optional index parameter? ord(s, i) would be equivalent to ord(s[i]) but would avoid creating an intermediate string object. -- Greg From rymg19 at gmail.com Fri Nov 29 06:01:07 2013 From: rymg19 at gmail.com (Ryan) Date: Thu, 28 Nov 2013 23:01:07 -0600 Subject: [Python-ideas] string.Template v2 In-Reply-To: References: <9027c102-17b5-4ae5-979c-3d9ac02c3e16@email.android.com> Message-ID: Nah, format just takes getting used to, like anything else would. And you don't need to use half the language; Python comes with enough builtin attributes on strings. anatoly techtonik wrote: >On Thu, Nov 28, 2013 at 9:31 PM, Ryan wrote: >> Why not use string's .format? > >Good question. I'd say format language is too complicated. It is the >same cryptic printf-like char-micromanagement language syntax, where >every byte counts even if unreadable. I don't know why it was >introduced. Perhaps there was no other way, but it looks more >complicated than common templating engine conventions. I'd say it is >not the best syntax, and its API is not common. -- Sent from my Android device with K-9 Mail. Please excuse my brevity. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Fri Nov 29 05:55:28 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 29 Nov 2013 13:55:28 +0900 Subject: [Python-ideas] string.Template v2 In-Reply-To: <7wob53q4o5.fsf@benfinney.id.au> References: <9027c102-17b5-4ae5-979c-3d9ac02c3e16@email.android.com> <87bo14c8nz.fsf@uwakimon.sk.tsukuba.ac.jp> <7wob53q4o5.fsf@benfinney.id.au> Message-ID: <874n6vdcxr.fsf@uwakimon.sk.tsukuba.ac.jp> Ben Finney writes: > "Stephen J. Turnbull" > writes: > > > But when you don't have a TABLE element available for formatting > > tables, width, precision, base, and the like are *necessary* > > information for nice output. Do you really think it's readable to > > write > > > > "I propose a fine of ${payment: type=float precision=2} for not reading the docs.".format(payment=100) > > > > vs. > > > > "I propose a fine of ${payment:.2f} for not reading the docs.".format(payment=100) > > The first is far more readable, because it is explicit and doesn't > require special knowledge to know what the parameters are. Ah, but the example isn't really very good. In most cases in running text "everything default" is what you want, which I think we all agree is most readable. 
(Somebody wanted Perl-style $VARIABLE, but I don't see a big difference between that and {VARIABLE}, especially in cases where you want concatenation with trailing identifier characters and would have to spell it "${VARIABLE}" or something like that anyway.) Where you'd want precise control of formatting is a multicolumn table, which will add a "width=10" parameter for a total of 45 columns for the format for a single variable. As soon as it wraps (two columns ;-), it becomes unreadable. The "10.2f" in the idiom of .format() will take only 16 columns, so you can easily get 5 readable columns in an 80-column window, and in general about 3X as many columns in the same window as with the "locally" more readable explicit format. That's not at all inconsistent with your statement that "sometimes concise expression is important", of course, but I did want to point out that concise expression has important readability benefits. > An uninformed guess by a casual reader is much more likely to be > right in the first case; I think that's an important criterion for > readability. I'd argue that the casual reader most likely doesn't actually care about the formatting, especially in in the running text case. YMMV, but I'd bet on that, and would take the current formatting with keyword arguments for content and short moderately mnemonic codes for style as "most readable" for the cases where the styling of presentation matters. From rosuav at gmail.com Fri Nov 29 06:23:40 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 29 Nov 2013 16:23:40 +1100 Subject: [Python-ideas] string.Template v2 In-Reply-To: <874n6vdcxr.fsf@uwakimon.sk.tsukuba.ac.jp> References: <9027c102-17b5-4ae5-979c-3d9ac02c3e16@email.android.com> <87bo14c8nz.fsf@uwakimon.sk.tsukuba.ac.jp> <7wob53q4o5.fsf@benfinney.id.au> <874n6vdcxr.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Fri, Nov 29, 2013 at 3:55 PM, Stephen J. Turnbull wrote: > I'd argue that the casual reader most likely doesn't actually care > about the formatting, especially in in the running text case. YMMV, > but I'd bet on that, and would take the current formatting with > keyword arguments for content and short moderately mnemonic codes for > style as "most readable" for the cases where the styling of > presentation matters. Definitely. Focus should be on the overall string and its contents, it shouldn't be screaming at you "HEY LOOK I'M ABOUT TO PUT A FLOAT IN HERE WITH TWO DECIMALS". A simple, compact notation helps with that. Anyway, if you can't figure out what ".2f" means, you're going to have trouble in a lot of format languages that derive from printf. It's a handy notation, and no harder to figure out than a more verbose "type=float" is. After all, what does "float" mean? Will it render in exponential notation? Decimal notation? Presumably it won't output the IEEE binary representation, eg "40 49 0f db" (compacted into hex for convenience), though it's plausible there might be a way to emit that. So if you can learn the specifics of the "float" representation, you should be able to learn the specifics of the "f" renderer. Same thing, less keystrokes. 
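To make the table case concrete, width, alignment and precision all compose in the same compact spec (a sketch, not the only way to lay out columns):

>>> "{name:<8}{price:>8.2f}".format(name="widget", price=4.5)
'widget      4.50'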
ChrisA From denis.spir at gmail.com Fri Nov 29 07:10:11 2013 From: denis.spir at gmail.com (spir) Date: Fri, 29 Nov 2013 07:10:11 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <5297D1D9.9020404@canterbury.ac.nz> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <5297CA12.7050607@canterbury.ac.nz> <5297D1D9.9020404@canterbury.ac.nz> Message-ID: <52982FC3.30907@gmail.com> On 11/29/2013 12:29 AM, Greg Ewing wrote: > Chris Angelico wrote: >> Would you really keep a cache of 1114112 string objects around? > > Well, not *exactly* like the int cache... the idea > would be to remember ones that had been used, not > pre-generate all possible 1-char strings. I thought you (Greg) meant users (like me) who parse or scan strings piecemeal could build a list of 1-char strings before starting: source_chars = [ch for ch in source] Then index as needed during parsing. or maybe keep a list of codes directly source_ucodes = [ord(ch) for ch in source] But isn't it easier (and cheaper and more "intuitive") just to have a method like s.code(i)? Is there some resistance against this proposal? If yes, what rationale? Denis From denis.spir at gmail.com Fri Nov 29 07:21:00 2013 From: denis.spir at gmail.com (spir) Date: Fri, 29 Nov 2013 07:21:00 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> Message-ID: <5298324C.1020909@gmail.com> On 11/29/2013 12:36 AM, Terry Reedy wrote: >> certainly numerous other use cases exist. > > More that a hand wave is needed to demonstrate that. You are right. After some reflexion, I guess those use cases are: * every time people compare single-char substrings * every time people make a single-char slice nad use ord() on it * every time people could simply do something using a unicode code at arbitrary position, but do otherwise because the correcponding method does not exist We may not realise the breadth of potential usage until the right tool exist and starts to be used broadly. I think it is also somewhat comparable to building a string by piecemeal concat: people may just do it that way, and it works fine, doesn't it? However, once aware of the cost and that there is an _existing_ alternative, ... Also, why does ord() exist? What's the point of such a builtin func working for single-code (historically: single-byte) strings only? Denis From denis.spir at gmail.com Fri Nov 29 07:22:00 2013 From: denis.spir at gmail.com (spir) Date: Fri, 29 Nov 2013 07:22:00 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <20131129004527.GZ2085@ando> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <20131129004527.GZ2085@ando> Message-ID: <52983288.7040907@gmail.com> On 11/29/2013 01:45 AM, Steven D'Aprano wrote: > Should strings have an ord() method? > > Disadvantages: > > - another piece of code to be written, debugged, maintained, documented; > > - another thing for users to learn; > > - cognitive load of having to decide whether to use the ord() method or > the ord() function. > > > Advantage: > > - you save the cost of extracting a one-character string before passing > it to the ord() function. Why is there ord(), the builtin func? 
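For what it's worth, here is the scanning pattern behind these questions, written with only what exists today (startswith with a start index for substring tests, ord on a 1-char slice for the code point); the names are just illustrative:

source = "let x := 42"
i = 6
if source.startswith(":=", i):        # substring equality at an index, no slice copied
    i += 2
while i < len(source) and source[i] == " ":
    i += 1
code = ord(source[i])                 # code point at an index -- the part I want cheaper
print(i, code)                        # 9 52  ('4' is U+0034)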
Denis From denis.spir at gmail.com Fri Nov 29 07:27:15 2013 From: denis.spir at gmail.com (spir) Date: Fri, 29 Nov 2013 07:27:15 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <5297CA12.7050607@canterbury.ac.nz> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <5297CA12.7050607@canterbury.ac.nz> Message-ID: <529833C3.7000806@gmail.com> On 11/28/2013 11:56 PM, Greg Ewing wrote: > spir wrote: >> Is it weird to consider a .code(i) string method? > > Another approach would be to keep a cache of single-char > strings, like we do for small integers. Yes, this is an idea. If I did the decoding (usually from utf8) into a string of codes, instead of Python doing it to build its internal str repr, then I could easily in // record a list of single-char strings. I'll keep this idea aside anyway, thank you, Greg. Denis From denis.spir at gmail.com Fri Nov 29 07:42:44 2013 From: denis.spir at gmail.com (spir) Date: Fri, 29 Nov 2013 07:42:44 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <52981EB6.50003@canterbury.ac.nz> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <20131129004527.GZ2085@ando> <52981EB6.50003@canterbury.ac.nz> Message-ID: <52983764.9010204@gmail.com> On 11/29/2013 05:57 AM, Greg Ewing wrote: > Steven D'Aprano wrote: >> On Thu, Nov 28, 2013 at 12:43:05PM +0100, spir wrote: >> >>> Is it weird to consider a .code(i) string method? >> >> Such a method should not be called "code", since "ordinal" or "ord" is the >> accepted term for it. > > How about giving ord() an optional index parameter? > > ord(s, i) > > would be equivalent to > > ord(s[i]) > > but would avoid creating an intermediate string > object. Very well, for me. Does the job as needed. (But then, possibly, some would wonder why this new ord(s,i) is not a string method ;-) Denis From denis.spir at gmail.com Fri Nov 29 07:40:12 2013 From: denis.spir at gmail.com (spir) Date: Fri, 29 Nov 2013 07:40:12 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <20131129004527.GZ2085@ando> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <20131129004527.GZ2085@ando> Message-ID: <529836CC.6060400@gmail.com> On 11/29/2013 01:45 AM, Steven D'Aprano wrote: > Here is my benchmark: > > py> from timeit import Timer > py> setup = "s = 'abcdef'" > py> t1 = Timer("ord('c')") # establish a base-mark of calling ord > py> t2 = Timer("ord(s[2])", setup) > py> min(t1.repeat(repeat=5)) > 0.13925810158252716 > py> min(t2.repeat(repeat=5)) > 0.2207092922180891 You are right, Steven, the benefit is far tinier than I supposed. I reproduced this on my machine: the time for char-creation + ord() is about 3/2 of ord() alone (which is just indexing). Now, there is a mystery: how is the time for creating a single-char string object about half the time of a simple indexing (in C!)? This just cannot be, can it? Even if there is no alloc [1]. Or is it so that there is a cache, maybe temporary, for such recently created objects? (The char 'c' would be created only once, then just accessed.) Denis [1] Don't know of Python arcanes, but small strings do not always require alloc, for their data can be stored in place, in the object's "facade" struct. 
From greg.ewing at canterbury.ac.nz Fri Nov 29 07:49:47 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Nov 2013 19:49:47 +1300 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <52983764.9010204@gmail.com> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <20131129004527.GZ2085@ando> <52981EB6.50003@canterbury.ac.nz> <52983764.9010204@gmail.com> Message-ID: <5298390B.30506@canterbury.ac.nz> spir wrote: > (But then, possibly, some would wonder why this new ord(s,i) is not a > string method ;-) For the same reason that the existing ord() function isn't a string method, whatever that is! -- Greg From greg.ewing at canterbury.ac.nz Fri Nov 29 07:56:03 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 29 Nov 2013 19:56:03 +1300 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <52982FC3.30907@gmail.com> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <5297CA12.7050607@canterbury.ac.nz> <5297D1D9.9020404@canterbury.ac.nz> <52982FC3.30907@gmail.com> Message-ID: <52983A83.9080803@canterbury.ac.nz> spir wrote: > I thought you (Greg) meant users (like me) who parse or scan strings > piecemeal could build a list of 1-char strings before starting: No, I meant that there would be a built-in cache operating behind the scenes, so that whenever you index a string and the resulting character has been extracted from a string before, you'd get the cached object instead of a new one. The advantage would be that no new functions or methods are needed. > But isn't it easier (and cheaper and more "intuitive") just to have a > method like s.code(i)? It would help with code that use single-char strings for more purposes than just taking the ord() of them. -- Greg From denis.spir at gmail.com Fri Nov 29 09:08:50 2013 From: denis.spir at gmail.com (spir) Date: Fri, 29 Nov 2013 09:08:50 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <52983A83.9080803@canterbury.ac.nz> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <5297CA12.7050607@canterbury.ac.nz> <5297D1D9.9020404@canterbury.ac.nz> <52982FC3.30907@gmail.com> <52983A83.9080803@canterbury.ac.nz> Message-ID: <52984B92.5070800@gmail.com> On 11/29/2013 07:56 AM, Greg Ewing wrote: > spir wrote: >> I thought you (Greg) meant users (like me) who parse or scan strings piecemeal >> could build a list of 1-char strings before starting: > > No, I meant that there would be a built-in cache operating > behind the scenes, so that whenever you index a string and > the resulting character has been extracted from a string > before, you'd get the cached object instead of a new one. > > The advantage would be that no new functions or methods > are needed. > >> But isn't it easier (and cheaper and more "intuitive") just to have a method >> like s.code(i)? > > It would help with code that use single-char strings for > more purposes than just taking the ord() of them. You are right on this. However, both features (1-char string interning and extending the func ord to ord(s,i)) are somewhat independant (and in any case compatible). 
Denis From steve at pearwood.info Fri Nov 29 10:06:44 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 29 Nov 2013 20:06:44 +1100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <529836CC.6060400@gmail.com> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <20131129004527.GZ2085@ando> <529836CC.6060400@gmail.com> Message-ID: <20131129090644.GD2085@ando> On Fri, Nov 29, 2013 at 07:40:12AM +0100, spir wrote: > Now, there is a mystery: how is the time for creating a single-char string > object about half the time of a simple indexing (in C!)? This just cannot > be, can it? Of course it can be. What makes you think that creating small strings will be expensive? There is some cost, of course, but I expect that creating a small string will be very cheap. Not quite as cheap as C of course, since Python does a lot more for you, and creates rich objects rather than just copying low-level bytes, but it won't be expensive to take a small slice of a string. -- Steven From solipsis at pitrou.net Fri Nov 29 18:54:25 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 29 Nov 2013 18:54:25 +0100 Subject: [Python-ideas] os.path.join References: Message-ID: <20131129185425.583a5e8e@fsol> On Wed, 30 Oct 2013 10:06:03 -0700 Bruce Leban wrote: > > I agree it might be confusing but it's pretty explicitly documented. On the > other hand, this is also documented and it's wrong by the above standard > > >>> os.path.join(r'c:\abc', r'\def\g') # Windows paths > '\\def\\g' > > On Windows \def\g is a drive-relative path not an absolute path. To get the > right result you need to do: > > >>> drive, path = os.path.splitdrive(r'c:\abc') > >>> drive + os.path.join(path, r'/def/g') > 'c:/def/g' Note that pathlib gets it right: >>> PureWindowsPath(r'c:\abc') / r'\def\g' PureWindowsPath('c:/def/g') >>> PureWindowsPath(r'\\abc\def\ghi') / r'\x\y' PureWindowsPath('//abc/def/x/y') Regards Antoine. From tjreedy at udel.edu Sat Nov 30 01:32:50 2013 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 29 Nov 2013 19:32:50 -0500 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <529836CC.6060400@gmail.com> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <20131129004527.GZ2085@ando> <529836CC.6060400@gmail.com> Message-ID: On 11/29/2013 1:40 AM, spir wrote: > On 11/29/2013 01:45 AM, Steven D'Aprano wrote: >> Here is my benchmark: >> >> py> from timeit import Timer >> py> setup = "s = 'abcdef'" >> py> t1 = Timer("ord('c')") # establish a base-mark of calling ord >> py> t2 = Timer("ord(s[2])", setup) >> py> min(t1.repeat(repeat=5)) >> 0.139258101582527 >> py> min(t2.repeat(repeat=5)) >> 0.2207092922180891 > > You are right, Steven, the benefit is far tinier than I supposed. I > reproduced this on my machine: the time for char-creation + ord() is > about 3/2 of ord() alone (which is just indexing). > > Now, there is a mystery: how is the time for creating a single-char > string object about half the time of a simple indexing (in C!)? Much of the time for ord('c') is the time to make a function call from Python. 's[2]' only involves a internal C function call, which is much faster. 
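Adding the indexing-only case to the earlier benchmark makes that split visible; the bare indexing typically costs a small fraction of the full ord() call (exact numbers are machine-dependent, so none are quoted here):

py> from timeit import Timer
py> setup = "s = 'abcdef'"
py> min(Timer("s[2]", setup).repeat(repeat=5))        # indexing alone: no Python-level call
py> min(Timer("ord(s[2])", setup).repeat(repeat=5))   # indexing plus one builtin call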
-- Terry Jan Reedy From denis.spir at gmail.com Sat Nov 30 10:08:12 2013 From: denis.spir at gmail.com (spir) Date: Sat, 30 Nov 2013 10:08:12 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <20131129004527.GZ2085@ando> <529836CC.6060400@gmail.com> Message-ID: <5299AAFC.3090902@gmail.com> On 11/30/2013 01:32 AM, Terry Reedy wrote: > On 11/29/2013 1:40 AM, spir wrote: >> On 11/29/2013 01:45 AM, Steven D'Aprano wrote: >>> Here is my benchmark: >>> >>> py> from timeit import Timer >>> py> setup = "s = 'abcdef'" >>> py> t1 = Timer("ord('c')") # establish a base-mark of calling ord >>> py> t2 = Timer("ord(s[2])", setup) >>> py> min(t1.repeat(repeat=5)) >>> 0.139258101582527 >>> py> min(t2.repeat(repeat=5)) >>> 0.2207092922180891 >> >> You are right, Steven, the benefit is far tinier than I supposed. I >> reproduced this on my machine: the time for char-creation + ord() is >> about 3/2 of ord() alone (which is just indexing). >> >> Now, there is a mystery: how is the time for creating a single-char >> string object about half the time of a simple indexing (in C!)? > > Much of the time for ord('c') is the time to make a function call from Python. > 's[2]' only involves a internal C function call, which is much faster. Right, that's it, certainly! Thank you very much, Terry. Is there something else I should know about Python's internal repr for strings? For instance, is there an optimisation for (very) short strings? (like, codes stored in place in the struct) Also, I read somewhere, I guess it was in a wikipedia article about string interning in a pool, that Python does that; which surprised me pretty much. Which strings, if any, are interned in Python (I'd bet, for lookup speed, __dict__ key, meaning id's, meaning var & attr names)? Denis From daniel at daniel-watkins.co.uk Sat Nov 30 10:26:14 2013 From: daniel at daniel-watkins.co.uk (Daniel Watkins) Date: Sat, 30 Nov 2013 09:26:14 +0000 Subject: [Python-ideas] Refactor Assertions Out of unittest.TestCase Message-ID: <20131130092614.GE15129@daniel-watkins.co.uk> Hello all, I would like to propose refactoring the assertions out of unittest.TestCase. As code speaks louder than words, you can see my initial work at https://github.com/OddBloke/cpython. The aim of the refactor is: (a) to reduce the amount of repeated code in the assertions, (b) to provide a clearer framework for third-party assertions to follow, and (c) to make it easier to split gnarly assertions in to an easier-to-digest form. My proposed implementation (as seen in the code above) is to move each assertion in to its own class. There will be a shared superclass (called Assert currently; see [0]) implementing the template pattern (look at __call__ on line 69), meaning that each assertion only has to concern itself with its unique aspects: what makes it fail, and how that specific failure should be presented. To maintain the current TestCase interface (all of the tests pass in my branch), the existing assert* methods instantiate the Assert sub-classes on each call with a context that captures self.longMessage, self.failureException, self.maxDiff, and self._diffThreshold from the TestCase instance. Other potential aims include eventually deprecating the assertion methods, providing a framework for custom assertions to hook in to, and providing assertion functions (a la nose.tools) with a default context set. 
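For readers who don't want to open the branch, a minimal sketch of the shape described above; the names below are illustrative, not the actual code in the branch:

class Assert:
    # Template method: subclasses say only what fails and how to report it.
    failure_exception = AssertionError

    def __call__(self, *args, msg=None):
        if not self.check(*args):
            raise self.failure_exception(msg or self.describe(*args))

    def check(self, *args):
        raise NotImplementedError

    def describe(self, *args):
        raise NotImplementedError

class AssertEqual(Assert):
    def check(self, first, second):
        return first == second

    def describe(self, first, second):
        return "%r != %r" % (first, second)

assert_equal = AssertEqual()
assert_equal(2 + 2, 4)            # passes silently
# assert_equal("spam", "eggs")    # would raise AssertionError: 'spam' != 'eggs'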
This proposal would help address #18054[1] as the new assertions could just be implemented as separate classes (and not included in the TestCase.assert* API); people who wanted to use them could just instantiate them themselves. I'd love some feedback on this proposal (and the implementation thus far). Cheers, Dan (Odd_Bloke) [0] https://github.com/OddBloke/cpython/blob/master/Lib/unittest/assertions/__init__.py#L15 [1] http://bugs.python.org/issue18054 From robertc at robertcollins.net Sat Nov 30 10:38:31 2013 From: robertc at robertcollins.net (Robert Collins) Date: Sat, 30 Nov 2013 22:38:31 +1300 Subject: [Python-ideas] Refactor Assertions Out of unittest.TestCase In-Reply-To: <20131130092614.GE15129@daniel-watkins.co.uk> References: <20131130092614.GE15129@daniel-watkins.co.uk> Message-ID: Have you seen Matchers? Inspired by hamcrest, these also move assertions out of unittest.TestCase, but are much more composable than assertions, and can give significantly richer errors. http://testtools.readthedocs.org/en/latest/for-test-authors.html#matchers There is an open ticket in the Python bug tracker for getting something like this into the stdlib. I've been meaning to port them and get a review up for a bit :(. -Rob On 30 November 2013 22:26, Daniel Watkins wrote: > Hello all, > > I would like to propose refactoring the assertions out of > unittest.TestCase. As code speaks louder than words, you can see my > initial work at https://github.com/OddBloke/cpython. > > The aim of the refactor is: > (a) to reduce the amount of repeated code in the assertions, > (b) to provide a clearer framework for third-party assertions to follow, > and > (c) to make it easier to split gnarly assertions in to an > easier-to-digest form. > > My proposed implementation (as seen in the code above) is to move each > assertion in to its own class. There will be a shared superclass (called > Assert currently; see [0]) implementing the template pattern (look at > __call__ on line 69), meaning that each assertion only has to concern > itself with its unique aspects: what makes it fail, and how that > specific failure should be presented. > > To maintain the current TestCase interface (all of the tests pass in my > branch), the existing assert* methods instantiate the Assert sub-classes > on each call with a context that captures self.longMessage, > self.failureException, self.maxDiff, and self._diffThreshold from the > TestCase instance. > > Other potential aims include eventually deprecating the assertion > methods, providing a framework for custom assertions to hook in to, and > providing assertion functions (a la nose.tools) with a default context > set. > > This proposal would help address #18054[1] as the new assertions could > just be implemented as separate classes (and not included in the > TestCase.assert* API); people who wanted to use them could just > instantiate them themselves. > > I'd love some feedback on this proposal (and the implementation thus > far). 
> > > Cheers, > > Dan (Odd_Bloke) > > [0] https://github.com/OddBloke/cpython/blob/master/Lib/unittest/assertions/__init__.py#L15 > > [1] http://bugs.python.org/issue18054 > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -- Robert Collins Distinguished Technologist HP Converged Cloud From daniel at daniel-watkins.co.uk Sat Nov 30 11:14:12 2013 From: daniel at daniel-watkins.co.uk (Daniel Watkins) Date: Sat, 30 Nov 2013 10:14:12 +0000 Subject: [Python-ideas] Refactor Assertions Out of unittest.TestCase In-Reply-To: References: <20131130092614.GE15129@daniel-watkins.co.uk> Message-ID: <20131130101411.GF15129@daniel-watkins.co.uk> Hi Rob, On Sat, Nov 30, 2013 at 10:38:31PM +1300, Robert Collins wrote: > Have you seen Matchers? Inspired by hamcrest, these also move > assertions out of unittest.TestCase, but are much more composable than > assertions, and can give significantly richer errors. > > http://testtools.readthedocs.org/en/latest/for-test-authors.html#matchers I hadn't given it a proper look. Very cool! This looks like a much more adaptable solution than mine. > There is an open ticket in the Python bug tracker for getting > something like this into the stdlib. I've been meaning to port them > and get a review up for a bit :(. Great! Two things that (at a glance) testtools is missing are diffs to describe mis-matches and support for the existing attributes (longMessage, maxDiff, failureException, _diffThreshold) that unittest.TestCase uses to manage output (though, obviously, at least two of these are diff-related). Are there any thoughts as to how to address this? Is there anything I can do to help push matchers in to the stdlib? Cheers, Dan From apieum at gmail.com Sat Nov 30 11:14:30 2013 From: apieum at gmail.com (Gregory Salvan) Date: Sat, 30 Nov 2013 11:14:30 +0100 Subject: [Python-ideas] Refactor Assertions Out of unittest.TestCase In-Reply-To: References: <20131130092614.GE15129@daniel-watkins.co.uk> Message-ID: Hi, Nice ! Maybe lib operator can help you. I've opened a ticket few days ago on this topic: http://bugs.python.org/issue19645 I'm working on, making some tests, to have a POC for a possible PEP. Actually I've kept these ideas: - independant from unittest - same behaviour as "assert" in optimize mode (trying to see if it would be convenient to have an env var to force it) - simple matchers compatibles with testtools protocol - factories to make assertions Idea I've to dig further: provide a mecanism to override "assert" (thought for py.test) If you're interested we can work together. 2013/11/30 Robert Collins > Have you seen Matchers? Inspired by hamcrest, these also move > assertions out of unittest.TestCase, but are much more composable than > assertions, and can give significantly richer errors. > > http://testtools.readthedocs.org/en/latest/for-test-authors.html#matchers > > There is an open ticket in the Python bug tracker for getting > something like this into the stdlib. I've been meaning to port them > and get a review up for a bit :(. > > -Rob > > On 30 November 2013 22:26, Daniel Watkins > wrote: > > Hello all, > > > > I would like to propose refactoring the assertions out of > > unittest.TestCase. As code speaks louder than words, you can see my > > initial work at https://github.com/OddBloke/cpython. 
> > > > The aim of the refactor is: > > (a) to reduce the amount of repeated code in the assertions, > > (b) to provide a clearer framework for third-party assertions to follow, > > and > > (c) to make it easier to split gnarly assertions in to an > > easier-to-digest form. > > > > My proposed implementation (as seen in the code above) is to move each > > assertion in to its own class. There will be a shared superclass (called > > Assert currently; see [0]) implementing the template pattern (look at > > __call__ on line 69), meaning that each assertion only has to concern > > itself with its unique aspects: what makes it fail, and how that > > specific failure should be presented. > > > > To maintain the current TestCase interface (all of the tests pass in my > > branch), the existing assert* methods instantiate the Assert sub-classes > > on each call with a context that captures self.longMessage, > > self.failureException, self.maxDiff, and self._diffThreshold from the > > TestCase instance. > > > > Other potential aims include eventually deprecating the assertion > > methods, providing a framework for custom assertions to hook in to, and > > providing assertion functions (a la nose.tools) with a default context > > set. > > > > This proposal would help address #18054[1] as the new assertions could > > just be implemented as separate classes (and not included in the > > TestCase.assert* API); people who wanted to use them could just > > instantiate them themselves. > > > > I?d love some feedback on this proposal (and the implementation thus > > far). > > > > > > Cheers, > > > > Dan (Odd_Bloke) > > > > [0] > https://github.com/OddBloke/cpython/blob/master/Lib/unittest/assertions/__init__.py#L15 > > > > [1] http://bugs.python.org/issue18054 > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > https://mail.python.org/mailman/listinfo/python-ideas > > > > > > -- > Robert Collins > Distinguished Technologist > HP Converged Cloud > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > https://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Sat Nov 30 11:39:11 2013 From: rosuav at gmail.com (Chris Angelico) Date: Sat, 30 Nov 2013 21:39:11 +1100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <5299AAFC.3090902@gmail.com> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <20131129004527.GZ2085@ando> <529836CC.6060400@gmail.com> <5299AAFC.3090902@gmail.com> Message-ID: On Sat, Nov 30, 2013 at 8:08 PM, spir wrote: > Is there something else I should know about Python's internal repr for > strings? Just to clarify: You're talking about the internal representation, which isn't anything to do with the repr() function. ChrisA From denis.spir at gmail.com Sat Nov 30 18:20:52 2013 From: denis.spir at gmail.com (spir) Date: Sat, 30 Nov 2013 18:20:52 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <20131129004527.GZ2085@ando> <529836CC.6060400@gmail.com> <5299AAFC.3090902@gmail.com> Message-ID: <529A1E74.30504@gmail.com> On 11/30/2013 11:39 AM, Chris Angelico wrote: >> >Is there something else I should know about Python's internal repr for >> >strings? 
> Just to clarify: You're talking about the internal representation, > which isn't anything to do with the repr() function. Yop, sorry for the anbiguity. Denis From barry at python.org Sat Nov 30 19:55:21 2013 From: barry at python.org (Barry Warsaw) Date: Sat, 30 Nov 2013 13:55:21 -0500 Subject: [Python-ideas] string.Template v2 References: Message-ID: <20131130135521.7073b855@anarchist> On Nov 28, 2013, at 07:57 PM, anatoly techtonik wrote: >string.Template syntax is ancient (dates back to Python 2.4 >from 9 years ago). I haven't seen a template like this for a long time. string.Template syntax is defined in PEP 292 and was deliberately named "Simpler String Substitutions". The motivation for PEP 292 was the observation of many years of difficulty with translation of gettext message strings in highly i18n'd code. We saw countless errors where translators would leave off a trailing 's' in %(foo)s placeholders, breaking systems, or requiring them to be defensive to the point of being unreadable. $strings have a rich tradition in programming languages and translators seem to have a better time with such placeholders. It's might be useful to provide some flexibility in string.Template, but I don't want to lose the beauty of simplicity or sacrifice the original use case to try to support every possible variation. OTOH, string.Template is just one class in the stdlib. There's no reason other libraries can't provide different classes supporting different formats. FWIW, flufl.i18n builds on string.Template and PEP 292, as well as gettext, to provide a richer API for managing translations in Python programs. Cheers, -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From barry at python.org Sat Nov 30 19:57:07 2013 From: barry at python.org (Barry Warsaw) Date: Sat, 30 Nov 2013 13:57:07 -0500 Subject: [Python-ideas] string.Template v2 References: Message-ID: <20131130135707.77197d7c@anarchist> On Nov 28, 2013, at 06:53 PM, Victor Stinner wrote: >In my opinion, string.Template alone is almost useless (it's probably >why it is not used). It *is* used, but IME more as a fundamental building block for higher level APIs. string.Template is very definitely not intended to be a templating engine on the order of the many powerful ones available in web frameworks. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From greg.ewing at canterbury.ac.nz Sat Nov 30 23:13:25 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 01 Dec 2013 11:13:25 +1300 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <5299AAFC.3090902@gmail.com> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <20131129004527.GZ2085@ando> <529836CC.6060400@gmail.com> <5299AAFC.3090902@gmail.com> Message-ID: <529A6305.8020107@canterbury.ac.nz> spir wrote: > Which strings, if any, are interned in Python (I'd bet, for > lookup speed, __dict__ key, meaning id's, meaning var & attr names)? Yes, identifiers are interned, and also short string literals that resemble identifiers (since they might get used in getattr calls and the like). But as far as I know, it's never done for strings constructed at run time unless you explicitly ask for it. So it wouldn't help for what we've been talking about. 
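A quick way to see that split (CPython behaviour, not something the language promises):

>>> import sys
>>> a = "alpha_beta"                  # identifier-like literal: interned by the compiler
>>> b = "alpha_beta"
>>> a is b
True
>>> c = "".join(["alpha_", "beta"])   # built at run time: not interned automatically
>>> a is c
False
>>> a is sys.intern(c)                # unless you ask for it explicitly
True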
-- Greg From denis.spir at gmail.com Sat Nov 30 23:20:25 2013 From: denis.spir at gmail.com (spir) Date: Sat, 30 Nov 2013 23:20:25 +0100 Subject: [Python-ideas] string codes & substring equality In-Reply-To: <529A6305.8020107@canterbury.ac.nz> References: <52960290.6090809@gmail.com> <52972C49.1040909@gmail.com> <20131129004527.GZ2085@ando> <529836CC.6060400@gmail.com> <5299AAFC.3090902@gmail.com> <529A6305.8020107@canterbury.ac.nz> Message-ID: <529A64A9.50303@gmail.com> On 11/30/2013 11:13 PM, Greg Ewing wrote: > spir wrote: >> Which strings, if any, are interned in Python (I'd bet, for lookup speed, >> __dict__ key, meaning id's, meaning var & attr names)? > > Yes, identifiers are interned, and also short string > literals that resemble identifiers (since they might > get used in getattr calls and the like). > > But as far as I know, it's never done for strings > constructed at run time unless you explicitly ask > for it. So it wouldn't help for what we've been > talking about. Thank you, Greg! Denis