From manuel at enigmage.de Mon Feb 7 20:43:06 2011
From: manuel at enigmage.de (Manuel Bärenz)
Date: Mon, 07 Feb 2011 20:43:06 +0100
Subject: [Python-ideas] Add a feature similar to C++ "using some_namespace"
Message-ID: <4D504B4A.2030508@enigmage.de>

In C++, the approach to the namespace problem is having different
namespaces that should not contain different definitions of the same
name. Members of a namespace can be accessed explicitly by e.g. calling
"std::cout << etc." or "using namespace std; cout << etc."

I understand Python's approach to be "objects can be used as namespaces
and their attributes are the names they contain". I find this a very
beautiful way of solving the issue, but it has a downside, in my
opinion, because it lacks the "using" directive from C++.

If the object is a module, we can of course do "from mymodule import
spam, eggs". But if it is not a module, this does not work.

Consider for example:

class Spam(object):
    def frobnicate(self):
        self.eggs = self.buy_eggs()
        self.scrambled = self.scramble(self.eggs)
        return self.scrambled > 42

This could be easier to implement and read if we had something like:

class Spam(object):
    def frobnicate(self):
        using self:
            eggs = buy_eggs()
            scrambled = scramble(eggs)
            return scrambled > 42

Of course this opens a lot of conceptual questions like how should this
using block behave if self doesn't have an attribute called "eggs", but
I think it is worth considering.

From rrr at ronadam.com Mon Feb 7 21:21:45 2011
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 07 Feb 2011 14:21:45 -0600
Subject: [Python-ideas] Cleaner separation of help() and interactive help.
Message-ID:

Currently any call to help() uses the pager in pydoc. Because of that,
you can't do some things (easily) like...

    >>> result = help(thing)

It also can be annoying (to me) when I use help and the result is
cleared after the pager is done with it.
That requires me to re-do the same help requests to recheck details
rather than simply scroll back in the current console window.

The alternative is to keep another console open just for using help.
That's not always good when you already have multiple windows open for
doing other things.

In Python 3.3, I would like to have help() return the results as a
string, and only use the pager if you are actually in interactive-help
mode.

    >>> help(thing)    # return a help string of a thing.

    >>> help()         # enter interactive help mode where the pager is used.

Separating these even more might be good, ... help() and ihelp(). In
this case, the current help() function could just be renamed to ihelp(),
and a new help() function would return a result as a string. The default
might be help(thing='help'). The help on 'help' would refer to 'ihelp'
for the interactive help.

Sometime later, I want to look into moving the pager in pydoc to the cmd
module as cmd.pager(). (I think it fits well there.)

And related to that... I am going to look into re-implementing pydoc's
interactive help with the cmd module. I think it will remove some
duplication, and both modules will benefit.

Cheers,
   Ron

From masklinn at masklinn.net Mon Feb 7 21:36:16 2011
From: masklinn at masklinn.net (Masklinn)
Date: Mon, 7 Feb 2011 21:36:16 +0100
Subject: [Python-ideas] Add a feature similar to C++ "using some_namespace"
In-Reply-To: <4D504B4A.2030508@enigmage.de>
References: <4D504B4A.2030508@enigmage.de>
Message-ID: <7F8DEBEB-1E09-42E3-BFD3-65028CA926B2@masklinn.net>

On 2011-02-07, at 20:43 , Manuel Bärenz wrote:
> In C++, the approach to the namespace problem is having different namespaces that should not contain different definitions of the same name.
> Members of a namespace can be accessed explicitly by e.g. calling "std::cout << etc." or "using namespace std; cout << etc."
>
> I understand Python's approach to be "objects can be used as namespaces and their attributes are the names they contain".
> I find this a very beautiful way of solving the issue, but it has a downside, in my opinion, because it lacks the "using" directive from C++.
>
> If the object is a module, we can of course do "from mymodule import spam, eggs".

Or `from mymodule import *` though that's not exactly recommended.

On the other hand, since when can `using` be used on anything but a C++
namespace (which as far as I know is a well-defined entity, not an
arbitrary object, and quite similar to a Python module though not
identical)?

> But if it is not a module, this does not work.
>
> Consider for example:
>
> class Spam(object):
>     def frobnicate(self):
>         self.eggs = self.buy_eggs()
>         self.scrambled = self.scramble(self.eggs)
>         return self.scrambled > 42
>
> This could be easier to implement and read if we had something like:
>
> class Spam(object):
>     def frobnicate(self):
>         using self:
>             eggs = buy_eggs()
>             scrambled = scramble(eggs)
>             return scrambled > 42
>
> Of course this opens a lot of conceptual questions like how should this using block behave if self doesn't have an attribute called "eggs", but I think it is worth considering.

My 2 cents: Javascript has this "feature" (with). It's utterly terrible,
and mainly a very good way to shoot yourself in the foot repeatedly.

I especially find the assertion that:

> This could be easier to [...] read

Extremely debatable: in my experience of that feature in Javascript, it
makes code much harder to understand and reason about.

From ethan at stoneleaf.us Mon Feb 7 22:07:06 2011
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 07 Feb 2011 13:07:06 -0800
Subject: [Python-ideas] Add a feature similar to C++ "using some_namespace"
In-Reply-To: <4D504B4A.2030508@enigmage.de>
References: <4D504B4A.2030508@enigmage.de>
Message-ID: <4D505EFA.6080300@stoneleaf.us>

Manuel Bärenz wrote:
[...]
> This could be easier to implement and read if we had something like:
>
> class Spam(object):
>     def frobnicate(self):
>         using self:
>             eggs = buy_eggs()
>             scrambled = scramble(eggs)
>             return scrambled > 42
>
> Of course this opens a lot of conceptual questions like how should this
> using block behave if self doesn't have an attribute called "eggs", but
> I think it is worth considering.

In Foxpro, the keyword is 'with' instead of 'using', and to make it
clear when to look into the namespace for the variable (instead of, for
example, locals), the variable name is prepended with a '.'; so the
example above becomes:

class Spam(object):
    def frobnicate(self):
        with self:
            .eggs = buy_eggs()
            .scrambled = scramble(eggs)
            return .scrambled > 42

While I have occasionally missed this feature, I haven't lost any sleep
over it, either.

-1 without leading periods
+0 with leading periods
no preference on keyword name (with vs using vs ...)

~Ethan~

From greg.ewing at canterbury.ac.nz Mon Feb 7 22:14:40 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 08 Feb 2011 10:14:40 +1300
Subject: [Python-ideas] Cleaner separation of help() and interactive help.
In-Reply-To:
References:
Message-ID: <4D5060C0.3050802@canterbury.ac.nz>

Ron Adam wrote:

> It also can be annoying (to me) when I use help and the result is
> cleared after the pager is done with it.

Yeah, that annoys me too. I can't imagine why anyone thought it was
a good idea to design a pager that way. Come to think of it, the
whole concept of a pager is something of an anti-feature nowadays,
when most people's "terminal" is really a gui window with its own
scrolling facilities that are considerably better than the pager's.

What I'd really like is for interactive help output to appear in
a new window, preferably using my favourite text editor so I can
use its searching facilities, which are also considerably better
than the pager's.
-- 
Greg

From greg.ewing at canterbury.ac.nz Mon Feb 7 22:37:47 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 08 Feb 2011 10:37:47 +1300
Subject: [Python-ideas] Add a feature similar to C++ "using some_namespace"
In-Reply-To: <4D504B4A.2030508@enigmage.de>
References: <4D504B4A.2030508@enigmage.de>
Message-ID: <4D50662B.2030306@canterbury.ac.nz>

Manuel Bärenz wrote:

> This could be easier to implement and read if we had something like:
>
> class Spam(object):
>     def frobnicate(self):
>         using self:
>             eggs = buy_eggs()
>             scrambled = scramble(eggs)
>             return scrambled > 42

On the contrary, I think most people here would consider this
to be *harder* to read and maintain.

It seems clear to you because you just wrote it and it's fresh
in your mind which namespaces the various names belong to. But
to someone else, it's far from obvious, for example, whether
'scramble' is a method of self or a module-level function.

It also interacts badly with Python's declaration-free
determination of name scope. What if you wanted 'eggs' or
'scrambled' to be local variables rather than an attribute
of self?

-- 
Greg

From manuel at enigmage.de Mon Feb 7 22:48:20 2011
From: manuel at enigmage.de (Manuel Bärenz)
Date: Mon, 07 Feb 2011 22:48:20 +0100
Subject: [Python-ideas] Add a feature similar to C++ "using some_namespace"
In-Reply-To: <7F8DEBEB-1E09-42E3-BFD3-65028CA926B2@masklinn.net>
References: <4D504B4A.2030508@enigmage.de> <7F8DEBEB-1E09-42E3-BFD3-65028CA926B2@masklinn.net>
Message-ID: <4D5068A4.50900@enigmage.de>

> My 2 cents: Javascript has this "feature" (with). It's utterly terrible, and mainly a very good way to shoot yourself in the foot repeatedly.
>
> I especially find the assertion that:
>> This could be easier to [...] read
> Extremely debatable: in my experience of that feature in Javascript, it makes code much harder to understand and reason about.

I get your point.
Another downside is the uselessness of a "using"-block on a frequent
problem:

class Spam(object):
    def frobnicate(self, egg1, egg2):
        self.egg1 = egg1
        self.egg2 = egg2

would translate to

class Spam(object):
    def frobnicate(self, egg1, egg2):
        using self:
            egg1 = egg1
            egg2 = egg2

which is ridiculous. However, with Ethan's Foxpro style suggestion, it
would be ok and IMO readable.

From steve at pearwood.info Mon Feb 7 22:48:37 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 08 Feb 2011 08:48:37 +1100
Subject: [Python-ideas] Add a feature similar to C++ "using some_namespace"
In-Reply-To: <4D504B4A.2030508@enigmage.de>
References: <4D504B4A.2030508@enigmage.de>
Message-ID: <4D5068B5.2070502@pearwood.info>

Manuel Bärenz wrote:
> In C++, the approach to the namespace problem is having different
> namespaces that should not contain different definitions of the same name.
> Members of a namespace can be accessed explicitly by e.g. calling
> "std::cout << etc." or "using namespace std; cout << etc."

This is like the "with" statement of Pascal, and is already a Python FAQ:

http://docs.python.org/faq/design.html#why-doesn-t-python-have-a-with-statement-for-attribute-assignments

-- 
Steven

From steve at pearwood.info Mon Feb 7 23:08:28 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 08 Feb 2011 09:08:28 +1100
Subject: [Python-ideas] Cleaner separation of help() and interactive help.
In-Reply-To: <4D5060C0.3050802@canterbury.ac.nz>
References: <4D5060C0.3050802@canterbury.ac.nz>
Message-ID: <4D506D5C.70207@pearwood.info>

Greg Ewing wrote:
> Ron Adam wrote:
>
>> It also can be annoying (to me) when I use help and the result is
>> cleared after the pager is done with it.
>
> Yeah, that annoys me too. I can't imagine why anyone thought it was
> a good idea to design a pager that way.
> Come to think of it, the
> whole concept of a pager is something of an anti-feature nowadays,
> when most people's "terminal" is really a gui window with its own
> scrolling facilities that are considerably better than the pager's.

Do you have a reliable source for that claim about "most" people that is
relevant to Python coders? We're not all using Microsoft VisualStudio :)

This is open source, and people scratch their own itch. I daresay help()
was written to suit the working processes of the creator. That suits me
fine, because I like help() just the way it is, I like the pager just
the way it is, and I don't want it to change. So -1 on any change to the
default behaviour.

> What I'd really like is for interactive help output to appear in
> a new window, preferably using my favourite text editor so I can
> use its searching facilities, which are also considerably better
> than the pager's.

Personally, I hate it when applications decide to launch additional
applications without an explicit request.

What you're describing *is* a pager. The default pager is the (almost)
lowest common denominator which should work for anyone anywhere.
(Possibly not if they're using a teletype.) Perhaps help() should come
with some more pagers and an easy interface for setting which one is
used.

-- 
Steven

From steve at pearwood.info Mon Feb 7 23:24:09 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 08 Feb 2011 09:24:09 +1100
Subject: [Python-ideas] Cleaner separation of help() and interactive help.
In-Reply-To:
References:
Message-ID: <4D507109.7000805@pearwood.info>

Ron Adam wrote:
>
> Currently any call to help() uses the pager in pydoc. Because of that,
> you can't do some things (easily) like...
>
> >>> result = help(thing)

I've not often needed to do that, not enough to really care, and the few
times I have wanted it,

result = thing.__doc__

was good enough. But I can see that it might occasionally be useful.
But not enough to make it the default behaviour of help()! Something
like one of these might be good though:

result = help(thing, interactive=False)
result = help(thing, pager=None)

> It also can be annoying (to me) when I use help and the result is
> cleared after the pager is done with it. That requires me to re-do the
> same help requests to recheck details rather than simply scroll back in
> the current console window.

Really? I find that a feature, not an annoyance. Otherwise, I'd have to
scroll back to recheck results in the console window.

> The alternative is to keep another console open just for using help.
> That's not always good when you already have multiple windows open for
> doing other things.

Then what's the problem with one more tab in your xterm app?

> In Python 3.3, I would like to have help() return the results as a
> string, and only use the pager if you are actually in interactive-help
> mode.
>
> >>> help(thing) # return a help string of a thing.
>
> >>> help() # enter interactive help mode where the pager is used.

-1.

I use help(thing) dozens of times a session, and help() on its own maybe
a handful of times a year. Why would I want to get half a page of
instructions *every single time* I use help() when I can just say
help(thing) and go straight to the part I care about?

I sympathize with your desire for a way to get help to return the text
rather than feed it through a pager, but don't want it to be the
default.

-- 
Steven

From greg.ewing at canterbury.ac.nz Mon Feb 7 23:34:57 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 08 Feb 2011 11:34:57 +1300
Subject: [Python-ideas] Cleaner separation of help() and interactive help.
In-Reply-To: <4D506D5C.70207@pearwood.info>
References: <4D5060C0.3050802@canterbury.ac.nz> <4D506D5C.70207@pearwood.info>
Message-ID: <4D507391.202@canterbury.ac.nz>

Steven D'Aprano wrote:

> Do you have a reliable source for that claim about "most" people that is
> relevant to Python coders?
> We're not all using Microsoft VisualStudio :)

I'm not talking about IDEs. I'm talking about things like the
Terminal in MacOSX, the cmd window in Windows, and equivalent
things in the Linux and X11 worlds. It's very rare nowadays for
anyone to be using a command-line style interface in anything
that doesn't have scroll bars attached to it.

> What you're describing *is* a pager.

Yes, of course, but it's one better matched to the characteristics
of the environment I usually find myself working in nowadays.
Using a pager designed for glass ttys in a Terminal window is
actually *worse* in many ways than just dumping the text out with
no pager at all.

> Perhaps help() should come
> with some more pagers and an easy interface for setting which one is used.

Yes, that would be good.

-- 
Greg

From ben+python at benfinney.id.au Mon Feb 7 23:59:10 2011
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 08 Feb 2011 09:59:10 +1100
Subject: [Python-ideas] Cleaner separation of help() and interactive help.
References: <4D5060C0.3050802@canterbury.ac.nz> <4D506D5C.70207@pearwood.info>
Message-ID: <87oc6nqwn5.fsf@benfinney.id.au>

Steven D'Aprano writes:

> Greg Ewing wrote:

(Greg's message isn't showing up on my Usenet server.)

> > Come to think of it, the whole concept of a pager is something of an
> > anti-feature nowadays, when most people's "terminal" is really a gui
> > window with its own scrolling facilities that are considerably
> > better than the pager's.

Not at all. Most of my GUI terminal windows are running a terminal
multiplexer, which is constantly controlling the whole terminal output
as a full-terminal application. That totally defeats the GUI scrolling
capability, and means that any scrolling I want to occur needs to be
done via the terminal features.

Terminal multiplexers include BSD's “tmux” and GNU Screen.

> This is open source, and people scratch their own itch. I daresay
> help() was written to suit the working processes of the creator.
> That suits me fine, because I like help() just the way it is, I like
> the pager just the way it is, and I don't want it to change. So -1 on
> any change to the default behaviour.

Indeed. Even if it weren't the case that the GUI scrolling capability
were useless in this case, I would still want to use PgUp and PgDn keys
to do it -- which are (correctly) captured by the terminal program, and
not available for GUI scrolling.

> > What I'd really like is for interactive help output to appear in a
> > new window, preferably using my favourite text editor so I can use
> > its searching facilities, which are also considerably better than
> > the pager's.

You might want that, and I can see that it's a valid request.

It's not at all what I want, though. Switching context to another window
costs me valuable attention and “flow”. Having the help appear in
exactly the same window is priceless for maintaining my attention on the
task.

> Personally, I hate it when applications decide to launch additional
> applications without an explicit request.

Yup.

-- 
 \      “It has yet to be proven that intelligence has any survival |
  `\                             value.” -- Arthur C. Clarke, 2000 |
_o__)                                                              |
Ben Finney

From jsbueno at python.org.br Tue Feb 8 00:11:12 2011
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Mon, 7 Feb 2011 21:11:12 -0200
Subject: [Python-ideas] Cleaner separation of help() and interactive help.
In-Reply-To: <87oc6nqwn5.fsf@benfinney.id.au>
References: <4D5060C0.3050802@canterbury.ac.nz> <4D506D5C.70207@pearwood.info> <87oc6nqwn5.fsf@benfinney.id.au>
Message-ID:

On Mon, Feb 7, 2011 at 8:59 PM, Ben Finney wrote:
> Steven D'Aprano writes:
>
>> Greg Ewing wrote:
>
> (Greg's message isn't showing up on my Usenet server.)
>
>> > Come to think of it, the whole concept of a pager is something of an
>> > anti-feature nowadays, when most people's "terminal" is really a gui
>> > window with its own scrolling facilities that are considerably
>> > better than the pager's.
>
> Not at all.
> Most of my GUI terminal windows are running a terminal
> multiplexer, which is constantly controlling the whole terminal output
> as a full-terminal application. That totally defeats the GUI scrolling
> capability, and means that any scrolling I want to occur needs to be
> done via the terminal features.
>

I am -1 for changing the default behavior, and +1 for having a way for
help to simply return a text string.

Maybe the addition of a keyword argument to help() to select this
function could make everyone happy. (Or just leave default help as is,
and add another function like nhelp() or ahelp() to simply return a
string)

  js
 -><-

From tjreedy at udel.edu Tue Feb 8 01:16:31 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 07 Feb 2011 19:16:31 -0500
Subject: [Python-ideas] Add a feature similar to C++ "using some_namespace"
In-Reply-To: <4D504B4A.2030508@enigmage.de>
References: <4D504B4A.2030508@enigmage.de>
Message-ID:

Given that any object can be given a single character name, even as a
temporary alias, this feature seems unnecessary.

> class Spam(object):
>     def frobnicate(self):
>         self.eggs = self.buy_eggs()
>         self.scrambled = self.scramble(self.eggs)
>         return self.scrambled

class Spam(object):
    def frobnicate(self):
        s = self  # or name the parameter 's' instead of 'self'
        s.eggs = s.buy_eggs()
        s.scrambled = s.scramble(s.eggs)
        return s.scrambled

-- 
Terry Jan Reedy

From stephen at xemacs.org Tue Feb 8 01:12:54 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 08 Feb 2011 09:12:54 +0900
Subject: [Python-ideas] Cleaner separation of help() and interactive help.
In-Reply-To: <4D506D5C.70207@pearwood.info>
References: <4D5060C0.3050802@canterbury.ac.nz> <4D506D5C.70207@pearwood.info>
Message-ID: <87fwrzz8mx.fsf@uwakimon.sk.tsukuba.ac.jp>

Steven D'Aprano writes:

> What you're describing *is* a pager. The default pager is the (almost)
> lowest common denominator which should work for anyone anywhere.
> (Possibly not if they're using a teletype.)
> Perhaps help() should come
> with some more pagers and an easy interface for setting which one is used.

+1

And the next one to supply is "cat". ;-)

From tjreedy at udel.edu Tue Feb 8 01:36:31 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 07 Feb 2011 19:36:31 -0500
Subject: [Python-ideas] Cleaner separation of help() and interactive help.
In-Reply-To:
References:
Message-ID:

On 2/7/2011 3:21 PM, Ron Adam wrote:

> It also can be annoying (to me) when I use help and the result is
> cleared after the pager is done with it. That requires me to re-do the
> same help requests to recheck details rather than simply scroll back in
> the current console window.

?? On Windows, both the command prompt and IDLE shell windows keep the
help text. I scroll up and down all the time. Multi-screen text would be
pretty useless otherwise. I consider any other behavior buggy.

Indeed, interposing the pager is a nuisance as it requires multiple
returns before one can get the full text for scrolling. By whatever
means, IDLE does not visibly page but gives the whole text at once. None
is still the return object.

help(ob) returning a string is a different issue. Since the text is
often more than ob.__doc__ (see help(int) versus int.__doc__, for
instance), returning the composed text would at minimum make it easier
to test help, or to check the composed result for a particular module or
class. One might even use the result as a quick ref manual, or a first
draft for one.

-- 
Terry Jan Reedy

From ben+python at benfinney.id.au Tue Feb 8 02:24:55 2011
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 08 Feb 2011 12:24:55 +1100
Subject: [Python-ideas] Cleaner separation of help() and interactive help.
References:
Message-ID: <878vxrqpw8.fsf@benfinney.id.au>

Terry Reedy writes:

> On 2/7/2011 3:21 PM, Ron Adam wrote:
>
> > It also can be annoying (to me) when I use help and the result is
> > cleared after the pager is done with it. [...]
>
> ??
> On Windows, both the command prompt and IDLE shell windows keep the
> help text. I scroll up and down all the time. Multi-screen text would
> be pretty useless otherwise. I consider any other behavior buggy.

The default pager program on many GNU+Linux systems is “less”. The
default behaviour of “less” when it quits is to restore the screen
contents to what they were before the program started.

Actually, I can't find any way to configure “less” not to do that. Of
course, one can choose a different pager by setting the “PAGER”
environment variable::

    $ PAGER=more python
    Python 2.6.6 (r266:84292, Dec 27 2010, 10:20:06)
    [GCC 4.4.5] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os
    >>> help(os)

but I'd prefer to keep “less” and just fix its behaviour on quit. Does
anyone know how?

-- 
 \      “Unix is an operating system, OS/2 is half an operating system, |
  `\     Windows is a shell, and DOS is a boot partition virus.” -- Peter |
_o__)                                                          H. Coffin |
Ben Finney

From mwm at mired.org Tue Feb 8 02:34:54 2011
From: mwm at mired.org (Mike Meyer)
Date: Mon, 7 Feb 2011 20:34:54 -0500
Subject: [Python-ideas] Cleaner separation of help() and interactive help.
In-Reply-To: <878vxrqpw8.fsf@benfinney.id.au>
References: <878vxrqpw8.fsf@benfinney.id.au>
Message-ID: <20110207203454.1525fce7@bhuda.mired.org>

On Tue, 08 Feb 2011 12:24:55 +1100
Ben Finney wrote:

> Terry Reedy writes:
>
> > On 2/7/2011 3:21 PM, Ron Adam wrote:
> >
> > > It also can be annoying (to me) when I use help and the result is
> > > cleared after the pager is done with it. [...]
> >
> > ?? On Windows, both the command prompt and IDLE shell windows keep the
> > help text. I scroll up and down all the time. Multi-screen text would
> > be pretty useless otherwise. I consider any other behavior buggy.
>
> The default pager program on many GNU+Linux systems is “less”. The
> default behaviour of “less”
> when it quits is to restore the screen
> contents to what they were before the program started.

Not quite.

> Actually, I can't find any way to configure “less” not to do that.

That's because this behavior isn't controlled by less, it's controlled
by TERMINFO. For some reason, the xterm terminal de-initialization
strings in TERMINFO clear the screen. This is different from TERMCAP,
and as others have noted, can be really annoying.

One fix for this is to fix the TERMINFO entries, but that's 1) not
universal, and 2) probably another battle lost to the barbarians.

An easier fix is to feed less the "-X" flag (via either the LESS or MORE
environment variables, depending on which your fingers know), which
causes it to not use the TERMINFO initialization/de-initialization
strings. I recommend using -c with it, which clears the screen and draws
from the top, so you get a screen clear at startup.

http://www.mired.org/consulting.html
Independent Network/Unix/Perforce consultant, email for more information.

O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

From rrr at ronadam.com Tue Feb 8 02:50:55 2011
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 07 Feb 2011 19:50:55 -0600
Subject: [Python-ideas] Cleaner separation of help() and interactive help.
In-Reply-To: <4D506D5C.70207@pearwood.info>
References: <4D5060C0.3050802@canterbury.ac.nz> <4D506D5C.70207@pearwood.info>
Message-ID: <4D50A17F.1010009@ronadam.com>

On 02/07/2011 04:08 PM, Steven D'Aprano wrote:
> Greg Ewing wrote:
>> Ron Adam wrote:
>>
>>> It also can be annoying (to me) when I use help and the result is
>>> cleared after the pager is done with it.
>>
>> Yeah, that annoys me too. I can't imagine why anyone thought it was
>> a good idea to design a pager that way. Come to think of it, the
>> whole concept of a pager is something of an anti-feature nowadays,
>> when most people's "terminal" is really a gui window with its own
>> scrolling facilities that are considerably better than the pager's.
Ok, I'm not the only one. ;-)

> Do you have a reliable source for that claim about "most" people that is
> relevant to Python coders? We're not all using Microsoft VisualStudio :)
>
> This is open source, and people scratch their own itch. I daresay help()
> was written to suit the working processes of the creator. That suits me
> fine, because I like help() just the way it is, I like the pager just the
> way it is, and I don't want it to change. So -1 on any change to the
> default behaviour.

I don't want to remove it. Just change where it lives and make non-paged
output easy to get. So I'm looking for the best ways to do that, and
then, out of those ways, to find the one with the fewest objections. :-)

> What you're describing *is* a pager. The default pager is the (almost)
> lowest common denominator which should work for anyone anywhere. (Possibly
> not if they're using a teletype.) Perhaps help() should come with some more
> pagers and an easy interface for setting which one is used.

I'm not sure what you mean by default pager. I guess that would be the
plainpager, which just writes to stdout. But it's not what you get if
anything else works.

The way it works is that when you first call help, it calls the pager
function, which then tries to create a pager from various options,
including looking at the environment variables PAGER and TERM.

It's easier to post the code so you can see for yourself. Look in pydoc
for the rest of it.

In any case, I'm not proposing any changes to the pager.
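
For anyone following along who only wants the non-paged text, here is a
minimal sketch. It is not from the messages in this thread: it assumes a
current Python 3, where pydoc.render_doc() accepts a renderer argument
and pydoc.plaintext is available, and help_text is a hypothetical name
used only for illustration.

```python
import pydoc


def help_text(thing):
    """Return the text that help(thing) would page, as a plain string.

    help_text is a hypothetical helper name, not part of pydoc.
    """
    # pydoc.plaintext renders without the overstrike "bold" markup that
    # the default text renderer emits for pagers.
    return pydoc.render_doc(thing, renderer=pydoc.plaintext)


text = help_text(len)
# The result is an ordinary str, so it can be sliced, searched or saved.
print(text.splitlines()[0])
```

Because the result is an ordinary string, nothing goes through getpager()
at all, so the PAGER and TERM settings have no effect on it.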
Cheers,
   Ron


def pager(text):
    """The first time this is called, determine what kind of pager to use."""
    global pager
    pager = getpager()
    pager(text)

def getpager():
    """Decide what method to use for paging through text."""
    if not hasattr(sys.stdout, "isatty"):
        return plainpager
    if not sys.stdin.isatty() or not sys.stdout.isatty():
        return plainpager
    if 'PAGER' in os.environ:
        if sys.platform == 'win32': # pipes completely broken in Windows
            return lambda text: tempfilepager(plain(text), os.environ['PAGER'])
        elif os.environ.get('TERM') in ('dumb', 'emacs'):
            return lambda text: pipepager(plain(text), os.environ['PAGER'])
        else:
            return lambda text: pipepager(text, os.environ['PAGER'])
    if os.environ.get('TERM') in ('dumb', 'emacs'):
        return plainpager
    if sys.platform == 'win32' or sys.platform.startswith('os2'):
        return lambda text: tempfilepager(plain(text), 'more <')
    if hasattr(os, 'system') and os.system('(less) 2>/dev/null') == 0:
        return lambda text: pipepager(text, 'less')

    import tempfile
    (fd, filename) = tempfile.mkstemp()
    os.close(fd)
    try:
        if hasattr(os, 'system') and os.system('more "%s"' % filename) == 0:
            return lambda text: pipepager(text, 'more')
        else:
            return ttypager
    finally:
        os.unlink(filename)

[skip various pager alternatives ....]

def plainpager(text):
    """Simply print unformatted text.  This is the ultimate fallback."""
    sys.stdout.write(plain(text))

From rrr at ronadam.com Tue Feb 8 05:05:34 2011
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 07 Feb 2011 22:05:34 -0600
Subject: [Python-ideas] Cleaner separation of help() and interactive help.
In-Reply-To:
References:
Message-ID: <4D50C10E.6090606@ronadam.com>

On 02/07/2011 06:36 PM, Terry Reedy wrote:
> On 2/7/2011 3:21 PM, Ron Adam wrote:
>
>> It also can be annoying (to me) when I use help and the result is
>> cleared after the pager is done with it. That requires me to re-do the
>> same help requests to recheck details rather than simply scroll back in
>> the current console window.
>
> ??
> On Windows, both the command prompt and IDLE shell windows keep the help
> text. I scroll up and down all the time. Multi-screen text would be pretty
> useless otherwise. I consider any other behavior buggy.

I'm using Ubuntu now... I used to use (and occasionally switch to)
Windows, which is why it may be annoying for me now. It doesn't act the
same. I think it may be different still on Macs.

I like how the pager uses the arrow and page up keys on Linux systems,
but don't always want that. So I do want to keep it and make it
available for use in other projects.

I also want to simplify how help works and this was one thing that would
do that. But there seems to be some support for the current behaviour.

> Indeed, interposing the pager is a nuisance as it requires multiple returns
> before one can get the full text for scrolling.

Yes, I've run across that before also.

> By whatever means, IDLE
> does not visibly page but gives the whole text at once. None is still the
> return object.

I'm not sure if IDLE is doing anything other than just getting what is
sent to stdout. The pager selection and control is complex enough that
it is hard to tell just what is happening and how.

> help(ob) returning a string is a different issue. Since the text is often
> more than ob.__doc__ (see help(int) versus int.__doc__, for instance),
> returning the composed text would at minimum make it easier to test help,
> or to check the composed result for a particular module or class. One might
> even use the result as a quick ref manual, or a first draft for one.

The complete text (as a string) for each request is sent to the pager
all at once. So the content isn't an issue. The only thing that might be
an issue is the bold markup, which I do like. :-/

Cheers,
   Ron

From ben+python at benfinney.id.au Tue Feb 8 03:00:50 2011
From: ben+python at benfinney.id.au (Ben Finney)
Date: Tue, 08 Feb 2011 13:00:50 +1100
Subject: [Python-ideas] Cleaner separation of help() and interactive help.
References: <878vxrqpw8.fsf@benfinney.id.au> <20110207203454.1525fce7@bhuda.mired.org>
Message-ID: <87zkq7p9nx.fsf@benfinney.id.au>

Mike Meyer writes:

> An easier fix is to feed less the "-X" flag (via either the LESS or
> MORE environment variables, depending on which your fingers know),

Yep, that works for me. Another way is to set the "PAGER" variable to the exact command to be used.

For those experimenting at home, try this::

    $ PAGER="less -cX" python
    Python 2.6.6 (r266:84292, Dec 27 2010, 10:20:06)
    [GCC 4.4.5] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import os
    >>> help(os)

and see whether you like that setting for the "PAGER" variable.

-- 
 \     "Pinky, are you pondering what I'm pondering?" "I think so, |
  `\   Brain, but if the plural of mouse is mice, wouldn't the plural |
_o__)  of spouse be spice?" -- _Pinky and The Brain_ |
Ben Finney

From azrael.zila at gmail.com  Tue Feb  8 12:52:25 2011
From: azrael.zila at gmail.com (Arthur)
Date: Tue, 8 Feb 2011 09:52:25 -0200
Subject: [Python-ideas] Change the array declaration syntax.
Message-ID: 

(Sorry for any spelling errors, I'm using Google translator to write this message)

Lists and tuples are dynamically typed data structures, and that's great! This makes writing code easier, and allows the developer to keep the focus on more important things. However, when these structures contain large numbers of elements, or when the program requires a lot of computational resources, dynamic typing becomes a waste if the lists and tuples do not use it.

One way to avoid wasting resources with dynamic typing where it is unnecessary is to use the array class.
The current syntax for creating arrays is:

>>> from array import * #necessary to create lists of single type
>>> var = array('i',[1,2,3]) #the first argument is the type and the second is the list

This class is the solution when working with lists, but there is nothing similar when working with tuples.
Thus, the variable "var" is a list, not a tuple, even if it is declared with a tuple instead of a list:

>>> var = array('i',(1,2,3))
>>> var
array('i', [1, 2, 3])

I think that, like raw strings in Python 2, lists and tuples could be declared as "single type" using a prefix.

>>> var1 = i[1, 2, 3] # a singletype int list, as a current array
>>> var2 = i('1', '2', '3') # a singletype int tuple, as a current array, but immutable

My suggestion is to allow the element type of a list or tuple to be specified by adding a prefix before '[' or '('.
This would simplify the use of arrays and improve the performance of programs that make use of "single type" lists, besides making it possible to create "single type" tuples.

This seems more efficient: it avoids wasting computing resources without polluting the code or making the language more complicated.

For more information about the array class, the documentation is at http://docs.python.org/library/array.html .

Thank You and Goodbye!
Sig: Arthur Julião
------------------------------------------------------------------------------------------------
"May the road always come to meet you and the wind always be in your favor; may there always be a beer in your hand and your great love by your side." (Tempo Ruim - A Arte do Insulto - Matanza)

From dirkjan at ochtman.nl  Tue Feb  8 13:01:35 2011
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Tue, 8 Feb 2011 13:01:35 +0100
Subject: [Python-ideas] Change the array declaration syntax.
In-Reply-To: 
References: 
Message-ID: 

On Tue, Feb 8, 2011 at 12:52, Arthur wrote:
>>>> var1 = i[1, 2, 3] # a singletype int list, as a current array
>>>> var2 = i('1', '2', '3') # a singletype int tuple, as a current array, but immutable

These statements are already meaningful: the first is a get-item of i with the key (1, 2, 3), the second is a function call of i with the arguments '1', '2' and '3'. This makes it unlikely the syntax will ever change this way.

On the bright side, this means you can easily implement the shortcuts you want in today's Python, if you need them in a project.

Cheers,

Dirkjan

From ncoghlan at gmail.com  Tue Feb  8 15:07:58 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 9 Feb 2011 00:07:58 +1000
Subject: [Python-ideas] Change the array declaration syntax.
In-Reply-To: 
References: 
Message-ID: 

On Tue, Feb 8, 2011 at 9:52 PM, Arthur wrote:
> Thus, the variable "var" is a list, not a tuple, even if it
> is declared with a tuple instead of a list:
>>>> var = array('i',(1,2,3))
>>>> var
> array('i', [1, 2, 3])

Note that an array is its own beast - it just happens to use list notation in its repr, as the square brackets contrast better with the parentheses used for the function call syntax.

Regardless, if you want a quick and easy way to create arrays of particular types, just define your own constructor function:

>>> from array import array
>>> def iarray(*elements):
...     return array('i', elements)
...
>>> x = iarray(1, 2, 3)
>>> x
array('i', [1, 2, 3])

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From azrael.zila at gmail.com  Tue Feb  8 15:34:44 2011
From: azrael.zila at gmail.com (Arthur)
Date: Tue, 8 Feb 2011 12:34:44 -0200
Subject: [Python-ideas] Change the array declaration syntax.
In-Reply-To: 
References: 
Message-ID: 

I thought about how to do it but did not analyze the feasibility, sorry about that.
It could be after the symbols ']' and ')'.
So it would be:

>>> var1 = [1, 2, 3]i # a singletype int list, as a current array
>>> var2 = ('1', '2', '3')i # a singletype int tuple, as a current array, but immutable

See you later!
Signed: Arthur Julião
------------------------------------------------------------------------------------------------
"May the road always come to meet you and the wind always be in your favor; may there always be a beer in your hand and your great love by your side." (Tempo Ruim - A Arte do Insulto - Matanza)

2011/2/8 Nick Coghlan

> On Tue, Feb 8, 2011 at 9:52 PM, Arthur wrote:
> > Thus, the variable "var" is a list, not a tuple, even if it
> > is declared with a tuple instead of a list:
> >>>> var = array('i',(1,2,3))
> >>>> var
> > array('i', [1, 2, 3])
>
> Note that an array is its own beast - it just happens to use list
> notation in its repr, as the square brackets contrast better with the
> parentheses used for the function call syntax.
>
> Regardless, if you want a quick and easy way to create arrays of
> particular types, just define your own constructor function:
>
> >>> from array import array
> >>> def iarray(*elements):
> ...     return array('i', elements)
> ...
> >>> x = iarray(1, 2, 3)
> >>> x
> array('i', [1, 2, 3])
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
>

From masklinn at masklinn.net  Tue Feb  8 15:38:08 2011
From: masklinn at masklinn.net (Masklinn)
Date: Tue, 8 Feb 2011 15:38:08 +0100
Subject: [Python-ideas] Change the array declaration syntax.
In-Reply-To: 
References: 
Message-ID: <6CB733FC-0275-476B-82C2-8636478CC32F@masklinn.net>

On 2011-02-08, at 12:52 , Arthur wrote:
> (Sorry for any spelling errors, I'm using Google translator to write this
> message)
>
> Lists and tuples are data structures dynamically typed, and that's great!
> Makes writing code easier, and allows the developer to keep the focus on
> more important things. However, when these structures contain large amounts
> of elements, or when the program requires a lot of computational resources,
> dynamic typing becomes a waste if the lists and tuples not use it.
>
> One way to avoid wasting resources with dynamic typing where it is
> unnecessary is to use the array class.
> The current syntax for creating arrays is:
>
>>>> from array import * #necessary to create lists of single type

You don't need to import * here. Array is just another type in a module.

>>> import array
>>> a = array.array('i', [1,2,3])
>>> a
array('i', [1, 2, 3])

>>>> var = array('i',[1,2,3]) #the first argument is the type and the second
> is the list
>
> This class is the solution when working with lists, but there is something
> similar when working with tuples.
> Thus, the variable "var" is a list, not a tuple, even if it is declared
> with a tuple instead of a list:
>
>>>> var = array('i',(1,2,3))
>>>> var
> array('i', [1, 2, 3])
>

Since 2.4, arrays can be initialized using any iterable. They're not "lists" or "tuples", they're arrays, a separate data type. You can provide an xrange as initializer if you wish to:

>>> array.array('i', xrange(5))
array('i', [0, 1, 2, 3, 4])

or even provide no initializer at all:

>>> array.array('i')
array('i')

> I think, as well as raw strings in Python 2, lists and tuples could be
> declared a "single type" using a prefix.
>>>> var1 = i[1, 2, 3] # a singletype int list, as a current array
>>>> var2 = i('1', '2', '3') # a singletype int tuple, as a current array,
> but immutable

Not only is this syntax ambiguous with current constructs, as Dirkjan pointed out, I don't think you've made a case for the necessity of an immutable array type so far, let alone for the necessity of a literal syntax for it (why couldn't the `array` module just have an additional type e.g. `frozenarray` similar to `set` and `frozenset`?)
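
The point that `i[1, 2, 3]` and `i(1, 2, 3)` already parse today can be shown with a minimal sketch. The class name `TypedArrayFactory` and the instance name `i` are hypothetical and are not part of the `array` module; the only real API used is `array.array(typecode, iterable)`.

```python
from array import array


class TypedArrayFactory:
    """Hypothetical helper: builds array.array objects of one typecode."""

    def __init__(self, typecode):
        self.typecode = typecode

    def __getitem__(self, key):
        # i[1, 2, 3] passes the tuple (1, 2, 3) as the key,
        # while i[1] passes the bare element, so wrap that case.
        items = key if isinstance(key, tuple) else (key,)
        return array(self.typecode, items)

    def __call__(self, *elements):
        # i(1, 2, 3) is an ordinary call. The elements are type-checked by
        # building an array, then handed back as a plain tuple; this only
        # mimics the "immutable" half of the proposal and is not packed.
        return tuple(array(self.typecode, elements))


i = TypedArrayFactory('i')

var1 = i[1, 2, 3]   # subscription on an existing object, not new syntax
var2 = i(1, 2, 3)   # function call on an existing object, not new syntax
```

Because both spellings already have a meaning, a new prefix syntax would have to change what existing, valid code does, which is the ambiguity raised above.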
> My suggestion is to allow the type of element in the list or tuple is > specified if the addition of a prefix before '[' or '('. > This would simplify the use of arrays and improve the performance of > programs that make use of lists of "single type". Besides creating tuples " > single type". How would it improve the performance of code which does not currently see any need for using `array`? Would the interpreter have to infer the collection's type in order to generate the correct typed array? From tjreedy at udel.edu Tue Feb 8 19:13:55 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 08 Feb 2011 13:13:55 -0500 Subject: [Python-ideas] Change the array declaration syntax. In-Reply-To: References: Message-ID: The only thing one can do with a tuple that cannot be done with a list is hash it for use as a dict key. And this is the only thing that would be gained with an immutable homogeneous array type. But tuple keys are typically so short (two or three items) that there is little need for anything else. Besides which, in 3.x, bytes are immutable sequences of (small) ints and are also available as dict keys. -- Terry Jan Reedy From raymond.hettinger at gmail.com Tue Feb 8 23:47:37 2011 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Tue, 8 Feb 2011 14:47:37 -0800 Subject: [Python-ideas] Changing the name of __pycache__ Message-ID: It would be great if there was some way to change the name to .pycache so that it doesn't pollute directory listings. The dot-naming convention seems to be widely used (.bashrc, .emacs, .hgignore, etc.). Ideally, we should follow that convention also or at least provide a way to make the change locally (perhaps an environment variable). 
Raymond

From solipsis at pitrou.net  Wed Feb  9 00:15:22 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 9 Feb 2011 00:15:22 +0100
Subject: [Python-ideas] Changing the name of __pycache__
References: 
Message-ID: <20110209001522.47f581ad@pitrou.net>

On Tue, 8 Feb 2011 14:47:37 -0800
Raymond Hettinger wrote:
> It would be great if there was some way to change the name to .pycache so that it doesn't pollute directory listings.
>
> The dot-naming convention seems to be widely used (.bashrc, .emacs, .hgignore, etc.). Ideally, we should follow that convention also or at least provide a way to make the change locally (perhaps an environment variable).

The fact that pyc files were not named ".foo.pyc" hints that we want them to be visible, IMO.
Also, I'm not sure how a single __pycache__ directory is worse than N pyc files.

Regards

Antoine.

From mal at egenix.com  Wed Feb  9 10:07:08 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 09 Feb 2011 10:07:08 +0100
Subject: [Python-ideas] Changing the name of __pycache__
In-Reply-To: 
References: 
Message-ID: <4D52593C.5080408@egenix.com>

Raymond Hettinger wrote:
> It would be great if there was some way to change the name to .pycache so that it doesn't pollute directory listings.
>
> The dot-naming convention seems to be widely used (.bashrc, .emacs, .hgignore, etc.). Ideally, we should follow that convention also or at least provide a way to make the change locally (perhaps an environment variable).

While I don't like the name either, I think it's important that this particular aspect is not configurable: there are tools relying on finding the .pyc files based on the location of the .py files and those don't necessarily run in the same environment as the application, e.g. think of all the freeze tools, or situations where the application itself runs as a daemon under a different user account than the one used to administer the application.

Another use case is shipping precompiled packages.
If the user changes the pyc cache dir name, the precompiled versions won't get used. BTW: I wonder how PEP 3147 will support source-less distributions. With previous versions of Python, this was easy: you just remove the .py files. With Python 3.2, removing the .py files and leaving just the files in the pyc cache will cause ImportErrors (see http://www.python.org/dev/peps/pep-3147/#id59). It seems that the only way to "build" a working source-less package is by running a special tool that moves the pyc files to where the source files lived. In addition, this mechanism does not appear to work with the new names, so distribution of packages with pycs for multiple Python versions is not possible. I'm not sure why this was done. It looks like an unnecessary limitation of the PEP. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 09 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ncoghlan at gmail.com Wed Feb 9 11:41:59 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 9 Feb 2011 20:41:59 +1000 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: <4D52593C.5080408@egenix.com> References: <4D52593C.5080408@egenix.com> Message-ID: Raymond's suggestion is specifically addressed (and rejected) in PEP 3147: http://www.python.org/dev/peps/pep-3147/#pyc On Wed, Feb 9, 2011 at 7:07 PM, M.-A. Lemburg wrote: > I'm not sure why this was done. It looks like an unnecessary > limitation of the PEP. 
Because we wanted deleting the .py file to be the "one obvious way" to remove a module, rather than allowing "ghost modules" to hang around in __pycache__. Sourceless imports use the legacy .pyc location to distinguish them as a deliberate distribution choice over the "mere" caching in __pycache__. (Note that the original version of PEP 3147 didn't support sourceless distribution at all).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From mal at egenix.com  Wed Feb  9 12:16:28 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 09 Feb 2011 12:16:28 +0100
Subject: [Python-ideas] Changing the name of __pycache__
In-Reply-To: 
References: <4D52593C.5080408@egenix.com>
Message-ID: <4D52778C.7000109@egenix.com>

Nick Coghlan wrote:
> Raymond's suggestion is specifically addressed (and rejected) in PEP
> 3147: http://www.python.org/dev/peps/pep-3147/#pyc
>
> On Wed, Feb 9, 2011 at 7:07 PM, M.-A. Lemburg wrote:
>> I'm not sure why this was done. It looks like an unnecessary
>> limitation of the PEP.
>
> Because we wanted deleting the .py file to be the "one obvious way" to
> remove a module, rather than allowing "ghost modules" to hang around
> in __pycache__.

Isn't that a rather rare use case nowadays where packages and modules are usually installed as egg directories ?

You typically delete the whole egg dir to remove an installed module/package and don't really care what's inside the directory.

> Sourceless imports use the legacy .pyc location to
> distinguish them as a deliberate distribution choice over the "mere"
> caching in __pycache__. (Note that the original version of PEP 3147
> didn't support sourceless distribution at all).

Ok, but why don't those pyc files support the same add-on as the files in the __pycache__ dir ?

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Feb 09 2011)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...
http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From p.f.moore at gmail.com Wed Feb 9 12:32:55 2011 From: p.f.moore at gmail.com (Paul Moore) Date: Wed, 9 Feb 2011 11:32:55 +0000 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: References: <4D52593C.5080408@egenix.com> Message-ID: On 9 February 2011 10:41, Nick Coghlan wrote: > Raymond's suggestion is specifically addressed (and rejected) in PEP > 3147: http://www.python.org/dev/peps/pep-3147/#pyc > > On Wed, Feb 9, 2011 at 7:07 PM, M.-A. Lemburg wrote: >> I'm not sure why this was done. It looks like an unnecessary >> limitation of the PEP. > > Because we wanted deleting the .py file to be the "one obvious way" to > remove a module, rather than allowing "ghost modules" to hang around > in __pycache__. Sourceless imports use the legacy .pyc location to > distinguish them as a deliberate distribution choice over the "mere" > caching in __pycache__. (Note that the original version of PEP 3147 > didn't support sourceless distribution at all). Note that it looks like bdist_wininst installers don't delete the __pycache__ entries on an uninstall (which I'd argue is an error, but I'm not going to bother until I at least have time to raise an issue for it...) So satisfying imports from __pycache__ without there being a source file would break bdist_wininst uninstallation badly as things stand. Paul. 
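
The `__pycache__` layout discussed in this thread can be inspected from Python itself. The following is a minimal sketch, assuming a CPython new enough to provide `importlib.util.cache_from_source` and `importlib.util.source_from_cache` (3.4 or later; when this thread was written the equivalent functions lived in the `imp` module) and assuming default settings with no alternative cache prefix configured. The path `pkg/spam.py` is purely illustrative and does not need to exist.

```python
import importlib.util
import os

# Illustrative source path; nothing is read from or written to disk.
source = os.path.join("pkg", "spam.py")

# Where the interpreter would cache the bytecode for this source file:
# a __pycache__ directory next to the source, with the interpreter's
# cache tag embedded in the file name.
cached = importlib.util.cache_from_source(source)

# The reverse mapping recovers the source path from the cache path.
round_trip = importlib.util.source_from_cache(cached)

print(cached)
print(round_trip)
```

The interpreter tag in the cached file name is what lets several Python versions share one source tree, and it is also why a bare `spam.pyc` placed next to the source location is treated differently from the tagged files under `__pycache__`.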
From solipsis at pitrou.net Wed Feb 9 12:42:14 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 9 Feb 2011 12:42:14 +0100 Subject: [Python-ideas] Changing the name of __pycache__ References: <4D52593C.5080408@egenix.com> <4D52778C.7000109@egenix.com> Message-ID: <20110209124214.59f0fa61@pitrou.net> On Wed, 09 Feb 2011 12:16:28 +0100 "M.-A. Lemburg" wrote: > Nick Coghlan wrote: > > Raymond's suggestion is specifically addressed (and rejected) in PEP > > 3147: http://www.python.org/dev/peps/pep-3147/#pyc > > > > On Wed, Feb 9, 2011 at 7:07 PM, M.-A. Lemburg wrote: > >> I'm not sure why this was done. It looks like an unnecessary > >> limitation of the PEP. > > > > Because we wanted deleting the .py file to be the "one obvious way" to > > remove a module, rather than allowing "ghost modules" to hang around > > in __pycache__. > > Isn't that a rather rare use case nowadays where packages and > modules are usually installed as egg directories ? > > You typically delete the whole egg dir to remove an installed > module/package and don't really care what's inside the directory. Nick was talking about deleting a single file. As a developer, not as an administrator. Bizarre issues because of a dangling pyc file after removing the py file are now the past, which is a good thing :) Regards Antoine. From ncoghlan at gmail.com Wed Feb 9 13:47:09 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 9 Feb 2011 22:47:09 +1000 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: <20110209124214.59f0fa61@pitrou.net> References: <4D52593C.5080408@egenix.com> <4D52778C.7000109@egenix.com> <20110209124214.59f0fa61@pitrou.net> Message-ID: On Wed, Feb 9, 2011 at 9:42 PM, Antoine Pitrou wrote: > On Wed, 09 Feb 2011 12:16:28 +0100 > "M.-A. Lemburg" wrote: >> Isn't that a rather rare use case nowadays where packages and >> modules are usually installed as egg directories ? 
>>
>> You typically delete the whole egg dir to remove an installed
>> module/package and don't really care what's inside the directory.
>
> Nick was talking about deleting a single file. As a developer, not as
> an administrator. Bizarre issues because of a dangling pyc file
> after removing the py file are now the past, which is a good thing :)

Yep, Antoine picked up exactly what I was getting at here.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Wed Feb  9 14:02:20 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 9 Feb 2011 23:02:20 +1000
Subject: [Python-ideas] Changing the name of __pycache__
In-Reply-To: <4D52778C.7000109@egenix.com>
References: <4D52593C.5080408@egenix.com> <4D52778C.7000109@egenix.com>
Message-ID: 

On Wed, Feb 9, 2011 at 9:16 PM, M.-A. Lemburg wrote:
> Ok, but why don't those pyc files support the same add-on
> as the files in the __pycache__ dir ?

Because the idea was mainly to retain the legacy .pyc support so we didn't break any sourceless distributions that already worked, not to encourage more of them. If people want to target a specific interpreter and ship sourceless, they can do that, or they can target multiple interpreter implementations by shipping the source.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From raymond.hettinger at gmail.com  Wed Feb  9 17:02:15 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Wed, 9 Feb 2011 08:02:15 -0800
Subject: [Python-ideas] Changing the name of __pycache__
In-Reply-To: <4D52593C.5080408@egenix.com>
References: <4D52593C.5080408@egenix.com>
Message-ID: 

On Feb 9, 2011, at 1:07 AM, M.-A. Lemburg wrote:

> Raymond Hettinger wrote:
>> It would be great if there was some way to change the name to .pycache so that it doesn't pollute directory listings.
>>
>> The dot-naming convention seems to be widely used (.bashrc, .emacs, .hgignore, etc.).
Ideally, we should follow that convention also or at least provide a way to make the change locally (perhaps an environment variable). > > While I don't the like name either, I think it's important that this > particular aspect is not configurable: there are tools relying on > finding the .pyc files based on the location of the .py files > and those don't necessarily run in the same environment as the > application, e.g. think of all the freeze tools, or situations > where the application itself runs as daemon under a different > user account than the one used to administer the application. The #define for the name is on line 115 in Python/import.c. If a consensus were to emerge, it would still be possible to change the name from "__pycache__" to ".pycache". Raymond From mal at egenix.com Wed Feb 9 17:13:37 2011 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 09 Feb 2011 17:13:37 +0100 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: References: <4D52593C.5080408@egenix.com> Message-ID: <4D52BD31.6040805@egenix.com> Raymond Hettinger wrote: > > On Feb 9, 2011, at 1:07 AM, M.-A. Lemburg wrote: > >> Raymond Hettinger wrote: >>> It would be great if there was some way to change the name to .pycache so that it doesn't pollute directory listings. >>> >>> The dot-naming convention seems to be widely used (.bashrc, .emacs, .hgignore, etc.). Ideally, we should follow that convention also or at least provide a way to make the change locally (perhaps an environment variable). >> >> While I don't the like name either, I think it's important that this >> particular aspect is not configurable: there are tools relying on >> finding the .pyc files based on the location of the .py files >> and those don't necessarily run in the same environment as the >> application, e.g. think of all the freeze tools, or situations >> where the application itself runs as daemon under a different >> user account than the one used to administer the application. 
> > The #define for the name is on line 115 in Python/import.c. > > If a consensus were to emerge, it would still be possible to > change the name from "__pycache__" to ".pycache". +1 on ".pycache". "__pycache__" looks too much like a special Python package dir to me. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 09 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From ethan at stoneleaf.us Wed Feb 9 18:07:05 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 09 Feb 2011 09:07:05 -0800 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: <4D52BD31.6040805@egenix.com> References: <4D52593C.5080408@egenix.com> <4D52BD31.6040805@egenix.com> Message-ID: <4D52C9B9.5070709@stoneleaf.us> M.-A. Lemburg wrote: > Raymond Hettinger wrote: >> On Feb 9, 2011, at 1:07 AM, M.-A. Lemburg wrote: >> >>> Raymond Hettinger wrote: >>>> It would be great if there was some way to change the name to .pycache so that it doesn't pollute directory listings. >>>> >>>> The dot-naming convention seems to be widely used (.bashrc, .emacs, .hgignore, etc.). Ideally, we should follow that convention also or at least provide a way to make the change locally (perhaps an environment variable). 
>>> While I don't like the name either, I think it's important that this
>>> particular aspect is not configurable: there are tools relying on
>>> finding the .pyc files based on the location of the .py files
>>> and those don't necessarily run in the same environment as the
>>> application, e.g. think of all the freeze tools, or situations
>>> where the application itself runs as daemon under a different
>>> user account than the one used to administer the application.
>> The #define for the name is on line 115 in Python/import.c.
>>
>> If a consensus were to emerge, it would still be possible to
>> change the name from "__pycache__" to ".pycache".
>
> +1 on ".pycache".
>
> "__pycache__" looks too much like a special Python package
> dir to me.
>

+1 on ".pycache" as well.

~Ethan~

From alexandre.conrad at gmail.com  Wed Feb  9 18:17:46 2011
From: alexandre.conrad at gmail.com (Alexandre Conrad)
Date: Wed, 9 Feb 2011 09:17:46 -0800
Subject: [Python-ideas] Changing the name of __pycache__
In-Reply-To: <20110209001522.47f581ad@pitrou.net>
References: <20110209001522.47f581ad@pitrou.net>
Message-ID: 

2011/2/8 Antoine Pitrou :
> The fact that pyc files were not named ".foo.pyc" hints that we
> want them to be visible, IMO.
> Also, I'm not sure how a single __pycache__ directory is worse than N
> pyc files.

Maybe one of the purposes of __pycache__ was to "hide" the existing .pyc files.

I am +1 on .pycache, which would be less intrusive.

-- 
Alex | twitter.com/alexconrad

From solipsis at pitrou.net  Wed Feb  9 18:27:08 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 9 Feb 2011 18:27:08 +0100
Subject: [Python-ideas] Changing the name of __pycache__
References: <20110209001522.47f581ad@pitrou.net>
Message-ID: <20110209182708.6ec1af48@pitrou.net>

On Wed, 9 Feb 2011 09:17:46 -0800
Alexandre Conrad wrote:
> 2011/2/8 Antoine Pitrou :
> > The fact that pyc files were not named ".foo.pyc" hints that we
> > want them to be visible, IMO.
> > Also, I'm not sure how a single __pycache__ directory is worse than N
> > pyc files.
>
> Maybe one of the purposes of __pycache__ was to "hide" the existing .pyc files.

Well, it does. That doesn't mean it has to be hidden itself.

Besides, the main purpose is to allow for multiple cache files per source file (originally a Debian/Ubuntu request, but potentially useful for other people). That it makes directories cleaner than the old scheme is merely a side effect.

Regards

Antoine.

From guido at python.org  Wed Feb  9 18:29:16 2011
From: guido at python.org (Guido van Rossum)
Date: Wed, 9 Feb 2011 09:29:16 -0800
Subject: [Python-ideas] Changing the name of __pycache__
In-Reply-To: <20110209001522.47f581ad@pitrou.net>
References: <20110209001522.47f581ad@pitrou.net>
Message-ID: 

On Tue, Feb 8, 2011 at 3:15 PM, Antoine Pitrou wrote:
> On Tue, 8 Feb 2011 14:47:37 -0800
> Raymond Hettinger
> wrote:
>> It would be great if there was some way to change the name to .pycache so that it doesn't pollute directory listings.
>>
>> The dot-naming convention seems to be widely used (.bashrc, .emacs, .hgignore, etc.). Ideally, we should follow that convention also or at least provide a way to make the change locally (perhaps an environment variable).
>
> The fact that pyc files were not named ".foo.pyc" hints that we
> want them to be visible, IMO.

Right. This was quite the conscious decision when we discussed the new scheme. We want the .pyc files to be out of the way, but we DON'T want them to be completely invisible.

-- 
--Guido van Rossum (python.org/~guido)

From solipsis at pitrou.net  Wed Feb  9 18:30:07 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 9 Feb 2011 18:30:07 +0100
Subject: [Python-ideas] Changing the name of __pycache__
References: <4D52593C.5080408@egenix.com> <4D52BD31.6040805@egenix.com> <4D52C9B9.5070709@stoneleaf.us>
Message-ID: <20110209183007.40fa6b02@pitrou.net>

On Wed, 09 Feb 2011 09:07:05 -0800
Ethan Furman wrote:
> +1 on ".pycache" as well.
Well, unless you propose postponing the forthcoming 3.2 release for that, it's probably too late anyway.

(and of course it's not "just a #define"; there are tests, and probably importlib and other modules relying on it; and the PEP to update too)

That said, I think it is useful that casual users of Python are aware that Python does cache bytecode files. It's not a complex enough notion that there's any point in hiding these from them. After all, explicit is better than implicit.

Regards

Antoine.

From raymond.hettinger at gmail.com  Wed Feb  9 18:45:39 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Wed, 9 Feb 2011 09:45:39 -0800
Subject: [Python-ideas] Changing the name of __pycache__
In-Reply-To: <20110209183007.40fa6b02@pitrou.net>
References: <4D52593C.5080408@egenix.com> <4D52BD31.6040805@egenix.com> <4D52C9B9.5070709@stoneleaf.us> <20110209183007.40fa6b02@pitrou.net>
Message-ID: <3B3983FF-A488-4873-9650-3797FF720A81@gmail.com>

On Feb 9, 2011, at 9:30 AM, Antoine Pitrou wrote:

> On Wed, 09 Feb 2011 09:07:05 -0800
> Ethan Furman wrote:
>> +1 on ".pycache" as well.
>
> Well, unless you propose postponing the forthcoming 3.2 release for
> that, it's probably too late anyway.

Yes, I propose that we do that now (3.2rc2). It is a simple exercise with sed to change it and not hard to get right.

We've gotten +1 on .pycache from: Marc-Andre Lemburg, Ethan Furman, David Malcolm, and me.

AFAICT, the only thing going for __pycache__ is that it is what is already in the tree. So far, no one has said they prefer that name to .pycache.

> (and of course it's not "just a #define"; there are tests, and probably
> importlib and other modules relying on it; and the PEP to update too)
>
> That said, I think it is useful that casual users of Python are aware
> that Python does cache bytecode files. It's not a complex enough notion
> that there's any point in hiding these from them. After all, explicit
> is better than implicit.
The dot-file naming convention is pretty well established. People use "ls" much more than they use "ls -a" because they usually don't want to see those files or directories. IOW, implicit is better when we're talking about system files and caching and whatnot. Raymond From solipsis at pitrou.net Wed Feb 9 19:00:45 2011 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 09 Feb 2011 19:00:45 +0100 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: <3B3983FF-A488-4873-9650-3797FF720A81@gmail.com> References: <4D52593C.5080408@egenix.com> <4D52BD31.6040805@egenix.com> <4D52C9B9.5070709@stoneleaf.us> <20110209183007.40fa6b02@pitrou.net> <3B3983FF-A488-4873-9650-3797FF720A81@gmail.com> Message-ID: <1297274445.3731.10.camel@localhost.localdomain> > > That said, I think it is useful that casual users of Python are aware > > that Python does cache bytecode files. It's not a complex enough notion > > that there's any point in hiding these from them. After all, explicit > > is better than implicit. > > The dot-file naming convention is pretty well established. Well, so what? Should we use any file naming convention even when it's not appropriate? > People use "ls" much more than they use "ls -a" because > they usually don't want to see those files or directories. I'm not sure people who would be annoyed by a __pycache__ directory really want to see any Python source code anyway. You only list Python directories when you are exploring or hacking something, so clearly you are interested in knowing how things work. > IOW, implicit is better when we're talking about system files > and caching and whatnot. Why so? My /var/cache directory is not named /var/.cache, and it's full of non-dotfile entries. 
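[Editorial illustration, not part of the original thread.] The "ls" versus "ls -a" point argued above can be reproduced from Python itself. The following is a minimal sketch, assuming only the standard library: the glob module follows the shell convention of skipping names with a leading dot for a bare "*" pattern, while os.listdir() reports everything. The two directory names are simply the candidates under discussion; the temporary directory is a throwaway.

```python
import glob
import os
import tempfile

# Create both candidate cache directory names in a scratch directory.
with tempfile.TemporaryDirectory() as workdir:
    for name in (".pycache", "__pycache__"):
        os.mkdir(os.path.join(workdir, name))

    # os.listdir() behaves like "ls -a" (minus "." and ".."): nothing is hidden.
    everything = sorted(os.listdir(workdir))

    # A bare "*" pattern behaves like plain "ls": leading-dot names are skipped.
    visible = sorted(
        os.path.basename(path) for path in glob.glob(os.path.join(workdir, "*"))
    )

print(everything)
print(visible)
```

Whether a given file manager or shell hides the dot-name is a separate, platform-specific question, which is the objection several posters raise later in the thread.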
From ethan at stoneleaf.us Wed Feb 9 19:16:50 2011 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 09 Feb 2011 10:16:50 -0800 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: <1297274445.3731.10.camel@localhost.localdomain> References: <4D52593C.5080408@egenix.com> <4D52BD31.6040805@egenix.com> <4D52C9B9.5070709@stoneleaf.us> <20110209183007.40fa6b02@pitrou.net> <3B3983FF-A488-4873-9650-3797FF720A81@gmail.com> <1297274445.3731.10.camel@localhost.localdomain> Message-ID: <4D52DA12.6060304@stoneleaf.us> Antoine Pitrou wrote: > Raymond Hettinger wrote: >> IOW, implicit is better when we're talking about system files >> and caching and whatnot. > > Why so? My /var/cache directory is not named /var/.cache, and it's full > of non-dotfile entries. Huh? Are you saying we should have /var/pycache? 'Cause I'm cool with that. ;) ~Ethan~ From raymond.hettinger at gmail.com Wed Feb 9 19:09:09 2011 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Wed, 9 Feb 2011 10:09:09 -0800 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: References: <20110209001522.47f581ad@pitrou.net> Message-ID: <9E7BF7F1-1F69-4849-81E8-3E2B1009D572@gmail.com> On Feb 9, 2011, at 9:29 AM, Guido van Rossum wrote: > We want the .pyc files to be out of the way, but we DON'T want > them to be completely invisible. That settles it. Thanks for chiming in. Raymond
From alexander.belopolsky at gmail.com Wed Feb 9 19:10:24 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 9 Feb 2011 13:10:24 -0500 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: <3B3983FF-A488-4873-9650-3797FF720A81@gmail.com> References: <4D52593C.5080408@egenix.com> <4D52BD31.6040805@egenix.com> <4D52C9B9.5070709@stoneleaf.us> <20110209183007.40fa6b02@pitrou.net> <3B3983FF-A488-4873-9650-3797FF720A81@gmail.com> Message-ID: On Wed, Feb 9, 2011 at 12:45 PM, Raymond Hettinger wrote: > > On Feb 9, 2011, at 9:30 AM, Antoine Pitrou wrote: > >> On Wed, 09 Feb 2011 09:07:05 -0800 >> Ethan Furman wrote: >>> +1 on ".pycache" as well. >> >> Well, unless you propose postponing the forthcoming 3.2 release for >> that, it's probably too late anyway. > > Yes, I propose that we do that now (3.2rc2). > > It is a simple exercise with sed to change it > and not hard to get right. > > We've gotten +1 on .pycache from: > Mark Lemburg, Ethan Furman, David Malcolm, and me. > FWIW, count me as "-1". I would say: "Don't fix it if it ain't broke". I've just recently started to remember where .pyc files went in recent versions and don't want to relearn the directory name again or worse learn how to figure out the name that changes from setup to setup. I'd say visibility of __pycache__ is a virtue. Users who don't read "what's new" are likely to notice it and learn about the new feature.
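[Editorial illustration, not part of the original thread.] On the worry about having to "figure out the name" of the cache location: in current Python versions the standard library can report it, so tools need not hard-code the directory name. This is a minimal sketch under two assumptions: it uses importlib.util.cache_from_source(), which postdates this thread (Python 3.2 exposed the equivalent as imp.cache_from_source()), and the path "pkg/spam.py" is a made-up example that does not need to exist on disk.

```python
import importlib.util
import os
import sys

# Hypothetical source path, used only to show the mapping; no file is read.
source_path = os.path.join("pkg", "spam.py")

# Ask the import system where the bytecode cache for this source would live.
# With default settings this is inside a __pycache__ directory next to the
# source; if a pycache prefix is configured (Python 3.8+), it moves elsewhere.
cached_path = importlib.util.cache_from_source(source_path)
cached_name = os.path.basename(cached_path)

# The file name embeds an interpreter-specific tag, e.g. "cpython-312", which
# is what lets several interpreters keep caches for one source file.
cache_tag = sys.implementation.cache_tag

print(cached_path)
print(cache_tag)
```

The interpreter tag in the file name is the "multiple cache files per source file" feature that Antoine describes earlier as the main purpose of the scheme.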
From brett at python.org Wed Feb 9 19:15:40 2011 From: brett at python.org (Brett Cannon) Date: Wed, 9 Feb 2011 10:15:40 -0800 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: <3B3983FF-A488-4873-9650-3797FF720A81@gmail.com> References: <4D52593C.5080408@egenix.com> <4D52BD31.6040805@egenix.com> <4D52C9B9.5070709@stoneleaf.us> <20110209183007.40fa6b02@pitrou.net> <3B3983FF-A488-4873-9650-3797FF720A81@gmail.com> Message-ID: On Wed, Feb 9, 2011 at 09:45, Raymond Hettinger wrote: > > On Feb 9, 2011, at 9:30 AM, Antoine Pitrou wrote: > >> On Wed, 09 Feb 2011 09:07:05 -0800 >> Ethan Furman wrote: >>> +1 on ".pycache" as well. >> >> Well, unless you propose postponing the forthcoming 3.2 release for >> that, it's probably too late anyway. > > Yes, I propose that we do that now (3.2rc2). > > It is a simple exercise with sed to change it > and not hard to get right. > > We've gotten +1 on .pycache from: > ? Mark Lemburg, Ethan Furman, David Malcolm, and me. > > AFAICT, the only thing going for __pycache__ is that that > is was is already in the tree. ?So far, no one has said they > prefer that name to .pycache. I honestly didn't think this was going to go that far, but since it is, I will say that I prefer __pycache__. I like visibly knowing that CPython has created files that it is relying upon instead of having to explicitly make sure I do `ls -a` to find out. > > >> (and of course it's not "just a #define"; there are tests, and probably >> importlib and other modules relying on it; and the PEP to update too) >> >> That said, I think it is useful that casual users of Python are aware >> that Python does cache bytecode files. It's not a complex enough notion >> that there's any point in hiding these from them. After all, explicit >> is better than implicit. > > The dot-file naming convention is pretty well established. > People use "ls" much more than they use "ls -a" because > they usually don't want to see those files or directories. On UNIX. 
This does not extend to other platforms like Windows. > > IOW, implicit is better when we're talking about system files > and caching and whatnot. I disagree. I hate the amount of dot files that accumulate in my home directory from various apps that I have used. So when I do have to go digging for some config file I suddenly discover that a bunch of apps have installed "hidden" config files that I now have to either delete or mentally ignore. From rrr at ronadam.com Wed Feb 9 19:40:06 2011 From: rrr at ronadam.com (Ron Adam) Date: Wed, 09 Feb 2011 12:40:06 -0600 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: References: <4D52593C.5080408@egenix.com> Message-ID: <4D52DF86.2000906@ronadam.com> On 02/09/2011 10:02 AM, Raymond Hettinger wrote: > > On Feb 9, 2011, at 1:07 AM, M.-A. Lemburg wrote: > >> Raymond Hettinger wrote: >>> It would be great if there was some way to change the name to .pycache so that it doesn't pollute directory listings. >>> >>> The dot-naming convention seems to be widely used (.bashrc, .emacs, .hgignore, etc.). Ideally, we should follow that convention also or at least provide a way to make the change locally (perhaps an environment variable). >> >> While I don't the like name either, I think it's important that this >> particular aspect is not configurable: there are tools relying on >> finding the .pyc files based on the location of the .py files >> and those don't necessarily run in the same environment as the >> application, e.g. think of all the freeze tools, or situations >> where the application itself runs as daemon under a different >> user account than the one used to administer the application. > > The #define for the name is on line 115 in Python/import.c. > > If a consensus were to emerge, it would still be possible to > change the name from "__pycache__" to ".pycache". -1 I personally don't like hidden directories and files of any type. I think it is very good that python avoids those where it can. 
Hidden files and directories have their own problems. They can be forgotten or missed. If there is a problem associated with a hidden file, it can lead to a lot of wasted time when people look for problems else where because they are not readily aware the hidden files or directories. Please don't use hidden files or directories, the use of a single directory is the best balance of keeping things out of the way, yet not hiding them totally. Ron From rrr at ronadam.com Wed Feb 9 19:48:15 2011 From: rrr at ronadam.com (Ron Adam) Date: Wed, 09 Feb 2011 12:48:15 -0600 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: References: <4D52593C.5080408@egenix.com> <4D52BD31.6040805@egenix.com> <4D52C9B9.5070709@stoneleaf.us> <20110209183007.40fa6b02@pitrou.net> <3B3983FF-A488-4873-9650-3797FF720A81@gmail.com> Message-ID: <4D52E16F.6020601@ronadam.com> On 02/09/2011 12:15 PM, Brett Cannon wrote: >> IOW, implicit is better when we're talking about system files >> and caching and whatnot. > > I disagree. I hate the amount of dot files that accumulate in my home > directory from various apps that I have used. So when I do have to go > digging for some config file I suddenly discover that a bunch of apps > have installed "hidden" config files that I now have to either delete > or mentally ignore. I completely agree! It would have been much better if these were all put into a single directory and not hidden. Just as python does. ;-) I also hate the practice of hiding file extensions. Cheers, Ron From mal at egenix.com Wed Feb 9 19:59:15 2011 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 09 Feb 2011 19:59:15 +0100 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: References: <4D52593C.5080408@egenix.com> <4D52778C.7000109@egenix.com> Message-ID: <4D52E403.5050205@egenix.com> Nick Coghlan wrote: > On Wed, Feb 9, 2011 at 9:16 PM, M.-A. 
Lemburg wrote: >> Ok, but why don't those pyc files support the same add-on >> as the files in the __pycache__ dir ? > > Because the idea was mainly to retain the legacy .pyc support so we > didn't break any sourceless distributions that already worked, not to > encourage more of them. If people want to target a specific > interpreter and ship sourceless, they can do that, or they can target > multiple interpreter implementations by shipping the source. Why alienate sourceless distributions by not supporting the same logic in the main package dir ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 09 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From steve at pearwood.info Wed Feb 9 20:14:24 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 10 Feb 2011 06:14:24 +1100 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: References: <4D52593C.5080408@egenix.com> Message-ID: <4D52E790.5050107@pearwood.info> Raymond Hettinger wrote: > If a consensus were to emerge, it would still be possible to > change the name from "__pycache__" to ".pycache". -1. Please don't encourage the Unix anti-pattern of scattering invisible breadcrumbs all throughout your work-area. Besides, unless I'm misinformed, such dot files aren't invisible in Windows systems (or Mac?), so the fundamental assumption that changing the name will make it invisible will be wrong for many, perhaps most, users. 
I don't particularly like the name __pycache__ but it does match the Python convention of using double-underscore names. Otherwise, it risks clashing with a user's own directory. I *far* prefer it over .pycache. -- Steven From alexandre.conrad at gmail.com Wed Feb 9 21:56:40 2011 From: alexandre.conrad at gmail.com (Alexandre Conrad) Date: Wed, 9 Feb 2011 12:56:40 -0800 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: <4D52E790.5050107@pearwood.info> References: <4D52593C.5080408@egenix.com> <4D52E790.5050107@pearwood.info> Message-ID: 2011/2/9 Steven D'Aprano : > Raymond Hettinger wrote: > >> If a consensus were to emerge, it would still be possible to change the >> name from "__pycache__" to ".pycache". > > -1. > > Please don't encourage the Unix anti-pattern of scattering invisible > breadcrumbs all throughout your work-area. > > Besides, unless I'm misinformed, such dot files aren't invisible in Windows > systems (or Mac?), so the fundamental assumption that changing the name will > make it invisible will be wrong for many, perhaps most, users. Yes, I did think about that afterwards. > I don't particularly like the name __pycache__ but it does match the Python > convention of using double-underscore names. Otherwise, it risks clashing > with a user's own directory. I *far* prefer it over .pycache. "dunder" naming is a Python convention and is OK for Python code. Even though I am not a big fan of the __init__.py file, at least the user created it for his python package to be seen. Whereas __pycache__ conveys the idea that it has a special meaning (such as __init__.py) and suggests it may alter your application's behavior, which it does not. Just nitpicking, I guess. So I remove my +1 for the ".pycache" idea (for the reason that being hidden is platform specific) and my call is now +0.
-- Alex | twitter.com/alexconrad From greg.ewing at canterbury.ac.nz Wed Feb 9 22:36:34 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 10 Feb 2011 10:36:34 +1300 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: References: <4D52593C.5080408@egenix.com> <4D52BD31.6040805@egenix.com> <4D52C9B9.5070709@stoneleaf.us> <20110209183007.40fa6b02@pitrou.net> <3B3983FF-A488-4873-9650-3797FF720A81@gmail.com> Message-ID: <4D5308E2.6040304@canterbury.ac.nz> Brett Cannon wrote: > I like visibly knowing that > CPython has created files that it is relying upon instead of having to > explicitly make sure I do `ls -a` to find out. Also please keep in mind: * The MacOSX Finder doesn't have any equivalent of the '-a' option; dot-files are completely invisible to it. * Dot-files have no special meaning in Windows, so '.pycache' would be just as visible as '__pycache__' there. -- Greg From greg.ewing at canterbury.ac.nz Wed Feb 9 22:50:08 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 10 Feb 2011 10:50:08 +1300 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: References: <4D52593C.5080408@egenix.com> <4D52E790.5050107@pearwood.info> Message-ID: <4D530C10.6020204@canterbury.ac.nz> Alexandre Conrad wrote: > Even > though I am not a big fan of the __init__.py file, at least the user > created it for his python package to be seen. I don't see how the interpreter creating a __pycache__ directory is any more mysterious than it creating a bunch of .pyc files. And the presence of the word 'cache' in the name gives one a fairly good clue as to what it's about. 
-- Greg From ben+python at benfinney.id.au Wed Feb 9 22:58:19 2011 From: ben+python at benfinney.id.au (Ben Finney) Date: Thu, 10 Feb 2011 08:58:19 +1100 Subject: [Python-ideas] Changing the name of __pycache__ References: <4D52593C.5080408@egenix.com> <4D52E790.5050107@pearwood.info> Message-ID: <87d3n0q39g.fsf@benfinney.id.au> Steven D'Aprano writes: > Raymond Hettinger wrote: > > > If a consensus were to emerge, it would still be possible to change > > the name from "__pycache__" to ".pycache". +1 from me. > -1. > > Please don't encourage the Unix anti-pattern of scattering invisible > breadcrumbs all throughout your work-area. The anti-pattern is the scattering of breadcrumbs. I agree with discouraging that practice. But Python already breaks that, and PEP 3147 is an attempt at making the practice less messy. The use of leading-dot names to make them invisible is a good feature. If breadcrumbs must be scattered, at least keep them out of the way in normal use. > Besides, unless I'm misinformed, such dot files aren't invisible in > Windows systems (or Mac?), so the fundamental assumption that changing > the name will make it invisible will be wrong for many, perhaps most, > users. Then ".pycache" is no harm on such systems, surely? -- "Those are my principles. If you don't like them I have others." (Groucho Marx) Ben Finney From ncoghlan at gmail.com Wed Feb 9 23:38:57 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 10 Feb 2011 08:38:57 +1000 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: <3B3983FF-A488-4873-9650-3797FF720A81@gmail.com> References: <4D52593C.5080408@egenix.com> <4D52BD31.6040805@egenix.com> <4D52C9B9.5070709@stoneleaf.us> <20110209183007.40fa6b02@pitrou.net> <3B3983FF-A488-4873-9650-3797FF720A81@gmail.com> Message-ID: On Thu, Feb 10, 2011 at 3:45 AM, Raymond Hettinger wrote: > We've gotten +1 on .pycache from: > Mark Lemburg, Ethan Furman, David Malcolm, and me.
> > AFAICT, the only thing going for __pycache__ is that that > is was is already in the tree. So far, no one has said they > prefer that name to .pycache. I expect a lot of folks involved in the original choice of __pycache__ as the name figured this discussion would fizzle out, and hence didn't reply, especially since PEP 3147 already made this choice explicit with Guido backing the idea for the same reasons given in this thread (http://www.python.org/dev/peps/pep-3147/#pyc) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Wed Feb 9 23:44:06 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 10 Feb 2011 08:44:06 +1000 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: <4D52E403.5050205@egenix.com> References: <4D52593C.5080408@egenix.com> <4D52778C.7000109@egenix.com> <4D52E403.5050205@egenix.com> Message-ID: On Thu, Feb 10, 2011 at 4:59 AM, M.-A. Lemburg wrote: > > Why alienate sourceless distributions by not supporting the same > logic in the main package dir ? Because a large number of the people involved in the PEP 3147 discussions wanted to drop support for sourceless imports completely, including the folks doing the implementation work. Retaining legacy sourceless imports was a compromise that preserved existing functionality without increasing the implementation effort needed for the PEP. Changing that will require a patch and advocacy from folks that are actually *fans* of sourceless distribution, as opposed to merely tolerating them as an arguably necessary evil. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From raymond.hettinger at gmail.com Thu Feb 10 00:49:22 2011 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Wed, 9 Feb 2011 15:49:22 -0800 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: References: <4D52593C.5080408@egenix.com> <4D52BD31.6040805@egenix.com> <4D52C9B9.5070709@stoneleaf.us> <20110209183007.40fa6b02@pitrou.net> <3B3983FF-A488-4873-9650-3797FF720A81@gmail.com> Message-ID: On Feb 9, 2011, at 2:38 PM, Nick Coghlan wrote: > I expect a lot folks involved in the original choice of __pycache__ as > the name figured this discussion would fizzle out, and hence didn't > reply, especially since PEP 3147 already made this choice explicit > with Guido backing the idea for the same reasons given in this thread > (http://www.python.org/dev/peps/pep-3147/#pyc) Thanks for the link. I already dropped/retracted this idea after Guido chimed-in this morning, and I had not been aware that there was a previous discussion (I read tons of python email, docs, peps, etc but missed this particular discussion). It seems that a lot of people on this list (Guido most importantly) want to see the directory, and folks think the bytecode cache is somewhat different than .svn or .hg directories. Raymond From mal at egenix.com Thu Feb 10 01:35:01 2011 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 10 Feb 2011 01:35:01 +0100 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: References: <4D52593C.5080408@egenix.com> <4D52778C.7000109@egenix.com> <4D52E403.5050205@egenix.com> Message-ID: <4D5332B5.3000907@egenix.com> Nick Coghlan wrote: > On Thu, Feb 10, 2011 at 4:59 AM, M.-A. Lemburg wrote: >> >> Why alienate sourceless distributions by not supporting the same >> logic in the main package dir ? > > Because a large number of the people involved in the PEP 3147 > discussions wanted to drop support for sourceless imports completely, > including the folks doing the implementation work. 
Retaining legacy > sourceless imports was a compromise that preserved existing > functionality without increasing the implementation effort needed for > the PEP. I must have missed that discussion and couldn't find it in the python-dev archives either. Otherwise I would have chimed in earlier. Do you have a pointer ? > Changing that will require a patch and advocacy from folks that are > actually *fans* of sourceless distribution, as opposed to merely > tolerating them as an arguably necessary evil. What's evil about a sourceless distribution ? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Feb 10 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From greg.ewing at canterbury.ac.nz Thu Feb 10 03:43:06 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 10 Feb 2011 15:43:06 +1300 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: References: <4D52593C.5080408@egenix.com> <4D52778C.7000109@egenix.com> <4D52E403.5050205@egenix.com> Message-ID: <4D5350BA.6090805@canterbury.ac.nz> On 10/02/11 11:44, Nick Coghlan wrote: > Changing that will require a patch and advocacy from folks that are > actually *fans* of sourceless distribution, as opposed to merely > tolerating them as an arguably necessary evil. Labelling sourceless distributions as a "necessary evil" comes across to me as a political stance, not based on any technical argument. 
I don't think that this kind of ideological thinking should have any place in deciding what goes into Python. -- Greg From stephen at xemacs.org Thu Feb 10 08:04:46 2011 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Thu, 10 Feb 2011 16:04:46 +0900 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: <4D5350BA.6090805@canterbury.ac.nz> References: <4D52593C.5080408@egenix.com> <4D52778C.7000109@egenix.com> <4D52E403.5050205@egenix.com> <4D5350BA.6090805@canterbury.ac.nz> Message-ID: <87wrl8e5f5.fsf@uwakimon.sk.tsukuba.ac.jp> Greg Ewing writes: > On 10/02/11 11:44, Nick Coghlan wrote: > > > Changing that will require a patch and advocacy from folks that are > > actually *fans* of sourceless distribution, as opposed to merely > > tolerating them as an arguably necessary evil. > > Labelling sourceless distributions as a "necessary evil" comes > across to me as a political stance, not based on any technical > argument. I don't think that this kind of ideological thinking > should have any place in deciding what goes into Python. Note that Nick is *not* doing any labeling in that post (I don't know what his actual opinion is). He is saying that the people doing the work don't want to do this, and it's up to those who want it to do that work, which at this point will include "not screwing up the existing proposal." I grant that it's likely that there will be ideology-based responses of "oh, let's not do that at all", but that's not where Nick is coming from in the quoted post. 
From tjreedy at udel.edu Thu Feb 10 08:31:40 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Thu, 10 Feb 2011 02:31:40 -0500 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: <3B3983FF-A488-4873-9650-3797FF720A81@gmail.com> References: <4D52593C.5080408@egenix.com> <4D52BD31.6040805@egenix.com> <4D52C9B9.5070709@stoneleaf.us> <20110209183007.40fa6b02@pitrou.net> <3B3983FF-A488-4873-9650-3797FF720A81@gmail.com> Message-ID: On 2/9/2011 12:45 PM, Raymond Hettinger wrote: > We've gotten +1 on .pycache from: > Mark Lemburg, Ethan Furman, David Malcolm, and me. A very biased sample. Knowing/remembering that the matter was already discussed and decided, I had no reason to say anything. On Windows, trying to rename a folder from 'New Folder' to '.pycache' (or anything beginning with '.') FAILS with "You must type a file name." Ditto for ordinary files. Whether the illegal name can be forced with Windows API calls that bypass the user-level check, I do not know. > AFAICT, the only thing going for __pycache__ is that that > is was is already in the tree. On Windows, it is a legal filename ;-). -- Terry Jan Reedy From ncoghlan at gmail.com Thu Feb 10 12:42:24 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 10 Feb 2011 21:42:24 +1000 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: References: <4D52593C.5080408@egenix.com> <4D52BD31.6040805@egenix.com> <4D52C9B9.5070709@stoneleaf.us> <20110209183007.40fa6b02@pitrou.net> <3B3983FF-A488-4873-9650-3797FF720A81@gmail.com> Message-ID: On Thu, Feb 10, 2011 at 5:31 PM, Terry Reedy wrote: > On Windows, trying to rename a folder from 'New Folder' to '.pycache' (or > anything beginning with '.') FAILS with "You must type a file name." Ditto > for ordinary files. Whether the illegal name can be forced with Windows API > calls that bypass the user-level check, I do not know. You can create dot-files from the Windows command prompt. 
Windows Explorer just has additional prejudices as to what constitutes a valid filename. Dot-files are definitely a point of pain when dealing with Unix-ish programs on Windows, though. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From ncoghlan at gmail.com Thu Feb 10 13:04:35 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 10 Feb 2011 22:04:35 +1000 Subject: [Python-ideas] Changing the name of __pycache__ In-Reply-To: <87wrl8e5f5.fsf@uwakimon.sk.tsukuba.ac.jp> References: <4D52593C.5080408@egenix.com> <4D52778C.7000109@egenix.com> <4D52E403.5050205@egenix.com> <4D5350BA.6090805@canterbury.ac.nz> <87wrl8e5f5.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: On Thu, Feb 10, 2011 at 5:04 PM, Stephen J. Turnbull wrote: > Greg Ewing writes: > > On 10/02/11 11:44, Nick Coghlan wrote: > > > > > Changing that will require a patch and advocacy from folks that are > > > actually *fans* of sourceless distribution, as opposed to merely > > > tolerating them as an arguably necessary evil. > > > > Labelling sourceless distributions as a "necessary evil" comes > > across to me as a political stance, not based on any technical > > argument. I don't think that this kind of ideological thinking > > should have any place in deciding what goes into Python. > > Note that Nick is *not* doing any labeling in that post (I don't know > what his actual opinion is). He is saying that the people doing the > work don't want to do this, and it's up to those who want it to do > that work, which at this point will include "not screwing up the > existing proposal." Yeah, I was just intending to relay the tone of the original discussion (which definitely acquired an ideological flavour at times). As I recall, I was one of those arguing that there are valid, practical use cases for sourceless distribution. However, those use cases were adequately met by the simple solution in the PEP (i.e.
retaining support for substituting a bytecode file in place of a source file simply by changing the extension), so that compromise satisfied all parties involved in the discussion at the time. The best reference I found summarising the situation is here: http://www.mail-archive.com/python-dev at python.org/msg45924.html So apparently there was a fair bit of in person discussion at the PyCon 2010 language summit as well. The flurry of PyCon related list activity around that time would also explain why several developers missed the discussion. All that said, you could definitely extend the PEP 3147 idea in 3.3 to allow sourceless imports into multiple Python interpreters in a single directory, I'm just not sure I see the point. The whole goal of PEP 3147 is to allow multiple interpreters to share a single source file without fighting over a single cache location, while still keeping the cached files near the original source files. There's no such sharing benefit when it comes to sourceless distributions, so why not simply have a separate directory hierarchy per interpreter and use the basic file name? Is the desire to do this really common enough to add yet-another-stat-call to the import sequence? Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Thu Feb 10 14:17:51 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 10 Feb 2011 23:17:51 +1000 Subject: [Python-ideas] Alternative formatting styles for logging events in Python 3.3 Message-ID: Via the new "style" parameter to logging.Formatter objects, Python 3.2 adds support for newer formatting styles (str.format, string.template) when defining output formats for log messages. However, actual logging calls are still constrained to using %-formatting if they want to benefit from the "lazy formatting" feature (you can obviously generate pre-formatted messages any way you like). 
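[Editorial illustration, not part of the original thread.] The asymmetry described in the paragraph above can be seen in a few lines. This is a minimal sketch, assuming Python 3.2 or later: the output format is written in str.format style through the Formatter "style" parameter, while the logging calls themselves still pass %-style arguments so that formatting stays lazy. The logger name "style_demo" and the Expensive helper class are made up for the demonstration.

```python
import io
import logging


class Expensive:
    """Hypothetical argument that counts how often it is rendered to text."""

    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "expensive"


# Output side: the handler's format string uses {}-style fields (new in 3.2).
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("{levelname}:{name}:{message}", style="{"))

logger = logging.getLogger("style_demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep the demo output out of the root logger

# Input side: the message and its arguments are still merged with %-formatting,
# and only if the record is actually emitted.
logger.debug("value is %s", Expensive())  # below INFO: argument never rendered
logger.info("value is %s", Expensive())   # emitted: argument rendered once

output = stream.getvalue()
print(output, end="")
print(Expensive.calls)
```

The proposal that follows is about letting the message passed to debug(), info() and friends use the newer styles as well, without giving up this deferred formatting.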
For 3.3, I'd like to propose extending this flexibility to the input side as well:

1. Add an optional style parameter to logging.Logger and logging.getLogger. This would then become the "default style" for any messages logged using that logger. In the case of getLogger, if the logger already exists and the styles don't match, raise an exception.

2. Add an optional style parameter to the Logger event recording methods (debug(), info(), et al) and the module level convenience functions. If supplied, this would override the default choice configured in the logger for that particular message.

By default, all loggers (including the root logger) would continue to expect %-formatting. However, applications and libraries would be free to use the alternative formatting for their own logging without affecting other loggers.

Thoughts?

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From mal at egenix.com Thu Feb 10 14:31:43 2011
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 10 Feb 2011 14:31:43 +0100
Subject: [Python-ideas] Changing the name of __pycache__
In-Reply-To:
References: <4D52593C.5080408@egenix.com> <4D52778C.7000109@egenix.com> <4D52E403.5050205@egenix.com> <4D5350BA.6090805@canterbury.ac.nz> <87wrl8e5f5.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4D53E8BF.2090806@egenix.com>

Nick Coghlan wrote:
> On Thu, Feb 10, 2011 at 5:04 PM, Stephen J. Turnbull wrote:
>> Greg Ewing writes:
>> > On 10/02/11 11:44, Nick Coghlan wrote:
>> >
>> > > Changing that will require a patch and advocacy from folks that are
>> > > actually *fans* of sourceless distribution, as opposed to merely
>> > > tolerating them as an arguably necessary evil.
>> >
>> > Labelling sourceless distributions as a "necessary evil" comes
>> > across to me as a political stance, not based on any technical
>> > argument. I don't think that this kind of ideological thinking
>> > should have any place in deciding what goes into Python.
>>
>> Note that Nick is *not* doing any labeling in that post (I don't know
>> what his actual opinion is). He is saying that the people doing the
>> work don't want to do this, and it's up to those who want it to do
>> that work, which at this point will include "not screwing up the
>> existing proposal."
>
> Yeah, I was just intending to relay the tone of the original
> discussion (which definitely acquired an ideological flavour at
> times). As I recall, I was one of those arguing that there are valid,
> practical use cases for sourceless distribution. However, those use
> cases were adequately met by the simple solution in the PEP (i.e.
> retaining support for substituting a bytecode file in place of a
> source file simply by changing the extension), so that compromise
> satisfied all parties involved in the discussion at the time.
>
> The best reference I found summarising the situation is here:
> http://www.mail-archive.com/python-dev at python.org/msg45924.html

Thanks for the link. Now I know why I didn't spot this... I would never have assumed such a discussion under a subject line "__file__" :-)

> So apparently there was a fair bit of in person discussion at the
> PyCon 2010 language summit as well. The flurry of PyCon related list
> activity around that time would also explain why several developers
> missed the discussion.
>
> All that said, you could definitely extend the PEP 3147 idea in 3.3 to
> allow sourceless imports into multiple Python interpreters in a single
> directory, I'm just not sure I see the point.

The point is that creating sourceless distros is quite easy in Python2 and it's a feature commercial apps often want to use. They provide a way to protect your code, but are also useful to trim down the size of a distribution (the source code is not needed to use a package).

> The whole goal of PEP 3147 is to allow multiple interpreters to share
> a single source file without fighting over a single cache location,
> while still keeping the cached files near the original source files.
> There's no such sharing benefit when it comes to sourceless
> distributions, so why not simply have a separate directory hierarchy
> per interpreter and use the basic file name? Is the desire to do this
> really common enough to add yet-another-stat-call to the import
> sequence?

Oh yes there is :-) People often download the wrong archives for their Python version and having the possibility to ship the files for all supported Python versions in one package would make them very happy - pretty much for the same reasons the PEP makes distro maintainers happy. (And I don't really see why Linux distro maintainers are any more special than people wanting to create sourceless distributions.)

Also note that this won't be another stat call everybody will have to pay for: the import logic would only fall back to the alternative pyc location in case it doesn't find the .py file, so sourceful distributions would not see any extra stat calls.

You do add another stat call before raising the ImportError, but that's not really all that much of an issue.

Besides, it seems no one is really worried about stat calls anymore anyway... just check how many stat calls are needed to get an interpreter started up with a few eggs sitting in site-packages.

Let's add this to Python 3.3.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Feb 10 2011)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free !
::::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From ncoghlan at gmail.com Thu Feb 10 14:54:09 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 10 Feb 2011 23:54:09 +1000
Subject: [Python-ideas] Changing the name of __pycache__
In-Reply-To: <4D53E8BF.2090806@egenix.com>
References: <4D52593C.5080408@egenix.com> <4D52778C.7000109@egenix.com> <4D52E403.5050205@egenix.com> <4D5350BA.6090805@canterbury.ac.nz> <87wrl8e5f5.fsf@uwakimon.sk.tsukuba.ac.jp> <4D53E8BF.2090806@egenix.com>
Message-ID:

On Thu, Feb 10, 2011 at 11:31 PM, M.-A. Lemburg wrote:
> Oh yes there is :-) People often download the wrong archives
> for their Python version and having the possibility to ship
> the files for all supported Python versions in one package
> would make them very happy - pretty much for the same reasons
> the PEP makes distro maintainers happy. (And I don't really
> see why Linux distro maintainers are any more special than
> people wanting to create sourceless distributions.)

A fair point.

> Also note that this won't be another stat call everybody will have
> to pay for: the import logic would only fall back to the alternative
> pyc location in case it doesn't find the .py file, so sourceful
> distributions would not see any extra stat calls.
>
> You do add another stat call before raising the ImportError, but
> that's not really all that much of an issue.

Our stat call counts are actually per-directory-on-sys.path, so even sourceful distributions see the count go up.

> Besides, it seems noone is really worried about stat calls anymore
> anyway... just check how many stat calls are needed to get an
> interpreter started up with a few eggs sitting in site-packages.
I forgot which list it was on (it might even have been the tracker), but there was certainly a request to look at reducing the number of stat calls for Python 3.3.

> Let's add this to Python 3.3.

I certainly wouldn't object to such a change. A PEP might be advisable to hammer out details like what (if anything) to do with __file__ and the compileall command line update to request inclusion of the magic tag in the generated bytecode files. Ideally, such an interface would allow the bytecode compilation to target a separate directory to further simplify the task of generating bytecode that is separated from its original source code.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From solipsis at pitrou.net Thu Feb 10 16:05:36 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 10 Feb 2011 16:05:36 +0100
Subject: [Python-ideas] Changing the name of __pycache__
References: <4D52593C.5080408@egenix.com> <4D52778C.7000109@egenix.com> <4D52E403.5050205@egenix.com> <4D5350BA.6090805@canterbury.ac.nz>
Message-ID: <20110210160536.6e3f60d4@pitrou.net>

On Thu, 10 Feb 2011 15:43:06 +1300 Greg Ewing wrote:
> On 10/02/11 11:44, Nick Coghlan wrote:
>
> > Changing that will require a patch and advocacy from folks that are
> > actually *fans* of sourceless distribution, as opposed to merely
> > tolerating them as an arguably necessary evil.
>
> Labelling sourceless distributions as a "necessary evil" comes
> across to me as a political stance, not based on any technical
> argument.

The poor debuggability of compiled-only code is certainly a technical argument (and a rather strong one). You might not like the fact that a technical argument can be used to inform political opinions, but that doesn't make that argument any less true.

Regards

Antoine.
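
To make both sides of this exchange concrete, here is a small sketch, assuming a recent Python 3. It builds a bytecode-only module in the "legacy" location (a .pyc file sitting where the .py file would be), imports it, and then formats a traceback raised from inside it. The module name eggs_demo, its two functions, and the temporary directory are invented for the illustration.

```python
import importlib
import os
import py_compile
import sys
import tempfile
import traceback

# Write a tiny throwaway module (hypothetical name: eggs_demo).
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "eggs_demo.py")
with open(source_path, "w") as f:
    f.write(
        "def answer():\n"
        "    return 42\n"
        "\n"
        "def scramble():\n"
        "    raise ValueError('cracked shell')\n"
    )

# Compile next to the source (not into __pycache__), then drop the source,
# which is the "change the extension" style of sourceless distribution.
legacy_pyc = os.path.join(workdir, "eggs_demo.pyc")
py_compile.compile(source_path, cfile=legacy_pyc, doraise=True)
os.remove(source_path)

sys.path.insert(0, workdir)
importlib.invalidate_caches()
eggs_demo = importlib.import_module("eggs_demo")

# The bytecode-only module imports and runs normally...
result = eggs_demo.answer()

# ...but a traceback through it cannot display the failing source line.
try:
    eggs_demo.scramble()
except ValueError:
    report = traceback.format_exc()

print(result)
print(report)
```

The traceback still names the function and line number, but the text of the offending line is missing because there is no .py file left to read it from. For whole trees, `python -m compileall -b` writes bytecode to the same legacy locations.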

From ncoghlan at gmail.com Fri Feb 11 00:00:27 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 11 Feb 2011 09:00:27 +1000
Subject: [Python-ideas] Alternative formatting styles for logging events in Python 3.3
In-Reply-To: <773057.21981.qm@web25805.mail.ukl.yahoo.com>
References: <773057.21981.qm@web25805.mail.ukl.yahoo.com>
Message-ID:

On Fri, Feb 11, 2011 at 3:21 AM, Vinay Sajip wrote:
> Perhaps this style could be emphasised in the stdlib documentation. The
> XXXMessage classes could be brought into stdlib (not something I'm particularly
> advocating, mind you).

Given that the ability to pass in something other than a %-formatting format string isn't even *mentioned* in the API documentation for logging.debug [1][2] this trick could definitely use some additional exposure.

While this existing capability definitely makes the per-event part of my suggestion redundant, I think the per-logger part of it still has some merit. If loggers are defined as inheriting their formatting style from their parent loggers when an explicit style isn't provided, then an application or library that wants to use an alternate formatting style only needs to set it up once on their primary logger and away they go. (Creating the root logger with a formatting style other than '%' would obviously be a bad idea, so it may be worth issuing an explicit warning when someone does that)

Cheers,
Nick.

[1] http://docs.python.org/dev/library/logging#logging.Logger.debug
[2] http://docs.python.org/dev/library/logging#logging.debug

--
Nick Coghlan   |   ncoghlan at gmail.com   |
Brisbane, Australia

From kgeza7 at gmail.com Sat Feb 12 04:48:38 2011
From: kgeza7 at gmail.com (Géza)
Date: Fri, 11 Feb 2011 19:48:38 -0800
Subject: [Python-ideas] adding possibility for declaring a function in Matlab's way
In-Reply-To:
References:
Message-ID: <7C2FD842074A470387F4F83B280C22A3@GezaVAIO>

It would be nice if you could write a function header like this (besides, of course, the current way):

    def result=functionname(params):
        ...
        result=something

This would suffice for most functions, since you usually return one type of value, and it can be very convenient in certain cases. This is the way it is done e.g. in Matlab, which also has a huge user base.

Some more details to the idea:
- The return values are initialized to None.
- Setting the return values does not need to be the last line in the function.
- You can use the "return" statement just as before, but without arguments, to return from anywhere in the code.
- If you specify arguments to the "return" statement, Python stops with an exception.
- The return value can be a tuple: def (result1, result2, result3)=functionname(parameters)

Some advantages:
- You can easily see what the function returns, without having to read the function body, or hoping to find it in the comments.
- You can initialize the return values (if None is not good enough), and then care about the cases where they change.
- If you change the setup of the return value (e.g. insert a new item into the tuple), you do not need to change the "return" statement at possibly several places in the function body.
- It is very easy to write the function call prototype: just copy the function declaration without the "def" and final colon. Python GUIs will be able to do the same, thus not only giving you the function parameter template automatically, but also the return value template.

Some disadvantages:
- I suggest it as an addition to the current way, so there isn't any serious disadvantage. One person may decide to use one way, one the other.
- Of course, if you mix the two types of function declarations in your software, you may need to look at the function header to see which one you used in the specific case.
- You need to be aware of both ways when reading someone else's code --- which is not hard, as both ways are quite easy to read.

The idea at this stage of Python development may be surprising, but I hope that nevertheless you will consider it seriously. There has been a lot of experience and development regarding this in connection with Matlab, and I am sure that many of you know better than me how it would fit into Python's philosophy, and what consequences adding it may have.

Thanks for your time, and best regards!
Géza

From tjreedy at udel.edu Sat Feb 12 05:05:20 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 11 Feb 2011 23:05:20 -0500
Subject: [Python-ideas] adding possibility for declaring a function in Matlab's way
In-Reply-To: <7C2FD842074A470387F4F83B280C22A3@GezaVAIO>
References: <7C2FD842074A470387F4F83B280C22A3@GezaVAIO>
Message-ID:

On 2/11/2011 10:48 PM, Géza wrote:
> It would be nice if you could write a function header like this
> (besides, of course, the current way):
>
> def result=functionname(params):
> ...
> result=something
>
> This would suffice for most functions, since you usually return one type
> of value, and it can be very convenient in certain cases. This is the
> way it is done e.g. in Matlab, which also has a huge user base.

Perhaps you should also suggest to the Matlab people that they add Python-style declarations to Matlab;-! After all, Python also has a huge user base.

> Some disadvantages:
> - I suggest it as an addition to the current way, so there isn't any
> serious disadvantage. One person may decide to use one way, one the other.

This is a huge disadvantage. Everyone would have to learn two equivalent syntaxes instead of one, which would make the language much more difficult to learn.

Python's syntax is essentially frozen except for possible minor additions that show some real gain.

--
Terry Jan Reedy

From bruce at leapyear.org Sat Feb 12 05:10:30 2011
From: bruce at leapyear.org (Bruce Leban)
Date: Fri, 11 Feb 2011 20:10:30 -0800
Subject: [Python-ideas] adding possibility for declaring a function in Matlab's way
In-Reply-To: <7C2FD842074A470387F4F83B280C22A3@GezaVAIO>
References: <7C2FD842074A470387F4F83B280C22A3@GezaVAIO>
Message-ID:

Before suggesting "improvements" to Python (or anything else for that matter), it's helpful to identify exactly what problem you are trying to solve. I don't see one. And having multiple entirely different ways to do things for no good reason means code is harder to read. Google TOOWTDI for more info.

If you love this paradigm I suggest you write it this way:

    def foo():
        global result
        result = None
        if bar() is not None:
            raise UnnecessaryException
        return result
    def bar():
        pass # real code goes here

--- Bruce
New Puzzazz newsletter: http://j.mp/puzzazz-news-2011-02

On Fri, Feb 11, 2011 at 7:48 PM, Géza wrote:
> It would be nice if you could write a function header like this (besides,
> of course, the current way):
>
> def result=functionname(params):
> ...
> result=something
>
> This would suffice for most functions, since you usually return one type of
> value, and it can be very convenient in certain cases. This is the way it is
> done e.g. in Matlab, which also has a huge user base.
>
> Some more details to the idea:
> - The return values are initialized to None.
> - Setting the return values does not need to be the last line in the
> function.
> - You can use the "return" statement just as before, but without arguments,
> to return from anywhere in the code.
> - If you specify arguments to the "return" statement, Python stops with an
> exception.
> - The return value can be a tuple: def (result1, result2,
> result3)=functionname(parameters)
>
> Some advantages:
> - You can easily see what the function returns, without having to read the
> function body, or hoping to find it in the comments.
> - You can initialize the return values (if None is not good enough), and
> then care about the cases where they change.
> - If you change the setup of the return value (e.g. insert a new item into
> the tuple), you do not need to change the "return" statement at possibly
> several places in the function body.
> - It is very easy to write the function call prototype: just copy the
> function declaration without the "def" and final colon. Python GUIs will be
> able to do the same, thus not only giving you the function parameter
> template automatically, but also the return value template.
>
> Some disadvantages:
> - I suggest it as an addition to the current way, so there isn't any
> serious disadvantage. One person may decide to use one way, one the other.
> - Of course, if you mix the two types of function declarations in your
> software, you may need to look at the function header to see which one you
> used in the specific case.
> - You need to be aware of both ways when reading someone else's code ---
> which is not hard, as both ways are quite easy to read.
>
> The idea at this stage of Python development may be surprising, but I hope
> that nevertheless you will consider it seriously.
> There has been a lot of experience and developlment regarding this in
> connection with Matlab,
> and I am sure that many of you know better than me how it would fit into
> Python's philosophy, and what consequences adding it may have.
>
> Thanks for your time, and best regard!
> Géza
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From stephen at xemacs.org Sat Feb 12 07:42:41 2011
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 12 Feb 2011 15:42:41 +0900
Subject: [Python-ideas] adding possibility for declaring a function in Matlab's way
In-Reply-To: <7C2FD842074A470387F4F83B280C22A3@GezaVAIO>
References: <7C2FD842074A470387F4F83B280C22A3@GezaVAIO>
Message-ID: <87ei7dkb32.fsf@uwakimon.sk.tsukuba.ac.jp>

Géza writes:

> Some advantages:
> - You can easily see what the function returns, without having to read the
> function body, or hoping to find it in the comments.

This is probably true in Matlab, and *if* the programmer gives it a good name. But many programmers will care more about saving (len variable-name) keystrokes than about giving good names, so you will literally see

    def result = foo (**args):
        # code goes here

So, is result a list? A function? An instance of some class? Maybe it's polymorphic.

-1 overall; the syntax we have already is readable enough.

Also, some of the things you might want to use this for probably can be done with decorators.

From stefan_ml at behnel.de Sat Feb 12 08:36:33 2011
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 12 Feb 2011 08:36:33 +0100
Subject: [Python-ideas] adding possibility for declaring a function in Matlab's way
In-Reply-To: <7C2FD842074A470387F4F83B280C22A3@GezaVAIO>
References: <7C2FD842074A470387F4F83B280C22A3@GezaVAIO>
Message-ID:

Géza, 12.02.2011 04:48:
> It would be nice if you could write a function header like this (besides,
> of course, the current way):
>
> def result=functionname(params):
> ...
> result=something

Maybe there should be a way to define a function that only returns a value based on an expression and that doesn't require a redundant 'return' statement. You know, like lambda.
*wink*

Stefan

From azrael.zila at gmail.com Sat Feb 12 12:46:01 2011
From: azrael.zila at gmail.com (Arthur)
Date: Sat, 12 Feb 2011 09:46:01 -0200
Subject: [Python-ideas] Python-ideas Digest, Vol 51, Issue 16
In-Reply-To:
References:
Message-ID:

You can return a tuple instead of changing the syntax...

    def res1, res2 = foo():
        #code here

can be

    def foo():
        #code here
        return (res1,res2)

See you!

Signed: Arthur Julião
------------------------------------------------------------------------------------------------
"May the road always come to you and the wind always be in your favor; may there always be a beer in your hand and your great love by your side."
(Tempo Ruim - A Arte do Insulto - Matanza)

From vinay_sajip at yahoo.co.uk Sat Feb 12 20:58:24 2011
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Sat, 12 Feb 2011 19:58:24 +0000 (UTC)
Subject: [Python-ideas] Alternative formatting styles for logging events in Python 3.3
References: <773057.21981.qm@web25805.mail.ukl.yahoo.com>
Message-ID:

Nick Coghlan writes:

> Given that the ability to pass in something other than a %-formatting
> format string isn't even *mentioned* in the API documentation for
> logging.debug [1][2] this trick could definitely use some additional
> exposure.

I'll update the documentation soon. Of course you're not passing a str.format string directly to the logging call, but a BraceMessage style object. This usage is documented, but I agree a link from the logger.debug() etc. calls would be an improvement.

> While this existing capability definitely makes the per-event part of
> my suggestion redundant, I think the per-logger part of it still has
> some merit. If loggers are defined as inheriting their formatting
> style from their parent loggers when an explicit style isn't provided,
> then an application or library that wants to use an alternate
> formatting style only needs to set it up once on their primary logger
> and away they go.
> (Creating the root logger with a formatting style
> other than '%' would obviously be a bad idea, so it may be worth
> issuing an explicit warning when someone does that)

An application or library can still easily use the BraceMessage/DollarMessage approach which I outlined in my reply to your first post, which got to you but was bounced from python-ideas. This does not introduce the complication of handling the root logger's style, or what to do if a using application or logger in the same part of the hierarchy wants to use a different style (perhaps this could happen with namespace packages).

Since my original reply got lost from python-ideas, I'll just reproduce it here:

> From: Nick Coghlan
> Via the new "style" parameter to logging.Formatter objects, Python 3.2
> adds support for newer formatting styles (str.format, string.template)
> when defining output formats for log messages. However, actual logging
> calls are still constrained to using %-formatting if they want to
> benefit from the "lazy formatting" feature (you can obviously generate
> pre-formatted messages any way you like).

Actually, that's not the case. For example, the following script:

    #!/usr/bin/env python
    import logging

    class BraceMessage(object):
        def __init__(self, fmt, *args, **kwargs):
            self._fmt = fmt
            self._args = args
            self._kwargs = kwargs

        def __str__(self):
            return self._fmt.format(*self._args, **self._kwargs)

    _ = BraceMessage

    def main():
        logger = logging.getLogger('fmttest')
        logger.debug(_('Message {verb} using {0}', 'braces', verb='formatted'))

    if __name__ == '__main__':
        root = logging.getLogger()
        root.addHandler(logging.StreamHandler())
        root.setLevel(logging.DEBUG)
        main()

works correctly today on Python 2.6, 2.7, 3.0, 3.1 and 3.2 to print

    Message formatted using braces

as you might expect. Furthermore, the actual formatting is deferred or "lazy" (exactly as for %-formatting). Note that you could have also used a DollarMessage class which uses string.Template for formatting.
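
A DollarMessage along those lines might look like the sketch below. It mirrors the BraceMessage class above but is not copied from the logutils package; the class body, the logger name "fmttest.dollar", and the in-memory stream are assumptions made for illustration.

```python
import io
import logging
from string import Template


class DollarMessage(object):
    """Defer string.Template substitution until the record is formatted.

    Illustrative sketch only; the logutils implementation may differ.
    """

    def __init__(self, fmt, **kwargs):
        self._fmt = fmt
        self._kwargs = kwargs

    def __str__(self):
        # Runs only when logging converts the message object to text.
        return Template(self._fmt).substitute(**self._kwargs)


stream = io.StringIO()
logger = logging.getLogger("fmttest.dollar")  # arbitrary example name
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep output confined to the example handler

logger.debug(DollarMessage("Message $verb using $style",
                           verb="formatted", style="dollars"))

output = stream.getvalue()
print(output, end="")
```

Because the logging call receives an object rather than a finished string, the substitution happens inside `__str__`, so a message that is filtered out by level never pays for it.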
Note that this allows flexible formatting at a per-call (rather than per-logger) level. The use of an alias for the class would make things more readable (I used _ just for convenience, and I'm of course aware of its usual aliasing to gettext, but a suitably brief alternative could be used instead). Note that DollarMessage/BraceMessage are described here http://plumberjack.blogspot.com/2010/10/supporting-alternative-formatting.html and already available in the logutils package, see http://code.google.com/p/logutils/ and http://packages.python.org/logutils/whatsnew.html Perhaps this style could be emphasised in the stdlib documentation. The XXXMessage classes could be brought into stdlib (not something I'm particularly advocating, mind you). Regards, Vinay Sajip From greg.ewing at canterbury.ac.nz Sat Feb 12 23:41:06 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 13 Feb 2011 11:41:06 +1300 Subject: [Python-ideas] adding possibility for declaring a function in Matlab's way In-Reply-To: <7C2FD842074A470387F4F83B280C22A3@GezaVAIO> References: <7C2FD842074A470387F4F83B280C22A3@GezaVAIO> Message-ID: <4D570C82.5090200@canterbury.ac.nz> G?za wrote: > def result=functionname(params): > ... > result=something This doesn't seem to be a great improvement over def functionname(params): ... result = something ... return result About the only one of your points it doesn't cover is returning from the middle of the function without an explicit value, which many people would regard as bad style anyway. > - It is very easy to write the function call prototype: just copy the > function declaration without the "def" and final colon. And delete any 'self' argument, default values, annotations, * and ** arguments, etc. etc. just as you have to do today... 
-- Greg From ncoghlan at gmail.com Sun Feb 13 02:30:53 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 13 Feb 2011 11:30:53 +1000 Subject: [Python-ideas] Alternative formatting styles for logging events in Python 3.3 In-Reply-To: References: <773057.21981.qm@web25805.mail.ukl.yahoo.com> Message-ID: On Sun, Feb 13, 2011 at 5:58 AM, Vinay Sajip wrote: > An application or library can still easily use the BraceMessage/DollarMessage > approach which I outlined in my reply to your first post, which got to you but > was bounced from python-ideas. This does not introduce the complication of > handling the root logger's style, or what to do if a using application or logger > in the same part of the hierarchy wants to use a different style (perhaps this > could happen with namespace packages). Very good points. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From kgeza7 at gmail.com Mon Feb 14 10:47:59 2011 From: kgeza7 at gmail.com (=?iso-8859-1?B?R+l6YQ==?=) Date: Mon, 14 Feb 2011 01:47:59 -0800 Subject: [Python-ideas] adding possibility for declaring a function in Matlab's way In-Reply-To: References: Message-ID: <6F0CAD58DCDF40CDBA300BFCF6A10713@GezaVAIO> All, Thank you for the constructive remarks, especially Bruce Leban for pointing out the TOOWTDI principle, which is why it wouldn't fit into the Python philosophy, and Stephen J. Turnbull for the idea of doing it with decorators. I could actually implement something similar to what I wanted with decorators, although it still needs improvement. All the best, G?za P.S. This is somewhat off-topic, but if someone has ideas on improving it, personal replies are welcome. Two things that are not really nice about it: one, you need to specify the variable names between quotes, two, you may get "unused variable" warnings for the return variables. See the code below. 
(I intentionally did not try to initialize the local variables to None, as I realized that this is the behavior of Matlab, too, which makes sense: You should be told if you forget to calculate the value of a return variable.)

    '''
    MtFn.py
    version 1.0
    Uses code from persistent_locals2 by Pietro Berkes and Andrea Maffezzoli.
    @author: Geza Kiss
    '''
    import sys

    class MtFn:
        def __init__(self, sRetVars):
            # Names of the local variables to hand back as return values.
            self.lRetVars = [s.strip() for s in sRetVars.split(',')]
            self._locals = {}

        def __call__(self, function):
            def Mt2PyFn(*args):
                def tracer(frame, event, arg):
                    # Keep the locals of the most recent Python-level return;
                    # on a normal exit that is the decorated function itself.
                    if event == 'return':
                        self._locals = frame.f_locals.copy()

                # sys.setprofile() returns None, so save the old hook first.
                prev_tracer = sys.getprofile()
                sys.setprofile(tracer)
                try:
                    res = function(*args)
                finally:
                    sys.setprofile(prev_tracer)
                # Test against None so that falsy values such as 0 are caught.
                if res is not None:
                    raise ValueError("You must not return a value in Matlab-like (@MtFn) function '%s'." % (function.__name__))
                try:
                    return tuple([self._locals[var] for var in self.lRetVars])
                except KeyError as exc:
                    raise KeyError("Return value '%s' is not set in Matlab-like (@MtFn) function '%s'." % (exc.args[0], function.__name__))
            return Mt2PyFn

----------------

    '''
    MtFnTest.py
    '''
    from MtFn import MtFn

    @MtFn('a,b')
    def fun(x, y):
        a = x + y
        b = x - y

    print(fun(5, 6))

From ronaldoussoren at mac.com Tue Feb 15 13:23:35 2011
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Tue, 15 Feb 2011 13:23:35 +0100
Subject: [Python-ideas] Cleaner separation of help() and interactive help.
In-Reply-To: <4D507391.202@canterbury.ac.nz>
References: <4D5060C0.3050802@canterbury.ac.nz> <4D506D5C.70207@pearwood.info> <4D507391.202@canterbury.ac.nz>
Message-ID:

On 7 Feb, 2011, at 23:34, Greg Ewing wrote:

> Steven D'Aprano wrote:
>
>> Do you have a reliable source for that claim about "most" people that is relevant to Python coders? We're not all using Microsoft VisualStudio :)
>
> I'm not talking about IDEs. I'm talking about things like the
> Terminal in MacOSX, the cmd window in Windows, and equivalent
> things in the Linux and X11 worlds.
It's very rare nowadays > for anyone to be using a command-line style interface in > anything that doesn't have scroll bars attached to it. Yes, but that doesn't mean I want help(something) to dump loads of text in my terminal window and require me to scoll back in the terminal to see the bit I want. > >> What you're describing *is* a pager. > > Yes, of course, but it's one better matched to the characteristics > of the environment I usually find myself working in nowadays. > Using a pager designed for glass ttys in a Terminal window is > actually *worse* in many ways than just dumping the text out > with no pager at all. I don't agree. I'm quite happy with help in its current form. Ronald From ncoghlan at gmail.com Tue Feb 15 16:18:53 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 16 Feb 2011 01:18:53 +1000 Subject: [Python-ideas] Cleaner separation of help() and interactive help. In-Reply-To: References: <4D5060C0.3050802@canterbury.ac.nz> <4D506D5C.70207@pearwood.info> <4D507391.202@canterbury.ac.nz> Message-ID: On Tue, Feb 15, 2011 at 10:23 PM, Ronald Oussoren wrote: > > I don't agree. I'm quite happy with help in its current form. Indeed, if the default pager isn't suitable for the default terminal program, that sounds like something to bring up with the OS developer. (I have no problems with the default setup in Kubuntu) Separating out the string generation from the display process for 3.3. may be a reasonable idea, though (similar to the separation of dis.code_info() and dis.show_code() in 3.2) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From rrr at ronadam.com Wed Feb 16 02:08:42 2011 From: rrr at ronadam.com (Ron Adam) Date: Tue, 15 Feb 2011 19:08:42 -0600 Subject: [Python-ideas] Cleaner separation of help() and interactive help. 
In-Reply-To: References: <4D5060C0.3050802@canterbury.ac.nz> <4D506D5C.70207@pearwood.info> <4D507391.202@canterbury.ac.nz> Message-ID: <4D5B239A.1080009@ronadam.com> On 02/15/2011 09:18 AM, Nick Coghlan wrote: > On Tue, Feb 15, 2011 at 10:23 PM, Ronald Oussoren > wrote: >> >> I don't agree. I'm quite happy with help in its current form. > > Indeed, if the default pager isn't suitable for the default terminal > program, that sounds like something to bring up with the OS developer. > (I have no problems with the default setup in Kubuntu) > > Separating out the string generation from the display process for 3.3. > may be a reasonable idea, though (similar to the separation of > dis.code_info() and dis.show_code() in 3.2) That would be a good first step to any further improvements as well. I was playing around with this today, and noticed the pager isn't used for everything, just for individual object results no matter how short or long. It's not used with any of the following commands: 'help', 'keywords', 'topics', 'symbols', or 'modules'. And if you give 'modules' a key, then the synopsis function uses print(). For example, help('modules test') results in a very long list that uses print() to output each line. Some parts of pydoc are designed so you can replace stdin and stdout, but others parts still don't seem to work that way. Does anyone actually use that feature? Cheers, Ron From rrr at ronadam.com Sat Feb 19 17:36:06 2011 From: rrr at ronadam.com (Ron Adam) Date: Sat, 19 Feb 2011 10:36:06 -0600 Subject: [Python-ideas] Cleaner separation of help() and interactive help. In-Reply-To: References: <4D5060C0.3050802@canterbury.ac.nz> <4D506D5C.70207@pearwood.info> <4D507391.202@canterbury.ac.nz> Message-ID: <4D5FF176.9080300@ronadam.com> On 02/15/2011 09:18 AM, Nick Coghlan wrote: > Separating out the string generation from the display process for 3.3. 
> may be a reasonable idea, though (similar to the separation of > dis.code_info() and dis.show_code() in 3.2) The frustrating part is trying to do this in a way that is acceptable. Actually doing it is not that difficult. I've been working on it by chipping away from the outside, but it's going to be a very slow process if we need to depreciate each existing API, and then add alternative new API's as we go. Here is a patch on rietveld which includes the items listed below. http://codereview.appspot.com/4173064 * Remove old web server stuff depreciated in 3.2. (Will be done in 3.3) * Separate the topics and topic retrieval parts into a single HelpData class with a single method to get the topic text and xrefs list. We can make this better by making the three dictionaries (keywords, topics, and symbols) use the same value formats. Currently there are slightly different ways they each store their value. It may be possible to have the xrefs auto generated and stored along with the topic text. * Rewrote the helper class by using the cmd.Cmd class. This works out nicely. After doing this, and a few other things to make it all work, there is only a single call to the pager in the Helper.default() method. It's behavior is not changed in any way. In fact, it could very easily be moved out of pydoc and refactored at this point. But I don't think we can do all of this at one time for backward compatibility reasons.(?) So currently I'm looking for guidance on what and how I can best go ahead with some parts of this. Cheers, Ron (Away for most of today. will respond to comments this evening.) From masklinn at masklinn.net Sat Feb 19 18:12:04 2011 From: masklinn at masklinn.net (Masklinn) Date: Sat, 19 Feb 2011 18:12:04 +0100 Subject: [Python-ideas] wsgiref.simple_server should mount and serve a provided WSGI application script Message-ID: Many (most?) 
WSGI servers use WSGI application scripts (generally a Python script with a `.py` or `.wsgi` extension, providing a global `application` variable) to set up and mount applications.

Currently, `python -m wsgiref.simple_server` mounts a trivial "hello world" application and opens it in the web browser. It would be nice if `python -m wsgiref.simple_server file` mounted and served the application set up by the script instead.

From masklinn at masklinn.net Sat Feb 19 18:57:13 2011
From: masklinn at masklinn.net (Masklinn)
Date: Sat, 19 Feb 2011 18:57:13 +0100
Subject: [Python-ideas] Cleanup, uniformization and documentation of stdlib module reactions to `python -m `
Message-ID: <10B8F4C0-D80E-4E84-B54B-1F92497C64DB@masklinn.net>

There are three ways a Python module can react to `-m`: do nothing, selftest (via doctests or via python code) and do something "useful" (for Python users). The first is fine, but the latter two have more issues:

For modules which selftest, the self-tests/sanity checks are generally undocumented and potentially long-running or destructive. It would be nice if those were protected behind an argument (e.g. `-x` for `--execute`), just in case.

For "useful" modules, the issues are far more complex:

1. Many (if not most) are under- or un-documented, with no information available in their module documentation and/or via the command line (ftplib, modulefinder, dis); this makes them into little more than easter eggs. Some are documented in the standard library documentation but not on the command line (http.server), others are documented on the command line but not in their module documentation (smtpd).
It would be nice if all of these tools were documented in both places, but the bare minimum would be for all of them to provide CLI help (one which explains at the very least what the command does; here again, some of the commands providing a "cli help" give little more information than a list of options: no information on the command itself, and no documentation of what the various options do).

2. Argument parsing is inconsistent: I've seen all of manual parsing (ftplib, http.server, pstats), getopt (smtpd, timeit, trace, webbrowser), optparse (calendar, cProfile, uu) and argparse (compileall, nntplib), and the reaction to standard flags (e.g. `-h`/`--help`) is a crapshoot: modules may or may not crash completely or display parse errors (when provided with a `-h` flag). All modules should use the same argument parsing library (argparse), which would automatically yield a correct reaction to `-h` across the board. Even modules not using options themselves (e.g. http.server) should do so, so they can react correctly to `-h`.

There are many hidden gems in the stdlib, and CLI usage of modules is one of them: some are pretty well-known (timeit, http.server), others are less well known (urllib, cProfile), but I'd wager the vast majority are almost never used because the vast majority of Python developers have no idea they can be used that way.

From ncoghlan at gmail.com Sun Feb 20 01:35:17 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 20 Feb 2011 10:35:17 +1000
Subject: [Python-ideas] Cleaner separation of help() and interactive help.
In-Reply-To: <4D5FF176.9080300@ronadam.com>
References: <4D5060C0.3050802@canterbury.ac.nz> <4D506D5C.70207@pearwood.info> <4D507391.202@canterbury.ac.nz> <4D5FF176.9080300@ronadam.com>
Message-ID:

On Sun, Feb 20, 2011 at 2:36 AM, Ron Adam wrote:
> But I don't think we can do all of this at one time for backward
> compatibility reasons.(?)
?So currently I'm looking for guidance on what and > how I can best go ahead with some parts of this. I believe the approach we ended up using for the HTML parts in 3.2 (i.e. leave the "old" way around, but deprecated, for anyone using the undocumented-but-public APIs, while adding new APIs for the new, better way) should work in this case as well. It does delay the removal of the old code until 3.4, but it's the most conservative way to give people a chance to fix anything that breaks (the downside of course being that anyone using the APIs has to do some version specific tap-dancing to select which API to use). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From rrr at ronadam.com Mon Feb 21 19:56:21 2011 From: rrr at ronadam.com (Ron Adam) Date: Mon, 21 Feb 2011 12:56:21 -0600 Subject: [Python-ideas] Depreciation / Replacements -was- Re: Cleaner separation of help() and interactive help. In-Reply-To: References: <4D5060C0.3050802@canterbury.ac.nz> <4D506D5C.70207@pearwood.info> <4D507391.202@canterbury.ac.nz> <4D5FF176.9080300@ronadam.com> Message-ID: <4D62B555.60209@ronadam.com> On 02/19/2011 06:35 PM, Nick Coghlan wrote: > On Sun, Feb 20, 2011 at 2:36 AM, Ron Adam wrote: >> But I don't think we can do all of this at one time for backward >> compatibility reasons.(?) So currently I'm looking for guidance on what and >> how I can best go ahead with some parts of this. > > I believe the approach we ended up using for the HTML parts in 3.2 > (i.e. leave the "old" way around, but deprecated, for anyone using the > undocumented-but-public APIs, while adding new APIs for the new, > better way) should work in this case as well. It does delay the > removal of the old code until 3.4, but it's the most conservative way > to give people a chance to fix anything that breaks (the downside of > course being that anyone using the APIs has to do some version > specific tap-dancing to select which API to use). 
The server replacement wasn't too bad as the old server and tkinter gui panel were not written so they could be used independently of pydoc. The new server and supporting code was either made private or put inside a function body so they could changed easily and/or be moved out of pydoc and be made available for other modules to be used. So what we did made sense in the context of what was done. As we go further with updating pydoc, it's going to have a fair amount of re-factoring. If we follow the same pattern as the server upgrade, much of the existing pydoc API will be renamed and/or removed. I'm not sure that makes as much sense. At what point is it better to depreciate the whole module and create a new replacement module? And conversely, are there limits on how much, and how fast, a module can be changed? Pydoc fortunately is almost exclusively used "as is" rather than as an extension module. So it is less likely that re-factoring it will break other peoples programs. (possible though) With other modules, we'd probably freeze the current module and create a new one with a different name, if it was going to require major re-factoring. In those cases, how long do the old modules versions stick around? Cheers, Ron From tjreedy at udel.edu Wed Feb 23 02:27:41 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Tue, 22 Feb 2011 20:27:41 -0500 Subject: [Python-ideas] Depreciation / Replacements -was- Re: Cleaner separation of help() and interactive help. In-Reply-To: <4D62B555.60209@ronadam.com> References: <4D5060C0.3050802@canterbury.ac.nz> <4D506D5C.70207@pearwood.info> <4D507391.202@canterbury.ac.nz> <4D5FF176.9080300@ronadam.com> <4D62B555.60209@ronadam.com> Message-ID: On 2/21/2011 1:56 PM, Ron Adam wrote: > As we go further with updating pydoc, it's going to have a fair amount > of re-factoring. If we follow the same pattern as the server upgrade, > much of the existing pydoc API will be renamed and/or removed. I'm not > sure that makes as much sense. 
> > At what point is it better to depreciate the whole module and create a > new replacement module? > > And conversely, are there limits on how much, and how fast, a module can > be changed? > > > Pydoc fortunately is almost exclusively used "as is" rather than as an > extension module. So it is less likely that re-factoring it will break > other peoples programs. (possible though) As I said somewhere on the tracker, the only (intended) public apis of the pydoc module are the help function and the command-line interface. I consider the rest private and subject to change. -- Terry Jan Reedy From rrr at ronadam.com Thu Feb 24 05:44:41 2011 From: rrr at ronadam.com (Ron Adam) Date: Wed, 23 Feb 2011 22:44:41 -0600 Subject: [Python-ideas] Depreciation / Replacements -was- Re: Cleaner separation of help() and interactive help. In-Reply-To: References: <4D5060C0.3050802@canterbury.ac.nz> <4D506D5C.70207@pearwood.info> <4D507391.202@canterbury.ac.nz> <4D5FF176.9080300@ronadam.com> <4D62B555.60209@ronadam.com> Message-ID: <4D65E239.9020805@ronadam.com> On 02/22/2011 07:27 PM, Terry Reedy wrote: > On 2/21/2011 1:56 PM, Ron Adam wrote: > >> As we go further with updating pydoc, it's going to have a fair amount >> of re-factoring. If we follow the same pattern as the server upgrade, >> much of the existing pydoc API will be renamed and/or removed. I'm not >> sure that makes as much sense. >> >> At what point is it better to depreciate the whole module and create a >> new replacement module? >> >> And conversely, are there limits on how much, and how fast, a module can >> be changed? >> >> >> Pydoc fortunately is almost exclusively used "as is" rather than as an >> extension module. So it is less likely that re-factoring it will break >> other peoples programs. (possible though) > > As I said somewhere on the tracker, the only (intended) public apis of the > pydoc module are the help function and the command-line interface. I > consider the rest private and subject to change. 
I agree, but I've gotten different opinions on that. It certainly would make it easier to re-factor if we can get a consensus on just what the public API is. What do you think about extending help() so it is the only pydoc API? In other words, everything is done through the help function either by directly calling help(something) or by passing args from the command line to help, ... help(args.) This already works, but not for everything. Cheers, Ron From ncoghlan at gmail.com Thu Feb 24 13:23:49 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 24 Feb 2011 22:23:49 +1000 Subject: [Python-ideas] A couple of with statement ideas Message-ID: I've been considering which broad concerns I want to focus on for 3.3, and have come to the conclusion that I'm going to have plenty to keep me busy between helping to smooth over some of the rough edges left over from the PEP 3118 implementation, as well as championing the module aliasing concept I first brought up a few weeks ago. That means there are a couple of with statement ideas that I still like, but almost certainly won't have the time to champion myself. So, I'm lobbing them over the fence in a half-baked form. If anyone feels strongly enough about them to get them into a PEP-worthy state, go right ahead (that's the whole point of this message). If nobody else cares enough to pick them up... well, that will simply justify my decision that they weren't important enough to worry about. Idea 1: Implicit context managers PEP 343 included the idea of a "__context__" method that permitted the creation of objects with an implicit associated context manager. It was eventually dropped because we couldn't find a good way to explain it to users at the time. Instead, you see things like "decimal.localcontext()" and various other objects that need to be called every time you want to use them as a context manager. 
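As a small illustration of that "call it every time" pattern, here is a minimal sketch using decimal.localcontext(). The helper name `one_third` is made up for this example.

```python
import decimal


def one_third(precision):
    """Compute 1/3 at a temporary precision."""
    # A fresh manager is requested by *calling* localcontext() for each
    # with statement, which is the pattern described above.
    with decimal.localcontext() as ctx:
        ctx.prec = precision
        return decimal.Decimal(1) / decimal.Decimal(3)
```

The precision set on `ctx` applies only inside the block; the thread's context is restored once the with statement exits.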
If you want an object to have a "native" context manager, you have to either write __enter__ and __exit__ manually, or use a mixin or class decorator that adapts the __enter__/__exit__ interface to a single method (e.g. the ContextManager class and manage_context method from GarlicSim that Ram Rachum posted about here some time ago). Now that everyone has had time to get used to the way context managers work, I believe the concept may usefully make a return using the "implicit context manager" terminology. While the "__context__" name has since been claimed by PEP 3134, the alternative name used in PEP 346 ("__with__") is still a possibility. The idea of bringing back this concept would be to allow an object to implement *either* a __with__ method that returns a context manager, or else implement __enter__/__exit__ directly. The behaviour would be similar to the relationship between iterators and iterables, except that a missing "__with__" implementation would imply the "return self" semantics that iterators must implement explicitly. Anyone picking up this idea should be prepared for a lot of pushback, as the last few years have shown us that the with statement is perfectly usable without this feature, and the GarlicSim example shows that this is already feasible using existing mechanisms (even I am at best lukewarm on the concept). Idea 2: Son of PEP 377 (letting context managers skip the body of the with statement) The niggles that lead me to write PEP 377 still bug me. There are some nested with statements that are perfectly valid as inline code but will throw RuntimeError if you make them into a single context manager via contextlib.contextmanager. That PEP was rightly rejected as having too great an impact on "normal" code for something that is a comparatively exotic corner case. 
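To make that corner case concrete, here is a minimal sketch. `Suppress`, `FailOnEnter`, `combined`, `run_inline` and `run_wrapped` are illustrative names invented for this example; only contextlib.contextmanager is a real API.

```python
from contextlib import contextmanager


class Suppress(object):
    """Outer manager that swallows any exception raised inside it."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return True  # suppress whatever was raised


class FailOnEnter(object):
    """Inner manager whose __enter__ always fails."""

    def __enter__(self):
        raise ValueError("cannot enter")

    def __exit__(self, exc_type, exc, tb):
        return False


def run_inline():
    """Inline nesting: the body is skipped and no exception escapes."""
    body_ran = False
    with Suppress():
        with FailOnEnter():
            body_ran = True
    return body_ran


@contextmanager
def combined():
    # Same nesting, wrapped in a generator. The ValueError is swallowed
    # by Suppress, so the generator finishes without ever yielding.
    with Suppress():
        with FailOnEnter():
            yield


def run_wrapped():
    """Generator-based equivalent: contextlib raises RuntimeError."""
    try:
        with combined():
            return "body ran"
    except RuntimeError as exc:
        return "RuntimeError: %s" % exc
```

Under these assumptions `run_inline()` quietly skips the body, whereas `run_wrapped()` reports a RuntimeError, because a generator-based manager has no way to tell the with statement to skip its body.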
As I noted in issue 5251, I've since thought of an alternative, lower impact solution that may prove more acceptable to Guido and others: an optional "__entered__" method for context managers that, if present, would be executed *inside* the scope of the try/finally block before the body of the with statement itself started executing. Any exceptions raised in that method would be passed to the __exit__ method, thus increasing the flexibility of the context management protocol, while having minimal impact Most significantly, contextlib.GeneratorContextManager could be adjusted to make use of this feature to correctly skip the body of the with statement when the internal generator doesn't yield (attempt to invoke __enter__ a second time would still trigger RuntimeError, though). The translation of the problematic nested with statements into a single generator based context manager would then work correctly, functioning in the same way as the equivalent inline code. I really like this idea, since it provides a genuinely new capability to the context management protocol in a relatively low impact manner. However, as noted in the intro, there are other, more immediately practical, questions I want to deal with first, so who knows if or when I'll be able to devote a significant amount of time to this one. So, if anyone is feeling particularly keen to champion a PEP and dig into the internals of contextlib and CPython's with statement implementation, there's a couple of ideas for you to mull over :) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From steve at pearwood.info Thu Feb 24 22:59:05 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 25 Feb 2011 08:59:05 +1100 Subject: [Python-ideas] A couple of with statement ideas In-Reply-To: References: Message-ID: <4D66D4A9.3010509@pearwood.info> Nick Coghlan wrote: > Now that everyone has had time to get used to the way context managers > work, "Everyone"? 
On comp.lang.python and the python tutor mailing list, I don't believe I've seen any questions about the use of or writing of context managers, which implies either that the feature is so intuitive and simple that there's no questions to ask, or that they aren't (yet) widely used. I'd like to comment on the rest of your post, but I don't have a clue what you're talking about *wink* -- Steven From ncoghlan at gmail.com Thu Feb 24 23:14:08 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 25 Feb 2011 08:14:08 +1000 Subject: [Python-ideas] A couple of with statement ideas In-Reply-To: <4D66D4A9.3010509@pearwood.info> References: <4D66D4A9.3010509@pearwood.info> Message-ID: On Fri, Feb 25, 2011 at 7:59 AM, Steven D'Aprano wrote: > Nick Coghlan wrote: > >> Now that everyone has had time to get used to the way context managers >> work, > > "Everyone"? It's a true statement, given a suitably small definition of "everyone" :) It's at least a much larger set than it was back when AMK noticed the deep terminology confusion in the first version of the with statement and context management documentation (which was when Guido applied the Zen and dropped the __context__ method from the protocol). Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From greg.ewing at canterbury.ac.nz Fri Feb 25 21:55:50 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 26 Feb 2011 09:55:50 +1300 Subject: [Python-ideas] A couple of with statement ideas References: <4D66D4A9.3010509@pearwood.info> Message-ID: From: Nick Coghlan > It's at least a much larger set than it was back when AMK noticed the > deep terminology confusion in the first version of the with statement > and context management documentation (which was when Guido applied the > Zen and dropped the __context__ method from the protocol). I'm in favour of the idea, but the terminology problem still needs to be solved. 
I think it's important that the name of the object implementing this protocol not have the word "context" in it *anywhere*. I like __with__ as the special method name, as it very obviously suggests a tight connection with the with-statement. The only term I can think of right now for the object is "withable object". It's a severe abuse of the English language, I know, but unfortunately there doesn't seem to be a concise verb meaning "enter a temporary execution context". -- Greg This email may be confidential and subject to legal privilege, it may not reflect the views of the University of Canterbury, and it is not guaranteed to be virus free. If you are not an intended recipient, please notify the sender immediately and erase all copies of the message and any attachments. Please refer to http://www.canterbury.ac.nz/emaildisclaimer for more information. From bruce at leapyear.org Fri Feb 25 22:14:40 2011 From: bruce at leapyear.org (Bruce Leban) Date: Fri, 25 Feb 2011 13:14:40 -0800 Subject: [Python-ideas] A couple of with statement ideas In-Reply-To: References: <4D66D4A9.3010509@pearwood.info> Message-ID: On Fri, Feb 25, 2011 at 12:55 PM, Greg Ewing wrote: > From: Nick Coghlan > > It's at least a much larger set than it was back when AMK noticed the > > deep terminology confusion in the first version of the with statement > > and context management documentation (which was when Guido applied the > > Zen and dropped the __context__ method from the protocol). > > I'm in favour of the idea, but the terminology problem still > needs to be solved. I think it's important that the name of the > object implementing this protocol not have the word "context" in > it *anywhere*. > > I like __with__ as the special method name, as it very obviously > suggests a tight connection with the with-statement. > If the field returns a context manager, then the natural name to my mind would be __context_manager__. 
What I don't like about __with__ is that it's not a noun and doesn't tell me what value the attribute has or what I would do with it. Why do you think "it's important that the name ... not have the word "context" in it *anywhere*"? --- Bruce New Puzzazz newsletter: http://j.mp/puzzazz-news-2011-02 Make your web app more secure: http://j.mp/gruyere-security > -------------- next part -------------- An HTML attachment was scrubbed... URL: From raymond.hettinger at gmail.com Fri Feb 25 22:45:31 2011 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Fri, 25 Feb 2011 13:45:31 -0800 Subject: [Python-ideas] A couple of with statement ideas In-Reply-To: References: <4D66D4A9.3010509@pearwood.info> Message-ID: <7B327FCF-FC14-4F55-BD23-AC80BC18EB54@gmail.com> > > I like __with__ as the special method name, as it very obviously > suggests a tight connection with the with-statement. > +1 I really like the tight association with the with-statement. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From cs at zip.com.au Fri Feb 25 22:49:25 2011 From: cs at zip.com.au (Cameron Simpson) Date: Sat, 26 Feb 2011 08:49:25 +1100 Subject: [Python-ideas] A couple of with statement ideas In-Reply-To: References: Message-ID: <20110225214925.GA15019@cskk.homeip.net> On 25Feb2011 13:14, Bruce Leban wrote: | On Fri, Feb 25, 2011 at 12:55 PM, Greg Ewing | wrote: | > From: Nick Coghlan | > > It's at least a much larger set than it was back when AMK noticed the | > > deep terminology confusion in the first version of the with statement | > > and context management documentation (which was when Guido applied the | > > Zen and dropped the __context__ method from the protocol). | > | > I'm in favour of the idea, but the terminology problem still | > needs to be solved. I think it's important that the name of the | > object implementing this protocol not have the word "context" in | > it *anywhere*. 
| >
| > I like __with__ as the special method name, as it very obviously
| > suggests a tight connection with the with-statement.
|
| If the field returns a context manager, then the natural name to my mind
| would be __context_manager__.

It's very long... but accurate.

| What I don't like about __with__ is that it's not a noun and doesn't tell me
| what value the attribute has or what I would do with it.

"enter" and "exit" aren't nouns either. I guess they are events though,
whereas __with__ is supposed to return something.

Grammar aside I like __with__, personally, since __context__ seems to be
out.

Cheers,
--
Cameron Simpson DoD#743
http://www.cskk.ezoshosting.com/cs/

BROCCOLI!! THE ONLY VEGETABLE THAT SOUNDS LIKE AN ADVERB!!
        - ken at aiai.ed.ac.uk (Ken Johnson)

From greg.ewing at canterbury.ac.nz Fri Feb 25 23:46:07 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 26 Feb 2011 11:46:07 +1300
Subject: [Python-ideas] A couple of with statement ideas
References: <4D66D4A9.3010509@pearwood.info>
Message-ID:

From: Bruce Leban [mailto:bruce at leapyear.org]

> Why do you think "it's important that the name ... not have the word
> "context" in it *anywhere*"?

Maybe it's not that important. I'm just trying to avoid any
chance of lingering confusion.

--
Greg
From ncoghlan at gmail.com Sat Feb 26 01:46:24 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 26 Feb 2011 10:46:24 +1000
Subject: [Python-ideas] A couple of with statement ideas
In-Reply-To:
References: <4D66D4A9.3010509@pearwood.info>
Message-ID:

On Sat, Feb 26, 2011 at 8:46 AM, Greg Ewing wrote:
> From: Bruce Leban [mailto:bruce at leapyear.org]
>
>> Why do you think "it's important that the name ... not have the word
>> "context" in it *anywhere*"?
>
> Maybe it's not that important. I'm just trying to avoid any
> chance of lingering confusion.

The best I have at the moment is "objects with an implicit context
manager", but that's still a huge improvement over where we were when
PEP 343 was implemented. Back then, we weren't sure whether "context
manager" referred to the objects with __context__ or the objects with
__enter__/__exit__.

Since Guido decided to drop the first option completely, the question
is now solidly resolved in favour of the latter, so it opens up the
possibility of adding back a __with__, __cm__, __context_manager__ or
__manager__ method using a new term or phrase for the objects that
implement it.

The iterator/iterable precedent suggests manager->manageable as a
possibility, but "manageable objects" isn't easy to write *or* to say.
"Managed objects" could work, though (despite being slightly less
technically correct).

It sounds like there's enough interest in that idea that it's worth
pursuing - it still needs someone to write/champion the PEP though.

No comments on the PEP 377 variant though, which is a little
disappointing. I see the fact that, depending on the details of cmA()
and cmB() this code:

  @contextmanager
  def cmAB():
      with cmA(), cmB():
          yield

  with cmAB():
      # Do stuff

may throw RuntimeError, while the inline equivalent works just fine:

  with cmA(), cmB():
      # Do stuff

as a bit of a design flaw in the underlying structure of the context
management protocol.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |
Brisbane, Australia

From jsbueno at python.org.br Sat Feb 26 01:58:40 2011
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Fri, 25 Feb 2011 21:58:40 -0300
Subject: [Python-ideas] [Python-Dev] Let's get PEP 380 into Python 3.3
In-Reply-To:
References:
Message-ID:

On Fri, Feb 25, 2011 at 8:01 PM, Greg Ewing wrote:
> From: Guido van Rossum
>
>> (OTOH I am not much enamored with cofunctions, PEP 3152.)
>
> That's okay, I don't like it much myself in its current form.
> I plan to revisit it at some point, but there's no hurry.

I've just gone through PEP 3152 - and the first thought that occurred
to me is that a decorator is perfectly usable instead of the newly
proposed keyword "codef". (It would need to be able to set special
attributes in the function to indicate its nature.)

Besides not adding a new keyword, it would allow for different
(concurrently running?) types of co-functions to be created and
benefit from the other mechanisms.

But maybe considerations about this should take place on python-ideas
only?

> --
> Greg
>

From mono9lith at gmail.com Sat Feb 26 05:12:14 2011
From: mono9lith at gmail.com (Alexander)
Date: Sat, 26 Feb 2011 12:12:14 +0800
Subject: [Python-ideas] Support for regular expression syntax
Message-ID: <4D687D9E.7090201@gmail.com>

Hello. Maybe this topic has been discussed. I would like to have the
support of alphabetic rules, such as \p{...}, in re module.

From steve at pearwood.info Sat Feb 26 05:58:11 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 26 Feb 2011 15:58:11 +1100
Subject: [Python-ideas] Support for regular expression syntax
In-Reply-To: <4D687D9E.7090201@gmail.com>
References: <4D687D9E.7090201@gmail.com>
Message-ID: <4D688863.7050505@pearwood.info>

Alexander wrote:
> Hello. Maybe this topic has been discussed. I would like to have the
> support of alphabetic rules, such as \p{...}, in re module.
Perhaps you could start the discussion by telling us what alphabetic rules
such as \p{...} do, and which regex engines support them?

--
Steven

From mrts.pydev at gmail.com Sat Feb 26 15:03:49 2011
From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=)
Date: Sat, 26 Feb 2011 16:03:49 +0200
Subject: [Python-ideas] str.split() oddness
Message-ID:

IMHO, x.join(a).split(x) should be "idempotent"
in regard to a.

>>> foo = ['a', 'b', 'c']
>>> assert '|'.join(foo).split('|') == foo
>>> foo = ['a']
>>> assert '|'.join(foo).split('|') == foo
>>> foo = []
>>> assert ' '.join(foo).split() == foo

And now the odd exception to the rule:

>>> assert '|'.join(foo).split('|') == foo
Traceback (most recent call last):
  File "", line 1, in
AssertionError

That forces one to write special case code when using
custom separators. Consider:

# clean
baz = dict(chunk.split('=') for chunk in baz.split())
# ugly
baz = (dict(chunk.split('=') for chunk in baz.split("|")) if baz else {})

Our younger cousin Ruby has no such idiosyncrasies:

>> foo = []
>> foo.join('|').split('|') == foo
=> true

What is the reason for that oddity? Can we amend it?

Best regards,
Mart Sõmermaa

From ncoghlan at gmail.com Sat Feb 26 15:19:40 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 27 Feb 2011 00:19:40 +1000
Subject: [Python-ideas] Support for regular expression syntax
In-Reply-To: <4D688863.7050505@pearwood.info>
References: <4D687D9E.7090201@gmail.com> <4D688863.7050505@pearwood.info>
Message-ID:

On Sat, Feb 26, 2011 at 2:58 PM, Steven D'Aprano wrote:
> Alexander wrote:
>>
>> Hello. Maybe this topic has been discussed. I would like to have the
>> support of alphabetic rules, such as \p{...}, in re module.
>
>
> Perhaps you could start the discussion by telling us what alphabetic rules
> such as \p{...} do, and which regex engines support them?
I found it on the last bullet point here:
https://secure.wikimedia.org/wikipedia/en/wiki/Regular_expression#Unicode

Apparently Perl and Java have constructs that allow querying the
Unicode character database inline within a regex.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From jsbueno at python.org.br Sat Feb 26 16:44:27 2011
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Sat, 26 Feb 2011 12:44:27 -0300
Subject: [Python-ideas] str.split() oddness
In-Reply-To:
References:
Message-ID:

On Sat, Feb 26, 2011 at 11:03 AM, Mart Sõmermaa wrote:
> IMHO, x.join(a).split(x) should be "idempotent"
> in regard to a.
>
>>>> foo = ['a', 'b', 'c']
>>>> assert '|'.join(foo).split('|') == foo
>>>> foo = ['a']
>>>> assert '|'.join(foo).split('|') == foo
>>>> foo = []
>>>> assert ' '.join(foo).split() == foo
>
> And now the odd exception to the rule:
>
>>>> assert '|'.join(foo).split('|') == foo
> Traceback (most recent call last):
>   File "", line 1, in
> AssertionError
>
> That forces one to write special case code when using
> custom separators. Consider:
>
> # clean
> baz = dict(chunk.split('=') for chunk in baz.split())
> # ugly
> baz = (dict(chunk.split('=') for chunk in baz.split("|")) if baz else {})
>
> Our younger cousin Ruby has no such idiosyncrasies:

It is no idiosyncrasy -- split returns what it should return - a list
with an empty string:

>>> ''.split("|")
['']

and it would break a lot of code if it didn't. Filtering out lists
with an empty string does not seem a big issue compared to the
inconsistencies that would arise from any different behavior for
split.

Any list of strings does the roundtrip with a join->split sequence.
Lists of any other elements, or empty lists, don't.

  js
 -><-

>
>>> foo = []
>>> foo.join('|').split('|') == foo
> => true
>
> What is the reason for that oddity? Can we amend it?
> > Best regards, > Mart S?mermaa > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From python at mrabarnett.plus.com Sat Feb 26 19:08:57 2011 From: python at mrabarnett.plus.com (MRAB) Date: Sat, 26 Feb 2011 18:08:57 +0000 Subject: [Python-ideas] Support for regular expression syntax In-Reply-To: References: <4D687D9E.7090201@gmail.com> <4D688863.7050505@pearwood.info> Message-ID: <4D6941B9.4080907@mrabarnett.plus.com> On 26/02/2011 14:19, Nick Coghlan wrote: > On Sat, Feb 26, 2011 at 2:58 PM, Steven D'Aprano wrote: >> Alexander wrote: >>> >>> Hello. Maybe this topic has been discussed. I would like to have the >>> support of alphabetic rules, such as \p{...}, in re module. >> >> >> Perhaps you could start the discussion by telling us what alphabetic rules >> such as \p{...} do, and which regex engines support them? > > I found it on the last bullet point here: > https://secure.wikimedia.org/wikipedia/en/wiki/Regular_expression#Unicode > > Apparently Perl and Java have constructs that allowing querying the > Unicode character database inline within a regex. > They are supported in the new regex implementation, available at PyPI: http://pypi.python.org/pypi/regex From tjreedy at udel.edu Sat Feb 26 19:52:19 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 26 Feb 2011 13:52:19 -0500 Subject: [Python-ideas] str.split() oddness In-Reply-To: References: Message-ID: On 2/26/2011 9:03 AM, Mart S?mermaa wrote: > IMHO, x.join(a).split(x) should be "idempotent" > in regard to a. Given that x.join is *not* 1 to 1, >>> 'a'.join([]) '' >>> 'a'.join(['']) '' it cannot have an inverse for all outputs. In particular, ''.split('a') cannot be both [] and ['']. This could only be fixed by changing the definition of join to not allow joining on [], but that would not be convenient. I believe joining is otherwise 1 to 1 and invertible for non-empty lists. 
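The point about join not being 1 to 1 can be checked directly. The following is only a small sketch of that observation; the helper name round_trips is made up for illustration and is not part of Python.

```python
def round_trips(parts, sep):
    """Return True if sep.join(parts).split(sep) gives back parts.

    round_trips is a hypothetical helper used only for this illustration.
    """
    return sep.join(parts).split(sep) == list(parts)


# join maps two different lists to the same string, so it is not 1 to 1.
assert "a".join([]) == ""
assert "a".join([""]) == ""

# split can therefore return only one of them, and it returns [''].
assert "".split("a") == [""]

# Non-empty lists whose items do not contain the separator survive the trip.
print(round_trips(["a", "b", "c"], "|"))  # True
print(round_trips([""], "|"))             # True
print(round_trips([], "|"))               # False
```

Only the empty list fails here, which matches the claim that the round trip holds for non-empty lists.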
Of course, join input a can be any iterable of strings, whereas split produces a list, so your equality test can only work for list inputs unless generalized to c.join(a).split(c) == list(a). ''.split('a') == [''], not [], by the definition of s.split(c): a list of pieces of s that were previously joined by c. In particular, string_not_containing_sep.split(sep) == [string_not_containing_sep]. Note that empty pieces are inserted for repeated seps so that splitting on seps (unlike splitting on 'whitespace') *is* 1 to 1. 'abc'.split('b') == ['a','c'] 'abbc'.split('b') == ['a','','c'] (whereas 'a c'.split() and 'a c'.split() are both ['a','c']) Therefore, sep splitting does have an inverse: c.join(s.split(c)) == s The doc for str.split specifies the above and makes clear that splitting with and without a separator are slightly different functions. >>>> assert ' '.join(foo).split() == foo You have pulled a fast one here. ' ' does not equal 'whitespace' ;-) If x in your original expression is nothing (to indicate 'whitespace'), then your desired equality becomes .join(a).split() == a which is not legal ;-). Some of the above is a rewording and expansion upon what Joao already said, which was all correct. -- Terry Jan Reedy From arnodel at gmail.com Sat Feb 26 22:31:06 2011 From: arnodel at gmail.com (Arnaud Delobelle) Date: Sat, 26 Feb 2011 21:31:06 +0000 Subject: [Python-ideas] str.split() oddness In-Reply-To: References: Message-ID: On 26 February 2011 14:03, Mart S?mermaa wrote: > IMHO, x.join(a).split(x) should be "idempotent" > in regard to a. Idempotent is the wrong word here. A function f is idempotent if f(f(x)) == f(x) for all x. What you are stating is that given: f_s(x) = s.join(x) g_s(x) = x.split(s) Then for all s and x, g_s(f_s(x)) == x. If this condition is satisfied then f_s and g_s are said to be each other's inverse. First you have to define clearly the domain of both functions for this to make sense. 
It seems that you consider the following domains: Domain of g_s = all strings Domain of f_s = all lists of strings which do not contain s Note that the domain of f_s is already quite complicated. As you point out, it can't work. As f_s([]) == f_s(['']) == '', g_s('') can't be both [] and ['']. But if you change the domain of f_s to: Domain of f_s = all non-empty lists of strings which do not contain s Then f_s and g_s are indeed the inverse of each other. Note also that in ruby, [''].join(s).split(s) == [''] evaluates to false. So the problem is also present with ruby. Ruby decided that ''.split(s) is [], whereas Python decided that ''.split(s) is ['']. The only solution would be to raise an exception when joining an empty list, which I guess is not very desirable. -- Arnaud From greg.ewing at canterbury.ac.nz Sun Feb 27 21:41:20 2011 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Mon, 28 Feb 2011 09:41:20 +1300 Subject: [Python-ideas] A couple of with statement ideas In-Reply-To: References: <4D66D4A9.3010509@pearwood.info> Message-ID: <4D6AB6F0.3080506@canterbury.ac.nz> Nick Coghlan wrote: > The iterator/iterable precedent suggests manager->manageable as a > possibility, but "manageable objects" isn't easy to write *or* to say. > "Managed objects" could work, though (despite being slightly less > technically correct). Urk. Maybe eliminating the word "context" is the wrong thing to do, because "managed object" sounds far too vague -- it's far from clear *how* it's being managed. Also highly likely to be confused somehow with "managed code" in the .NET world (which is a confusingly vague term in itself). My current thought is "context manager provider", long-winded though it is. 
-- Greg From python at mrabarnett.plus.com Sun Feb 27 22:17:07 2011 From: python at mrabarnett.plus.com (MRAB) Date: Sun, 27 Feb 2011 21:17:07 +0000 Subject: [Python-ideas] A couple of with statement ideas In-Reply-To: <7B327FCF-FC14-4F55-BD23-AC80BC18EB54@gmail.com> References: <4D66D4A9.3010509@pearwood.info> <7B327FCF-FC14-4F55-BD23-AC80BC18EB54@gmail.com> Message-ID: <4D6ABF53.5040804@mrabarnett.plus.com> On 25/02/2011 21:45, Raymond Hettinger wrote: > >> >> I like __with__ as the special method name, as it very obviously >> suggests a tight connection with the with-statement. >> > > +1 > I really like the tight association with the with-statement. > Although it's weird English, as Greg said, I'd prefer __with__ and "withable object". From mrts.pydev at gmail.com Sun Feb 27 23:18:28 2011 From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=) Date: Mon, 28 Feb 2011 00:18:28 +0200 Subject: [Python-ideas] str.split() oddness In-Reply-To: References: Message-ID: On Sat, Feb 26, 2011 at 11:31 PM, Arnaud Delobelle wrote: > On 26 February 2011 14:03, Mart S?mermaa wrote: >> IMHO, x.join(a).split(x) should be "idempotent" >> in regard to a. > > Idempotent is the wrong word here. I should have said "identity function" instead. Sorry for the confusion. (Identity function is idempotent though [1].) ~ Terry, thanks for pointing out that as string_not_containing_sep.split(sep) == [string_not_containing_sep], therefore ''.split('b') == ['']. That's the gist of it. I would like to question that reasoning though. '' (the empty string) is "nothing", the zero element [2] of strings. The problem is that it is treated as "something". I would say that precisely because it is the zero element, ''.split('b') should read "applying the split operator with any argument to the zero element of strings results in the zero element of lists" and therefore ''.split('b') == ''.split() == [] (like in Ruby). 
And sorry for using "zero element" loosely, I hope it's understandable what I mean from context. ~ Knowing that reasoning and the inconvenient special casing that it causes in actual code, would you still design split() as ''.split('b') == [''] today? [1] http://en.wikipedia.org/wiki/Idempotence [2] http://en.wikipedia.org/wiki/Zero_element From taleinat at gmail.com Sun Feb 27 23:36:59 2011 From: taleinat at gmail.com (Tal Einat) Date: Mon, 28 Feb 2011 00:36:59 +0200 Subject: [Python-ideas] A couple of with statement ideas In-Reply-To: References: <4D66D4A9.3010509@pearwood.info> Message-ID: Greg Ewing wrote: > From: Nick Coghlan > > It's at least a much larger set than it was back when AMK noticed the > > deep terminology confusion in the first version of the with statement > > and context management documentation (which was when Guido applied the > > Zen and dropped the __context__ method from the protocol). > > I'm in favour of the idea, but the terminology problem still > needs to be solved. I think it's important that the name of the > object implementing this protocol not have the word "context" in > it *anywhere*. > > I like __with__ as the special method name, as it very obviously > suggests a tight connection with the with-statement. > > The only term I can think of right now for the object is > "withable object". It's a severe abuse of the English language, > I know, but unfortunately there doesn't seem to be a concise > verb meaning "enter a temporary execution context". > "Inquisitionize"? It's even Pythonic! ;) Unless I misunderstood, this (__with__ or whatever it ends up being called) would be an alternate method of implementing a context manager, so why not just call these "context managers" just like objects with __enter__ and __exit__ are? - Tal Einat -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Mon Feb 28 00:50:09 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 28 Feb 2011 09:50:09 +1000 Subject: [Python-ideas] str.split() oddness In-Reply-To: References: Message-ID: On Mon, Feb 28, 2011 at 8:18 AM, Mart S?mermaa wrote: > Knowing that reasoning and the inconvenient special casing > that it causes in actual code, would you still design split() as > ''.split('b') == [''] today? No, but that isn't really the question we need to ask. The more important question is, given that it *does* behave this way now, is changing it worth the inevitable hassle? How would we get there from here without gratuitously breaking working programs? So, even though I agree that Ruby's semantics are probably better in this case, I don't see it as sufficiently important to justify the breakage involved in fixing it. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Mon Feb 28 00:58:51 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 28 Feb 2011 09:58:51 +1000 Subject: [Python-ideas] A couple of with statement ideas In-Reply-To: References: <4D66D4A9.3010509@pearwood.info> Message-ID: On Mon, Feb 28, 2011 at 8:36 AM, Tal Einat wrote: > Unless I misunderstood, this (__with__ or whatever it ends up being called) > would be an alternate method of implementing a context manager, so why not > just call these "context managers" just like objects with __enter__ and > __exit__ are? Because that's precisely the terminology confusion that got __context__ dropped from PEP 343 in the first place. To make the standard comparison, even iterators and iterables are not the same thing, even though the former are a subset of the latter. In this case, where objects would be expected to implement __with__ or __enter__/__exit__, but never both, the distinction should be kept even more clear. 
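A rough sketch of how that distinction could look in code. Everything here is an assumption made for illustration: Python has no __with__ protocol, and the names manager_for and Resource are invented. The with statement is emulated by an explicit helper call.

```python
from contextlib import contextmanager


def manager_for(obj):
    """Return the context manager to use for obj.

    Hypothetical lookup rule: if the type defines __with__ (an assumed
    protocol, not something Python implements), call it to obtain a
    context manager; otherwise treat obj itself as the context manager.
    """
    factory = getattr(type(obj), "__with__", None)
    if factory is not None:
        return factory(obj)
    return obj


class Resource:
    """An object with an implicit context manager: no __enter__/__exit__."""

    def __init__(self):
        self.log = []

    @contextmanager
    def __with__(self):
        # The generator-based manager returned here supplies __enter__/__exit__.
        self.log.append("enter")
        try:
            yield self
        finally:
            self.log.append("exit")


res = Resource()
with manager_for(res) as r:
    r.log.append("body")
print(res.log)  # ['enter', 'body', 'exit']
```

Resource never implements __enter__ or __exit__ itself; only the object returned by its __with__ method does, which is the separation being argued for.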
Basically, __with__ should be a context manager factory function,
while context managers themselves are still required to implement
__enter__/__exit__.

Personally, my preference still goes to "objects with an implicit
context manager".

That said, until someone steps forward to write the PEP and make the
case for bringing this idea back *at all*, the detailed terminology
discussion is fairly moot.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From tjreedy at udel.edu Mon Feb 28 01:11:47 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 27 Feb 2011 19:11:47 -0500
Subject: [Python-ideas] str.split() oddness
In-Reply-To:
References:
Message-ID:

On 2/27/2011 5:18 PM, Mart Sõmermaa wrote:
> On Sat, Feb 26, 2011 at 11:31 PM, Arnaud Delobelle wrote:
>> On 26 February 2011 14:03, Mart Sõmermaa wrote:
>>> IMHO, x.join(a).split(x) should be [invertible]
>>> with respect to a.
>
> Terry, thanks for pointing out that as
> string_not_containing_sep.split(sep) == [string_not_containing_sep],
> therefore
> ''.split('b') == [''].

Let me generalize this as follows:
len(s.split(c)) == s.count(c)+1
and specialize this as follows:
(n*c).split(c) == (n+1)*['']

> That's the gist of it.

That, and the fact that .join is not 1 to 1 and therefore inherently
not completely invertible, despite your wishes that it be so.

> I would like to question that reasoning though.

Even though it is coherent and sound? Why?

> '' (the
> empty string) is "nothing", the zero element [2] of strings.

So what. That is no reason in itself to break the general pattern.

> The problem is that it is treated as "something".

In what sense? Of course, it is a string object.

> I would say that precisely because it is the zero element,
> ''.split('b')
> should read
> "applying the split operator with any argument to the zero
> element of strings results in the zero element of lists"

Sorry, I do not see that at all. This ad hoc special case rule
1.
makes no particular sense to me, except to produce the result you want;
2. breaks the invariant above, and all special cases thereof;
3. requires the addition of a special case in the algorithm;
4. causes << 'x'.join(['']).split('x') == [''] >> to become False, when
you say it should be True, as it is now.

> and therefore
> ''.split('b') == ''.split() == []
> (like in Ruby).

> Knowing that reasoning

I do not see any reasoning other than 'do what Ruby does'. Why did Ruby
change? Really thought out? or accident?

> and the inconvenient special casing that it causes in actual code,

I do not remember even one example, let alone a broad survey of use cases.

> would you still design split() as
> ''.split('b') == [''] today?

I did not design it, but as you can guess from the above... yes.

What I might change today is to make split lazy by returning an iterator
rather than a list. Otherwise, the definition of s.split(c) as s split
at each occurrence of c is quite coherent and without need of an
arbitrary special case.

I see this as somewhat similar to 0**0==1 resulting from a uniform
coherent rule: for n a count, x**n is 1 multiplied by x n times.
Whereas some claim that it should be special cased as 0 or disallowed.

--
Terry Jan Reedy

From guido at python.org Mon Feb 28 01:13:45 2011
From: guido at python.org (Guido van Rossum)
Date: Sun, 27 Feb 2011 16:13:45 -0800
Subject: [Python-ideas] str.split() oddness
In-Reply-To:
References:
Message-ID:

Does Ruby in general leave out empty strings from the result? What
does it return when "x,,y" is split on ","? ["x", "", "y"] or ["x",
"y"]?

In Python the generalization is that since "xx".split(",") is ["xx"],
and "x".split(",") is ["x"], it naturally follows that "".split(",")
is [""].

On Sun, Feb 27, 2011 at 2:18 PM, Mart Sõmermaa wrote:
> On Sat, Feb 26, 2011 at 11:31 PM, Arnaud Delobelle wrote:
>> On 26 February 2011 14:03, Mart Sõmermaa wrote:
>>> IMHO, x.join(a).split(x) should be "idempotent"
>>> in regard to a.
>> >> Idempotent is the wrong word here. > > I should have said "identity function" instead. > Sorry for the confusion. > (Identity function is idempotent though [1].) > > ~ > > Terry, thanks for pointing out that as > > ?string_not_containing_sep.split(sep) == [string_not_containing_sep], > > therefore > > ?''.split('b') == ['']. > > That's the gist of it. > > I would like to question that reasoning though. '' (the > empty string) is "nothing", the zero element [2] of strings. > The problem is that it is treated as "something". I would > say that precisely because it is the zero element, > > ?''.split('b') > > should read > > ?"applying the split operator with any argument to the zero > ? element of strings results in the zero element of lists" > > and therefore > > ?''.split('b') == ''.split() == [] > > (like in Ruby). And sorry for using "zero element" loosely, > I hope it's understandable what I mean from context. > > ~ > > Knowing that reasoning and the inconvenient special casing > that it causes in actual code, would you still design split() as > ''.split('b') == [''] today? > > [1] http://en.wikipedia.org/wiki/Idempotence > [2] http://en.wikipedia.org/wiki/Zero_element > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (python.org/~guido) From andy at insectnation.org Mon Feb 28 01:14:12 2011 From: andy at insectnation.org (Andy Buckley) Date: Mon, 28 Feb 2011 01:14:12 +0100 Subject: [Python-ideas] str.split with multiple individual split characters Message-ID: <4D6AE8D4.5080709@insectnation.org> Here's another str.split() suggestion, this time an extension (Pythonic, I think) rather than a change of semantics. There are cases where, especially in handling user input, I'd like to be able to treat any of a series of possible delimiters as acceptable. 
Let's say that I want commas, underscores, and hyphens to all be treated as delimiters (as I did in some code I was writing today). I guessed, based on some other Python std lib behaviours, that this might work: usertokens = userstr.split([",", "_", "-"]) It doesn't work though, since the sep argument *has* to be a string. I think it would be nice for an extension like this to be supported, although I would guess a 90% probability of there being an insightful reason for why it's not such a great idea after all* ;-) Unlike many extensions, I don't think that the general solution to this is *very* quick and idiomatic in current Python. As for a compelling use-case... well, I'm very sympathetic to not adding functions for which there is no demand (I forget the relevant acronym) but this is a case where I suddenly found that I did have that problem to solve and that Python didn't have the nice built-in answer that I semi-expected it to. Extension of single arguments to iterables of them is quite a common Python design feature: one of those things where you think "ooh, this really is a nice, consistent, powerful language" when you find it. So I hope that this suggestion finds some favour. Best wishes, Andy [*] Such as "how do you distinguish between a string, which is iterable over its characters, and a list/tuple/blah of individual strings?" Well, that doesn't strike me as too big a technical issue, but maybe it is. From guido at python.org Mon Feb 28 01:21:16 2011 From: guido at python.org (Guido van Rossum) Date: Sun, 27 Feb 2011 16:21:16 -0800 Subject: [Python-ideas] str.split with multiple individual split characters In-Reply-To: <4D6AE8D4.5080709@insectnation.org> References: <4D6AE8D4.5080709@insectnation.org> Message-ID: It's so easy to do this using re.split() that it's not worth the added complexity in str.split(). 
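A minimal sketch of the re.split() approach for the comma/underscore/hyphen case from the original message. The helper name split_any is hypothetical, not something in the standard library; it only wraps re.split() and re.escape().

```python
import re


def split_any(text, delimiters):
    """Split text on any one of the given delimiter strings.

    split_any is a hypothetical helper, not a standard library function.
    """
    # Escape each delimiter so characters such as "-" or "." are literal.
    pattern = "|".join(re.escape(d) for d in delimiters)
    return re.split(pattern, text)


usertokens = split_any("a_b,c-d", [",", "_", "-"])
print(usertokens)  # ['a', 'b', 'c', 'd']
```

Like str.split() with an explicit separator, this keeps empty pieces between adjacent delimiters, so "a,,b" gives ['a', '', 'b'].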
On Sun, Feb 27, 2011 at 4:14 PM, Andy Buckley wrote: > Here's another str.split() suggestion, this time an extension (Pythonic, > I think) rather than a change of semantics. > > There are cases where, especially in handling user input, I'd like to be > able to treat any of a series of possible delimiters as acceptable. > Let's say that I want commas, underscores, and hyphens to all be treated > as delimiters (as I did in some code I was writing today). I guessed, > based on some other Python std lib behaviours, that this might work: > > usertokens = userstr.split([",", "_", "-"]) > > It doesn't work though, since the sep argument *has* to be a string. I > think it would be nice for an extension like this to be supported, > although I would guess a 90% probability of there being an insightful > reason for why it's not such a great idea after all* ;-) > > Unlike many extensions, I don't think that the general solution to this > is *very* quick and idiomatic in current Python. As for a compelling > use-case... well, I'm very sympathetic to not adding functions for which > there is no demand (I forget the relevant acronym) but this is a case > where I suddenly found that I did have that problem to solve and that > Python didn't have the nice built-in answer that I semi-expected it to. > Extension of single arguments to iterables of them is quite a common > Python design feature: one of those things where you think "ooh, this > really is a nice, consistent, powerful language" when you find it. So I > hope that this suggestion finds some favour. > > Best wishes, > Andy > > [*] Such as "how do you distinguish between a string, which is iterable > over its characters, and a list/tuple/blah of individual strings?" Well, > that doesn't strike me as too big a technical issue, but maybe it is. 
> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (python.org/~guido) From python at mrabarnett.plus.com Mon Feb 28 01:36:20 2011 From: python at mrabarnett.plus.com (MRAB) Date: Mon, 28 Feb 2011 00:36:20 +0000 Subject: [Python-ideas] str.split with multiple individual split characters In-Reply-To: <4D6AE8D4.5080709@insectnation.org> References: <4D6AE8D4.5080709@insectnation.org> Message-ID: <4D6AEE04.4030103@mrabarnett.plus.com> On 28/02/2011 00:14, Andy Buckley wrote: > Here's another str.split() suggestion, this time an extension (Pythonic, > I think) rather than a change of semantics. > > There are cases where, especially in handling user input, I'd like to be > able to treat any of a series of possible delimiters as acceptable. > Let's say that I want commas, underscores, and hyphens to all be treated > as delimiters (as I did in some code I was writing today). I guessed, > based on some other Python std lib behaviours, that this might work: > > usertokens = userstr.split([",", "_", "-"]) > > It doesn't work though, since the sep argument *has* to be a string. I > think it would be nice for an extension like this to be supported, > although I would guess a 90% probability of there being an insightful > reason for why it's not such a great idea after all* ;-) > > Unlike many extensions, I don't think that the general solution to this > is *very* quick and idiomatic in current Python. As for a compelling > use-case... well, I'm very sympathetic to not adding functions for which > there is no demand (I forget the relevant acronym) but this is a case > where I suddenly found that I did have that problem to solve and that > Python didn't have the nice built-in answer that I semi-expected it to. 
> Extension of single arguments to iterables of them is quite a common > Python design feature: one of those things where you think "ooh, this > really is a nice, consistent, powerful language" when you find it. So I > hope that this suggestion finds some favour. > > Best wishes, > Andy > > [*] Such as "how do you distinguish between a string, which is iterable > over its characters, and a list/tuple/blah of individual strings?" Well, > that doesn't strike me as too big a technical issue, but maybe it is. There are a number of additions which could be useful, such as splitting on multiple separators (compare with str.startswith and str.endswith) and stripping leading and/or trailing /strings/ (perhaps str.stripstr, str.lstripstr and str.rstripstr), but it does come down to use cases. As has been pointed out previously, it's easy to keep adding stuff, but once something is added we'll be stuck with it forever (virtually), so we need to be careful. The relevant acronym, by the way, is "YAGNI" ("You Aren't Going to Need It" or "You Ain't Gonna Need It"). From tjreedy at udel.edu Mon Feb 28 02:07:07 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 27 Feb 2011 20:07:07 -0500 Subject: [Python-ideas] str.split with multiple individual split characters In-Reply-To: <4D6AE8D4.5080709@insectnation.org> References: <4D6AE8D4.5080709@insectnation.org> Message-ID: On 2/27/2011 7:14 PM, Andy Buckley wrote: > usertokens = userstr.split([",", "_", "-"]) re beginner here; I let IDLE tell me the arg order: >>> import re; re.split('[,_-]','a_b,c-d') ['a', 'b', 'c', 'd'] Python-list is good for such questions. 
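For Andy's original use case (any of several single-character delimiters), the re.split() approach Terry shows can be wrapped up as follows. This is only a minimal sketch: the helper name split_on_chars is hypothetical, not anything in the standard library, and re.escape() is used so that characters such as ], ^, - and \ are safe inside the set.

```python
import re

def split_on_chars(text, chars):
    """Split text on any single character in chars (hypothetical helper)."""
    if not chars:
        # An empty set "[]" is not a valid pattern, so nothing to split on.
        return [text]
    # Build a character set such as [,_\-]; re.escape guards ], ^, - and \.
    char_class = "[" + "".join(re.escape(c) for c in chars) + "]"
    return re.split(char_class, text)

print(split_on_chars("a_b,c-d", [",", "_", "-"]))  # ['a', 'b', 'c', 'd']
```

Like str.split() with an explicit separator, this yields empty strings between adjacent delimiters.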
--
Terry Jan Reedy

From raymond.hettinger at gmail.com Mon Feb 28 02:12:10 2011
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sun, 27 Feb 2011 17:12:10 -0800
Subject: [Python-ideas] str.split with multiple individual split characters
In-Reply-To: <4D6AEE04.4030103@mrabarnett.plus.com>
References: <4D6AE8D4.5080709@insectnation.org> <4D6AEE04.4030103@mrabarnett.plus.com>
Message-ID: <1D473466-7593-450E-A6AD-039246BE014A@gmail.com>

On Feb 27, 2011, at 4:36 PM, MRAB wrote:
>
> As has been pointed out previously, it's easy to keep adding stuff, but
> once something is added we'll be stuck with it forever (virtually), so
> we need to be careful.

The real problem is that str.split() is already at its usability limits. The two separate algorithms are a perpetual source of confusion. It took years to get the documentation to be as accurate and helpful as it is now. Extending str.split() in any way would make the problem worse, so it shouldn't be touched again. It would be helpful to consider its API to be frozen. Any bright ideas for additional capabilities should be aimed at new methods, modules, or recipes but not at str.split() itself.

Useful as it is, we're fortunate that str.splitlines() was implemented as a separate method rather than as an extension to str.split(). IMO, this should be the model for the future.

Raymond

From cmjohnson.mailinglist at gmail.com Mon Feb 28 03:06:34 2011
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Sun, 27 Feb 2011 16:06:34 -1000
Subject: [Python-ideas] str.split with multiple individual split characters
In-Reply-To: <1D473466-7593-450E-A6AD-039246BE014A@gmail.com>
References: <4D6AE8D4.5080709@insectnation.org> <4D6AEE04.4030103@mrabarnett.plus.com> <1D473466-7593-450E-A6AD-039246BE014A@gmail.com>
Message-ID:

FWIW, I'd like it if something like this functionality existed in the basic string methods.
I'm aware of re.split, but in spite of learning regular expressions two or three times already, I use them so infrequently, I had already forgotten how to make it work and which characters are special characters (I find this the hardest thing to remember with regular expressions). So, I would appreciate it if something like s.multisplit(["-", "_", ","]) existed. Still, there is a simple enough non-regular expressions way of doing such a split: s = s.replace(",", "-").replace("_", "-") items = s.split("-") So, I don't think this is an urgent need. It's more of an "it would be nice if" but I don't know how to square that against the maintenance costs. From tjreedy at udel.edu Mon Feb 28 05:40:58 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 27 Feb 2011 23:40:58 -0500 Subject: [Python-ideas] str.split with multiple individual split characters In-Reply-To: References: <4D6AE8D4.5080709@insectnation.org> <4D6AEE04.4030103@mrabarnett.plus.com> <1D473466-7593-450E-A6AD-039246BE014A@gmail.com> Message-ID: On 2/27/2011 9:06 PM, Carl M. Johnson wrote: > FWIW, I'd like it if something like this functionality existed in the > basic string methods. I'm aware of re.split, but in spite of learning > regular expressions two or three times already, I use them so > infrequently, I had already forgotten how to make it work I found it so easy to get your particular use case -- multiple individual chars -- right on my first attempt that I have trouble being sympathetic. In the IDLE shell, I just typed re.split( and the tool tip just popped up with (pattern, string, ...). The only thing I had to remember is that brackets [] defines such sets. > and which characters are special characters It turns out that within a set pattern, special chars are generally not special. However, extra backslashes do not hurt even when not needed. Perhaps the str.split entry should have a cross-reference to re.split. 
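Terry's point about set patterns can be checked directly. The following is a small sketch with made-up sample strings: most metacharacters are literal inside [...], and redundant backslashes in front of punctuation change nothing. The usual exceptions are ], \, a leading ^ and a - between two characters, which need escaping or careful placement.

```python
import re

sample = "a.b*c+d?e|f"

# Inside a set, characters such as . * + ? | stand for themselves.
plain = re.split(r"[.*+?|]", sample)

# Redundant backslashes before punctuation are harmless: same result.
escaped = re.split(r"[\.\*\+\?\|]", sample)

# The exceptions: ] and \ are escaped here, ^ is kept away from the
# first position, and - is placed last so it cannot form a range.
tricky = re.split(r"[\]\\^-]", r"a]b\c^d-e")

print(plain)
print(escaped)
print(tricky)
```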
-- Terry Jan Reedy From cmjohnson.mailinglist at gmail.com Mon Feb 28 06:03:02 2011 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Sun, 27 Feb 2011 19:03:02 -1000 Subject: [Python-ideas] str.split with multiple individual split characters In-Reply-To: References: <4D6AE8D4.5080709@insectnation.org> <4D6AEE04.4030103@mrabarnett.plus.com> <1D473466-7593-450E-A6AD-039246BE014A@gmail.com> Message-ID: On Sun, Feb 27, 2011 at 6:40 PM, Terry Reedy wrote: > I found it so easy to get your particular use case -- multiple individual > chars -- right on my first attempt that I have trouble being sympathetic. In > the IDLE shell, I just typed re.split( and the tool tip just popped up with > (pattern, string, ...). The only thing I had to remember is that brackets [] > defines such sets. Yes, but brackets defining such sets is the exact thing that I had forgotten! :-P > It turns out that within a set pattern, special chars are generally not > special. However, extra backslashes do not hurt even when not needed. Things like this are what make me think it is impossible for regular expressions, as useful as they are, to be really Pythonic. There are too many "convenient" special cases. Anyway, you'll get no argument from me: Regexes are easy once you know regexes. For whatever reason though, I've never been able to successfully, permanently learn regexes. I'm just trying to make the case that it's tough for some users to have to learn a whole separate language in order to do a certain kind of string split more simply. Then again that's not to say that there needs to be such functionality. After all, love them or hate them, there are a lot of tasks for which regexes are just the simplest way to get the job done. It's just that users like me (if there are any) who find regexes hard to get to stick would appreciate being able to avoid learning them for a little longer. From stephen at xemacs.org Mon Feb 28 07:19:13 2011 From: stephen at xemacs.org (Stephen J. 
Turnbull)
Date: Mon, 28 Feb 2011 15:19:13 +0900
Subject: [Python-ideas] str.split with multiple individual split characters
In-Reply-To:
References: <4D6AE8D4.5080709@insectnation.org> <4D6AEE04.4030103@mrabarnett.plus.com> <1D473466-7593-450E-A6AD-039246BE014A@gmail.com>
Message-ID: <87hbbod6la.fsf@uwakimon.sk.tsukuba.ac.jp>

Carl M. Johnson writes:

 > Anyway, you'll get no argument from me: Regexes are easy once you know
 > regexes. For whatever reason though, I've never been able to
 > successfully, permanently learn regexes.

How about learning them long enough to write

>>> import re
>>> def multisplit (source, char1, char2):
...     return re.split("".join(["[",char1,char2,"]"]),source)
...
>>> multisplit ("a-b_c","_","-")
['a', 'b', 'c']

or a generalization as needed?

I'm not unsympathetic to the need, but there are just too many Zen or near-Zen principles violated by this proposal. I'm getting old and cranky enough myself that I have to explicitly remind myself to do this kind of thing, but arguing against the Zen doesn't work very well, even here on python-ideas. Life is easier for me when I remember to help myself!

From ncoghlan at gmail.com Mon Feb 28 07:20:41 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 28 Feb 2011 16:20:41 +1000
Subject: [Python-ideas] str.split with multiple individual split characters
In-Reply-To:
References: <4D6AE8D4.5080709@insectnation.org> <4D6AEE04.4030103@mrabarnett.plus.com> <1D473466-7593-450E-A6AD-039246BE014A@gmail.com>
Message-ID:

On Mon, Feb 28, 2011 at 3:03 PM, Carl M. Johnson wrote:
> Anyway, you'll get no argument from me: Regexes are easy once you know
> regexes. For whatever reason though, I've never been able to
> successfully, permanently learn regexes.

Neither have I, I just remember where to find the (quite readable) reference to their syntax in the Python documentation (http://docs.python.org/library/re).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia

From bruce at leapyear.org Mon Feb 28 07:51:51 2011
From: bruce at leapyear.org (Bruce Leban)
Date: Sun, 27 Feb 2011 22:51:51 -0800
Subject: [Python-ideas] str.split with multiple individual split characters
In-Reply-To: <87hbbod6la.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <4D6AE8D4.5080709@insectnation.org> <4D6AEE04.4030103@mrabarnett.plus.com> <1D473466-7593-450E-A6AD-039246BE014A@gmail.com> <87hbbod6la.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID:

On Sun, Feb 27, 2011 at 10:19 PM, Stephen J. Turnbull wrote:

> def multisplit (source, char1, char2):
> ...     return re.split("".join(["[",char1,char2,"]"]),source)
>

Actually you need re.escape there in case one of the characters is \ or ]. And if remembering [...] is hard, using | makes this a bit more general (accepting multi-character separators):

    def multisplit(source, *separators):
        return re.split('|'.join([re.escape(t) for t in separators]), source)

    multisplit(s, '\r\n', '\r', '\n')

Bonus points if you see the problem with the above. Correct code below spoiler space

. . . . . . . . . . .

The problem is that an |-separated regex matches in order, so if a longer separator appears after a shorter one, the shorter one will take precedence.

    def multisplit(source, *separators):
        return re.split('|'.join([re.escape(t) for t in sorted(separators, key=len, reverse=True)]), source)

From greg.ewing at canterbury.ac.nz Mon Feb 28 09:15:30 2011
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 28 Feb 2011 21:15:30 +1300
Subject: [Python-ideas] str.split with multiple individual split characters
In-Reply-To:
References: <4D6AE8D4.5080709@insectnation.org>
Message-ID: <4D6B59A2.3090401@canterbury.ac.nz>

Guido van Rossum wrote:

> It's so easy to do this using re.split() that it's not worth the added
> complexity in str.split().

Also I'm not sure it would be all that useful in practice in the simple form proposed.
Whenever I've wanted something like that I've also wanted to know *which* separator occurred at each split point. This is also fairly easy to do with re.split(). -- Greg From steve at pearwood.info Mon Feb 28 11:23:38 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 28 Feb 2011 21:23:38 +1100 Subject: [Python-ideas] str.split with multiple individual split characters In-Reply-To: References: <4D6AE8D4.5080709@insectnation.org> Message-ID: <4D6B77AA.4090604@pearwood.info> Guido van Rossum wrote: > It's so easy to do this using re.split() that it's not worth the added > complexity in str.split(). Easy, but slow. If performance is important, it looks to me like re.split is the wrong solution. Using Python 3.1: >>> from re import split >>> def split_str(s, *args): # quick, dirty and inefficient multi-split ... for a in args[1:]: ... s = s.replace(a, args[0]) ... return s.split(args[0]) ... >>> text = "abc.d-ef_g:h;ijklmn+opqrstu|vw-x_y.z"*1000 >>> assert split(r'[.\-_:;+|]', text) == split_str(text, *'.-_:;+|') >>> >>> from timeit import Timer >>> t1 = Timer("split(r'[.\-_:;+|]', text)", ... "from re import split; from __main__ import text") >>> t2 = Timer("split_str(text, *'.-_:;+|')", ... "from __main__ import split_str, text") >>> >>> min(t1.repeat(number=10000, repeat=5)) 72.31230521202087 >>> min(t2.repeat(number=10000, repeat=5)) 17.375113010406494 -- Steven From stefan_ml at behnel.de Mon Feb 28 11:57:36 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 28 Feb 2011 11:57:36 +0100 Subject: [Python-ideas] str.split with multiple individual split characters In-Reply-To: <4D6B77AA.4090604@pearwood.info> References: <4D6AE8D4.5080709@insectnation.org> <4D6B77AA.4090604@pearwood.info> Message-ID: Steven D'Aprano, 28.02.2011 11:23: > Guido van Rossum wrote: >> It's so easy to do this using re.split() that it's not worth the added >> complexity in str.split(). > > Easy, but slow. 
If performance is important, it looks to me like re.split > is the wrong solution. Using Python 3.1: > > > >>> from re import split > >>> def split_str(s, *args): # quick, dirty and inefficient multi-split > ... for a in args[1:]: > ... s = s.replace(a, args[0]) > ... return s.split(args[0]) > ... > >>> text = "abc.d-ef_g:h;ijklmn+opqrstu|vw-x_y.z"*1000 > >>> assert split(r'[.\-_:;+|]', text) == split_str(text, *'.-_:;+|') > >>> > >>> from timeit import Timer > >>> t1 = Timer("split(r'[.\-_:;+|]', text)", > ... "from re import split; from __main__ import text") > >>> t2 = Timer("split_str(text, *'.-_:;+|')", > ... "from __main__ import split_str, text") > >>> > >>> min(t1.repeat(number=10000, repeat=5)) > 72.31230521202087 > >>> min(t2.repeat(number=10000, repeat=5)) > 17.375113010406494 You forgot to do the precompilation. Here's what I get: >>> t1 = Timer("split(text)", "import re; from __main__ import text; \ ... split=re.compile(r'[.\-_:;+|]').split") >>> min(t1.repeat(number=1000, repeat=3)) 3.9842870235443115 >>> min(t2.repeat(number=1000, repeat=3)) 0.9261999130249023 Still a factor of 4, using Py3.2. Anyone wants to try it with the alternative regex packages? Stefan From steve at pearwood.info Mon Feb 28 12:15:21 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 28 Feb 2011 22:15:21 +1100 Subject: [Python-ideas] str.split with multiple individual split characters In-Reply-To: References: <4D6AE8D4.5080709@insectnation.org> <4D6AEE04.4030103@mrabarnett.plus.com> <1D473466-7593-450E-A6AD-039246BE014A@gmail.com> Message-ID: <4D6B83C9.2080503@pearwood.info> Carl M. Johnson wrote: > Anyway, you'll get no argument from me: Regexes are easy once you know > regexes. For whatever reason though, I've never been able to > successfully, permanently learn regexes. I'm just trying to make the > case that it's tough for some users to have to learn a whole separate > language in order to do a certain kind of string split more simply. 
I would say, *easy* regexes are easy once you know regexes. But in general, not so much... even Larry Wall is rethinking a lot of regex culture and syntax: http://dev.perl.org/perl6/doc/design/apo/A05.html But this case is relatively easy, although there is at least one obvious trap for the unwary: forgetting to escape the split chars. > Then again that's not to say that there needs to be such > functionality. After all, love them or hate them, there are a lot of > tasks for which regexes are just the simplest way to get the job done. > It's just that users like me (if there are any) who find regexes hard > to get to stick would appreciate being able to avoid learning them for > a little longer. I can sympathise with that. Regexes are essentially another programming language (albeit not Turing Complete), and everything we love about Python, regexes are the opposite. They're as far from executable pseudo-code as it's possible to get without becoming one of those esoteric languages that have three commands and one data type... *wink* Anyway, for what it's worth, when I think about the times I've needed something like a multi-split, it has been for mini-parsers. I think a cross between split and partition would be more useful: multisplit(source, seps, maxsplit=None) => [(substring, sep), ...] 
Here's a pure-Python implementation, limited to single character separators:

def multisplit(source, seps, maxsplit=None):
    def find_first():
        for i, c in enumerate(source):
            if c in seps:
                return i
        return -1
    count = 0
    while True:
        if maxsplit is not None and count >= maxsplit:
            yield (source, '')
            break
        p = find_first()
        if p >= 0:
            yield (source[:p], source[p])
            count += 1
            source = source[p+1:]
        else:
            yield (source, '')
            break

--
Steven

From steve at pearwood.info Mon Feb 28 12:20:42 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 28 Feb 2011 22:20:42 +1100
Subject: [Python-ideas] str.split with multiple individual split characters
In-Reply-To:
References: <4D6AE8D4.5080709@insectnation.org> <4D6B77AA.4090604@pearwood.info>
Message-ID: <4D6B850A.2040808@pearwood.info>

Stefan Behnel wrote:
> Steven D'Aprano, 28.02.2011 11:23:
>> Guido van Rossum wrote:
>>> It's so easy to do this using re.split() that it's not worth the added
>>> complexity in str.split().
>>
>> Easy, but slow. If performance is important, it looks to me like re.split
>> is the wrong solution. Using Python 3.1:
[...]
> You forgot to do the precompilation. Here's what I get:

The re module caches the last 100(?) patterns used, so it only needs compiling once. The other 49,999 times it will be fetched from the cache.
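The three variants compared in this subthread (module-level re.split() relying on the pattern cache, a precompiled pattern, and the replace-then-split trick) can be lined up in one script. This is a rough sketch only: the function names are made up here, the test string is shortened from Steven's, and no timings are claimed since they depend on the Python version and machine.

```python
import re
from timeit import timeit

text = "abc.d-ef_g:h;ijklmn+opqrstu|vw-x_y.z" * 100
pattern = r"[.\-_:;+|]"
compiled_split = re.compile(pattern).split  # precompiled, as Stefan suggests

def via_module(s):
    # Module-level call: the compiled pattern comes from re's internal cache.
    return re.split(pattern, s)

def via_compiled(s):
    # Bound method of a precompiled pattern: skips the cache lookup.
    return compiled_split(s)

def via_replace(s, seps=".-_:;+|"):
    # Steven's approach: map every separator onto the first, then split once.
    first = seps[0]
    for sep in seps[1:]:
        s = s.replace(sep, first)
    return s.split(first)

# All three must agree before any timing comparison means anything.
assert via_module(text) == via_compiled(text) == via_replace(text)

for fn in (via_module, via_compiled, via_replace):
    print(fn.__name__, timeit(lambda: fn(text), number=200))
```

The replace-based version only works for single-character (or at least non-overlapping) separators, which is the case being discussed here.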
--
Steven

From mwm at mired.org Mon Feb 28 17:04:06 2011
From: mwm at mired.org (Mike Meyer)
Date: Mon, 28 Feb 2011 11:04:06 -0500
Subject: [Python-ideas] New pattern-matching library (was: str.split with multiple individual split characters)
In-Reply-To: <4D6B83C9.2080503@pearwood.info>
References: <4D6AE8D4.5080709@insectnation.org> <4D6AEE04.4030103@mrabarnett.plus.com> <1D473466-7593-450E-A6AD-039246BE014A@gmail.com> <4D6B83C9.2080503@pearwood.info>
Message-ID: <20110228110406.3ae7fec5@bhuda.mired.org>

Ok, with everyone at least noticing that regular expressions are hard, if not actively complaining about it (including apparently Larry Wall), maybe it's time to add a second pattern matching library - one that's more pythonic?

There are any number of languages with readable pattern matching - Icon, Snobol and REXX all come to my mind. Searching pypi for "snobol" reveals two snobol string matching libraries, and I found one on the web based on icon.

Possibly we should investigate adding one of those to the standard library, along with a cross-reference from the regexp documentation?

http://www.mired.org/consulting.html
Independent Software developer/SCM consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From guido at python.org Mon Feb 28 18:15:36 2011 From: guido at python.org (Guido van Rossum) Date: Mon, 28 Feb 2011 09:15:36 -0800 Subject: [Python-ideas] New pattern-matching library (was: str.split with multiple individual split characters) In-Reply-To: <20110228110406.3ae7fec5@bhuda.mired.org> References: <4D6AE8D4.5080709@insectnation.org> <4D6AEE04.4030103@mrabarnett.plus.com> <1D473466-7593-450E-A6AD-039246BE014A@gmail.com> <4D6B83C9.2080503@pearwood.info> <20110228110406.3ae7fec5@bhuda.mired.org> Message-ID: On Mon, Feb 28, 2011 at 8:04 AM, Mike Meyer wrote: > Ok, with everyone at least noticing that regular expressions are hard, > if not actively complaining about it (including apparently Larry wall), > maybe it's time to add a second pattern matching library - one that's > more pythonic? > > There are any number of languages with readable pattern matching - > Icon, Snobol and REXX all come to my mind. Searching pypi for "snobol" > reveals two snobol string matching libraries, and I found one on the > web based on icon. > > Possibly we should investigate adding one of those to the standard > library, along with a cross-reference from the regexp documentation? It's been tried before without much success. I think it may have been a decade ago that Ka-Ping Yee created a pattern matching library that used function calls (and operator overloading? I can't recall) to generate patterns -- compiling to re patterns underneath. It didn't get much use. I fear that regular expressions have this market cornered, and there isn't anything possible that is so much better that it'll drive them out. That doesn't mean you shouldn't try -- I've been wrong before. But maybe instead of striving for stdlib inclusion (which these days is pretty much only accessible for proven successful 3rd party libraries), you should try to create a new 3rd party pattern matching library. 
While admittedly this puts it at a disadvantage relative to the re module, I really don't think we should experiment in the stdlib, since the release cycle and backwards compatibility requirements make the necessary experimentation too cumbersome.

On the third hand, I could see this as an area where a pure library-based approach will always be doomed, and where a proposal to add new syntax would actually make sense. Of course that still has the same problems due to release time and policy.

--
--Guido van Rossum (python.org/~guido)

From steve at pearwood.info Mon Feb 28 18:25:37 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 01 Mar 2011 04:25:37 +1100
Subject: [Python-ideas] New pattern-matching library
In-Reply-To: <20110228110406.3ae7fec5@bhuda.mired.org>
References: <4D6AE8D4.5080709@insectnation.org> <4D6AEE04.4030103@mrabarnett.plus.com> <1D473466-7593-450E-A6AD-039246BE014A@gmail.com> <4D6B83C9.2080503@pearwood.info> <20110228110406.3ae7fec5@bhuda.mired.org>
Message-ID: <4D6BDA91.4060103@pearwood.info>

Mike Meyer wrote:

> There are any number of languages with readable pattern matching -
> Icon, Snobol and REXX all come to my mind. Searching pypi for "snobol"
> reveals two snobol string matching libraries, and I found one on the
> web based on icon.
>
> Possibly we should investigate adding one of those to the standard
> library, along with a cross-reference from the regexp documentation?

I've only checked out snopy: http://snopy.sourceforge.net/user-guide.html

As far as I can tell, that is far from ready for production, and it looks like it hasn't been updated since 2002.

I am interested in string-rewriting rules, Markov algorithms and the like, so speaking in the abstract, +1 on the concept. But concretely, I don't think the standard library is the place for such experiments. I think that somebody would need to develop a good quality pattern matcher which gets good real-world testing before it could be considered for the standard library.
--
Steven

From dirkjan at ochtman.nl Mon Feb 28 18:27:20 2011
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Mon, 28 Feb 2011 18:27:20 +0100
Subject: [Python-ideas] Coercing str.join() argument elements to str
Message-ID:

I just saw someone mention this on Twitter, and I know that I've been bitten by it before. Seems like it wouldn't break any existing code... The worst that could happen is that someone gets nonsensical strings in new code instead of an exception.

>>> ', '.join(range(5))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence item 0: expected string, int found
>>> ', '.join(str(i) for i in range(5))
'0, 1, 2, 3, 4'

Has this been discussed before? If not, what would be reasons not to do this?

Cheers,

Dirkjan

From _ at lvh.cc Mon Feb 28 18:30:58 2011
From: _ at lvh.cc (Laurens Van Houtven)
Date: Mon, 28 Feb 2011 18:30:58 +0100
Subject: [Python-ideas] Coercing str.join() argument elements to str
In-Reply-To:
References:
Message-ID:

Yep. http://mail.python.org/pipermail/python-ideas/2010-October/008358.html

cheers
lvh

From guido at python.org Mon Feb 28 18:35:09 2011
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Feb 2011 09:35:09 -0800
Subject: [Python-ideas] Coercing str.join() argument elements to str
In-Reply-To:
References:
Message-ID:

On Mon, Feb 28, 2011 at 9:27 AM, Dirkjan Ochtman wrote:
> I just saw someone mention this on Twitter, and I know that I've been
> bitten by it before. Seems like it wouldn't break any existing code...
> The worst that could happen is that someone gets nonsensical strings
> in new code instead of an exception.
>
>>>> ', '.join(range(5))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: sequence item 0: expected string, int found
>>>> ', '.join(str(i) for i in range(5))
> '0, 1, 2, 3, 4'
>
> Has this been discussed before? If not, what would be reasons not to do this?
It comes up occasionally and is always rejected on the same grounds as rejecting '1' + 2. I.e., we believe that the current approach catches more bugs.

--
--Guido van Rossum (python.org/~guido)

From mwm at mired.org Mon Feb 28 19:23:55 2011
From: mwm at mired.org (Mike Meyer)
Date: Mon, 28 Feb 2011 13:23:55 -0500
Subject: [Python-ideas] Fw: New pattern-matching library (was: str.split with multiple individual split characters)
Message-ID: <20110228132355.105c95d9@bhuda.mired.org>

I accidentally dropped the list from my reply to his comment. I'm forwarding that and his reply (with permission).

To: Mike Meyer
Subject: Re: [Python-ideas] New pattern-matching library (was: str.split with multiple individual split characters)

On Mon, Feb 28, 2011 at 9:40 AM, Mike Meyer wrote:
> On Mon, 28 Feb 2011 09:15:36 -0800
> Guido van Rossum wrote:
>> That doesn't mean you shouldn't try -- I've been wrong before. But
>> maybe instead of striving for stdlib inclusion (which these days is
>> pretty much only accessible for proven successful 3rd party
>> libraries), you should try to create a new 3rd party pattern matching
>> library. While admittedly this gives it a disadvantage to the re
>> module, I really don't think we should experiment in the stdlib, since
>> the release cycle and backwards compatibility requirements make the
>> necessary experimentation too cumbersome.
>
> How about if it's bundled with new language functionality? Take Icon's
> failure/backtrack/alternation features, add them to Python, then put
> together a string parsing facility that leverages that?

You'd have to write a thorough PEP, do a reference implementation, and go through a long process of discussion before getting it accepted. It can be done, but nothing of the scale has been done in a long time (most syntax PEPs these days are mere tweaks in comparison). Even before writing the PEP you should probably start with a serious brainstorm. Are you up for it?
I support you (or anyone) doing it, and may possibly comment in various stages on ideas or designs, but I can't pull this cart myself, nor can I guarantee success. Maybe you can find a few other like-minded folks to help with the various stages of the design? How much deep knowledge about Python's syntax and semantics do you have? PS. Did you intentionally drop python-ideas from the CC header? -- --Guido van Rossum (python.org/~guido) -- Mike Meyer http://www.mired.org/consulting.html Independent Software developer/SCM consultant, email for more information. O< ascii ribbon campaign - stop html mail - www.asciiribbon.org From python at zesty.ca Mon Feb 28 19:53:28 2011 From: python at zesty.ca (Ka-Ping Yee) Date: Mon, 28 Feb 2011 10:53:28 -0800 (PST) Subject: [Python-ideas] New pattern-matching library (was: str.split with multiple individual split characters) In-Reply-To: References: <4D6AE8D4.5080709@insectnation.org> <4D6AEE04.4030103@mrabarnett.plus.com> <1D473466-7593-450E-A6AD-039246BE014A@gmail.com> <4D6B83C9.2080503@pearwood.info> <20110228110406.3ae7fec5@bhuda.mired.org> Message-ID: On Mon, 28 Feb 2011, Guido van Rossum wrote: > On Mon, Feb 28, 2011 at 8:04 AM, Mike Meyer wrote: >> Possibly we should investigate adding one of those to the standard >> library, along with a cross-reference from the regexp documentation? > > It's been tried before without much success. I think it may have been > a decade ago that Ka-Ping Yee created a pattern matching library that > used function calls (and operator overloading? I can't recall) to > generate patterns -- compiling to re patterns underneath. It didn't > get much use. Yes, there was operator overloading. The expressions looked like this: letter + 3*digits + anyspace + either(some(digits), some(letters)) If anyone is curious, the module is available here: http://zesty.ca/python/rxb.py You're welcome to experiment with it, modify it, use it as a starting point for your own pattern matcher if you like. 
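To give a feel for the operator-overloading style Ping describes, here is a deliberately tiny sketch that compiles to re patterns underneath. It is NOT the rxb implementation or its API: the Pat class and the names letter, digit, anyspace, some and either are all hypothetical stand-ins chosen to resemble the expression quoted above.

```python
import re

class Pat:
    """Hypothetical pattern object wrapping a regex fragment (not rxb)."""

    def __init__(self, regex):
        self.regex = regex

    def __add__(self, other):
        # a + b  ->  a followed by b
        return Pat(self.regex + other.regex)

    def __rmul__(self, count):
        # 3 * digit  ->  exactly three repetitions of digit
        return Pat("(?:%s){%d}" % (self.regex, count))

    def compile(self):
        return re.compile(self.regex)

def some(p):
    # One or more repetitions.
    return Pat("(?:%s)+" % p.regex)

def either(*alternatives):
    # Any one of the given alternatives.
    return Pat("(?:%s)" % "|".join(p.regex for p in alternatives))

# Hypothetical building blocks.
letter = Pat("[A-Za-z]")
digit = Pat("[0-9]")
anyspace = Pat(r"\s*")

pattern = letter + 3 * digit + anyspace + either(some(digit), some(letter))
matcher = pattern.compile()

print(pattern.regex)
print(bool(matcher.fullmatch("a123 45")))
print(bool(matcher.fullmatch("a12 45")))
```

Every combinator wraps its operand in a non-capturing group, which keeps precedence predictable at the cost of a noisier generated regex.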
--Ping

From cool-rr at cool-rr.com Mon Feb 28 22:21:52 2011
From: cool-rr at cool-rr.com (cool-RR)
Date: Mon, 28 Feb 2011 16:21:52 -0500
Subject: [Python-ideas] class ModuleNotFoundError(ImportError)
Message-ID:

There are many programs out there, including Django, that "carefully import" a module by doing:

    try:
        import simplejson
    except ImportError:
        import whatever_instead as simplejson
        # or whatever

This is problematic because sometimes an `ImportError` is raised not because the module is missing, but because there's some error in the module, or because the module raises an `ImportError` itself. Then the exception gets totally swallowed, resulting in delightful debugging sessions.

What do you think about having an exception `ModuleNotFoundError` which is a subclass of `ImportError`? Then people could `except ModuleNotFoundError` and be sure that it was caused by a module not existing. This will be a much better way of "carefully importing" a module. Would this be backwards-compatible?

Ram.

From guido at python.org Mon Feb 28 22:28:40 2011
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Feb 2011 13:28:40 -0800
Subject: [Python-ideas] class ModuleNotFoundError(ImportError)
In-Reply-To:
References:
Message-ID:

On Mon, Feb 28, 2011 at 1:21 PM, cool-RR wrote:
> There are many programs out there, including Django, that "carefully import"
> a module by doing:
>     try:
>         import simplejson
>     except ImportError:
>         import whatever_instead as simplejson
>         # or whatever
> This is problematic because sometimes an `ImportError` is raised not because
> the module is missing, but because there's some error in the module, or
> because the module raises an `ImportError` itself. Then the exception gets
> totally swallowed, resulting in delightful debugging sessions.
> What do you think about having an exception `ModuleNotFoundError` which is a
> subclass of `ImportError`? Then people could `except ModuleNotFoundError`
> and be sure that it was caused by a module not existing. This will be a much
> better way of "carefully importing" a module. Would this be
> backwards-compatible?

The most problematic issue is actually that the imported module (above, simplejson) itself imports a non-existent module. Since that would still raise ModuleNotFoundError, your proposal doesn't really fix the problem.

I think modules raising ImportError for some other reason is rare.

What might perhaps help is if ImportError had the name of the module that could not be imported as an attribute. Then the code could be rewritten as:

    try:
        import simplejson
    except ImportError as err:
        if err.module_name != 'simplejson':
            raise

--
--Guido van Rossum (python.org/~guido)

From fuzzyman at gmail.com Mon Feb 28 22:45:55 2011
From: fuzzyman at gmail.com (Michael Foord)
Date: Mon, 28 Feb 2011 21:45:55 +0000
Subject: [Python-ideas] class ModuleNotFoundError(ImportError)
In-Reply-To:
References:
Message-ID:

On 28 February 2011 21:28, Guido van Rossum wrote:

> On Mon, Feb 28, 2011 at 1:21 PM, cool-RR wrote:
> > There are many programs out there, including Django, that "carefully import"
> > a module by doing:
> >     try:
> >         import simplejson
> >     except ImportError:
> >         import whatever_instead as simplejson
> >         # or whatever
> > This is problematic because sometimes an `ImportError` is raised not because
> > the module is missing, but because there's some error in the module, or
> > because the module raises an `ImportError` itself. Then the exception gets
> > totally swallowed, resulting in delightful debugging sessions.
> > What do you think about having an exception `ModuleNotFoundError` which is a
> > subclass of `ImportError`? Then people could `except ModuleNotFoundError`
> > and be sure that it was caused by a module not existing.
> > This will be a much better way of "carefully importing" a module. Would this be
> > backwards-compatible?
>
> The most problematic issue is actually that the imported module
> (above, simplejson) itself imports a non-existent module. Since that
> would still raise ModuleNotFoundError, your proposal doesn't really
> fix the problem.
>
> I think modules raising ImportError for some other reason is rare.
>
> What might perhaps help is if ImportError had the name of the module
> that could not be imported as an attribute. Then the code could be
> rewritten as:
>
>     try:
>         import simplejson
>     except ImportError as err:
>         if err.module_name != 'simplejson':
>             raise
>

+1

Michael

> --
> --Guido van Rossum (python.org/~guido)
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html

From ncoghlan at gmail.com Mon Feb 28 23:09:19 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 1 Mar 2011 08:09:19 +1000
Subject: [Python-ideas] Coercing str.join() argument elements to str
In-Reply-To:
References:
Message-ID:

On Tue, Mar 1, 2011 at 3:35 AM, Guido van Rossum wrote:
> It comes up occasionally and is always rejected on the same grounds as
> rejecting '1' + 2. I.e., we believe that the current approach catches
> more bugs.

With the advent of bytes.join, this is even more true than ever:

>>> data = b'1', b'2', b'3', b'4'
>>> data
(b'1', b'2', b'3', b'4')
>>> b''.join(data) # Intended operation
b'1234'
>>> ''.join(data) # Lack of coercion means a typo gives an immediate error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence item 0: expected str instance, bytes found
>>> ''.join(map(str, data)) # Implicit coercion would convert exception into bad output
"b'1'b'2'b'3'b'4'"

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Mon Feb 28 23:18:43 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 1 Mar 2011 08:18:43 +1000
Subject: [Python-ideas] New pattern-matching library (was: str.split with multiple individual split characters)
In-Reply-To:
References: <4D6AE8D4.5080709@insectnation.org> <4D6AEE04.4030103@mrabarnett.plus.com> <1D473466-7593-450E-A6AD-039246BE014A@gmail.com> <4D6B83C9.2080503@pearwood.info> <20110228110406.3ae7fec5@bhuda.mired.org>
Message-ID:

On Tue, Mar 1, 2011 at 3:15 AM, Guido van Rossum wrote:
> On the third hand, I could see this as an area where a pure
> library-based approach will always be doomed, and where a proposal to
> add new syntax would actually make sense. Of course that still has the
> same problems due to release time and policy.

I suspect one of the core issues isn't so much that regex syntax is arcane, ugly and hard to remember (although those don't help), but the fact that fully general string pattern matching is inherently hard to remember due to the wide range of options. There's a reason glob-style matching is limited to a couple of simple wildcard characters.

As code-based alternatives to regexes go, the one I see come up most often as a suggested, working alternative is pyparsing (although I've never tried it myself). For example: http://stackoverflow.com/questions/3673388/python-replacing-regex-with-bnf-or-pyparsing

Cheers,
Nick.

--
Nick Coghlan
|   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Mon Feb 28 23:35:28 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 1 Mar 2011 08:35:28 +1000
Subject: [Python-ideas] class ModuleNotFoundError(ImportError)
In-Reply-To:
References:
Message-ID:

On Tue, Mar 1, 2011 at 7:28 AM, Guido van Rossum wrote:
> What might perhaps help is if ImportError had the name of the module
> that could not be imported as an attribute. Then the code could be
> rewritten as:
>
> try:
>     import simplejson
> except ImportError as err:
>     if err.module_name != 'simplejson':
>         raise
>

Logged the suggestion: http://bugs.python.org/issue11356

Perhaps it is worth revisiting the old "import x or y or z as whatever" syntax proposal for 3.3, since it could handle this idiom internally (although deciding what, if anything, to do for "from" style imports is a hassle).

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From cool-rr at cool-rr.com Mon Feb 28 23:41:10 2011
From: cool-rr at cool-rr.com (cool-RR)
Date: Mon, 28 Feb 2011 17:41:10 -0500
Subject: [Python-ideas] class ModuleNotFoundError(ImportError)
In-Reply-To:
References:
Message-ID:

On Mon, Feb 28, 2011 at 4:28 PM, Guido van Rossum wrote:
> On Mon, Feb 28, 2011 at 1:21 PM, cool-RR wrote:
> > There are many programs out there, including Django, that "carefully import"
> > a module by doing:
> >     try:
> >         import simplejson
> >     except ImportError:
> >         import whatever_instead as simplejson
> >         # or whatever
> > This is problematic because sometimes an `ImportError` is raised not because
> > the module is missing, but because there's some error in the module, or
> > because the module raises an `ImportError` itself. Then the exception gets
> > totally swallowed, resulting in delightful debugging sessions.
> > What do you think about having an exception `ModuleNotFoundError` which is a
> > subclass of `ImportError`?
> > Then people could `except ModuleNotFoundError`
> > and be sure that it was caused by a module not existing. This will be a much
> > better way of "carefully importing" a module. Would this be
> > backwards-compatible?
>
> The most problematic issue is actually that the imported module
> (above, simplejson) itself imports a non-existent module. Since that
> would still raise ModuleNotFoundError, your proposal doesn't really
> fix the problem.
>
> I think modules raising ImportError for some other reason is rare.
>
> What might perhaps help is if ImportError had the name of the module
> that could not be imported as an attribute. Then the code could be
> rewritten as:
>
>     try:
>         import simplejson
>     except ImportError as err:
>         if err.module_name != 'simplejson':
>             raise
>
> --
> --Guido van Rossum (python.org/~guido)
>

I think modules sometimes raise `ImportError` because of problematic circular imports. The `err.module_name != 'simplejson'` suggestion might miss that, no? Would a combination of the `module_name` suggestion with the `ModuleNotFoundError` suggestion solve it?

Ram.

From steve at pearwood.info Mon Feb 28 23:49:49 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 01 Mar 2011 09:49:49 +1100
Subject: [Python-ideas] Coercing str.join() argument elements to str
In-Reply-To:
References:
Message-ID: <4D6C268D.8070001@pearwood.info>

Dirkjan Ochtman wrote:
> I just saw someone mention this on Twitter, and I know that I've been
> bitten by it before. Seems like it wouldn't break any existing code...
> The worst that could happen is that someone gets nonsensical strings
> in new code instead of an exception.

"I find it amusing when novice programmers believe their main job is preventing programs from crashing. ...
More experienced programmers realize that correct code is great, code that crashes could use improvement, but incorrect code that doesn't crash is a horrible nightmare."

-- Chris Smith
http://cdsmith.wordpress.com/2011/01/09/an-old-article-i-wrote/

--
Steven
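
A minimal sketch of the point made in the str.join() replies above, assuming Python 3. The helper names join_strict and join_coerced are hypothetical and exist only for illustration; they are not part of any proposal in the thread. The sketch contrasts the current behaviour, where a non-str element raises TypeError, with explicit opt-in coercion via map(str, ...), which silently produces the "bad output" case when bytes are passed by mistake:

```python
def join_strict(items, sep=""):
    """Join exactly as str.join does: any non-str element raises TypeError."""
    return sep.join(items)


def join_coerced(items, sep=""):
    """Join after explicitly converting every element with str() (opt-in coercion)."""
    return sep.join(map(str, items))


if __name__ == "__main__":
    data = (b"1", b"2", b"3", b"4")  # bytes passed where str was intended

    try:
        join_strict(data)
    except TypeError as exc:
        # The mistake is reported immediately instead of producing output.
        print("strict join failed:", exc)

    # Coercion hides the mistake: each element becomes the repr-like text b'1'.
    print("coerced join gave:", join_coerced(data))
```

Coercion stays available to callers who really want it (for example join_coerced([1, 2, 3], ", ")), but it has to be requested explicitly.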