From ncoghlan at gmail.com Sat Jan 1 01:43:48 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 1 Jan 2011 10:43:48 +1000
Subject: [Python-ideas] Optimizing builtins
In-Reply-To: References: Message-ID:

On Sat, Jan 1, 2011 at 7:51 AM, Guido van Rossum wrote:
> and of course for more fun you can make it more dynamic (think obfuscated code contests).

Not to mention the champions of obfuscation for CPython: doing the same things from an extension module, or by using ctypes to invoke the C API (although such mechanisms are obviously outside the language definition itself, they're still technically legal for non-portable CPython code).

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com Sat Jan 1 01:50:09 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 1 Jan 2011 10:50:09 +1000
Subject: [Python-ideas] Optimizing builtins
In-Reply-To: References: Message-ID:

On Sat, Jan 1, 2011 at 4:49 AM, Guido van Rossum wrote:
> (FWIW, optimizing "x[i] = i" would be much simpler -- I don't really care about the argument that a debugger might interfere. But again, apart from the simplest cases, it requires a sophisticated parser to determine that it is really safe to do so.)

Back on topic, we've certainly made much bigger bytecode changes that would appear differently in a debugger. Collapsing most of the with statement entry overhead into the single SETUP_WITH opcode is the biggest recent(-ish) example that comes to mind.

A more general peephole optimisation that picks up a repeated load operation in a sequence of load commands and replaces it with a single load and some stack rotations may be feasible, but I'm not entirely sure that would actually be an optimisation (especially for LOAD_FAST) - reordering the stack may be slower than the load operation.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia

From guido at python.org Sat Jan 1 02:17:20 2011
From: guido at python.org (Guido van Rossum)
Date: Fri, 31 Dec 2010 17:17:20 -0800
Subject: [Python-ideas] Optimizing builtins
In-Reply-To: References: Message-ID:

On Fri, Dec 31, 2010 at 4:43 PM, Nick Coghlan wrote:
> On Sat, Jan 1, 2011 at 7:51 AM, Guido van Rossum wrote:
>> and of course for more fun you can make it more dynamic (think obfuscated code contests).
>
> Not to mention the champions of obfuscation for CPython: doing the same things from an extension module, or by using ctypes to invoke the C API (although such mechanisms are obviously outside the language definition itself, they're still technically legal for non-portable CPython code)

Hm. I wouldn't even call such things "legal" -- rather accidents of the implementation. If someone depended on such an effect, and we changed things to make that no longer work, good luck arguing that we violated a compatibility promise.

--
--Guido van Rossum (python.org/~guido)

From steve at pearwood.info Sat Jan 1 02:52:54 2011
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 01 Jan 2011 12:52:54 +1100
Subject: [Python-ideas] Optimizing builtins
In-Reply-To: References: Message-ID: <4D1E88F6.9000701@pearwood.info>

Guido van Rossum wrote:
> [Changed subject *and* list]
>
>> 2010/12/31 Maciej Fijalkowski
>>> How do you know that range is a builtin you're thinking about and not some other object?
>
> On Fri, Dec 31, 2010 at 7:02 AM, Cesare Di Mauro wrote:
>> By a special opcode which could do this work. ]:-)
>
> That can't be the answer, because then the question would become "how does the compiler know it can use the special opcode". This particular issue (generating special opcodes for certain builtins) has actually been discussed many times before. Alas, given Python's extremely dynamic promises it is very hard to do it in a way that is *guaranteed* not to change the semantics.

Just tossing ideas out here...
pardon me if they've been discussed before, but I read the three PEPs you mentioned later (266, 267 and 280) and they didn't cover any of this.

I wonder whether we need to make that guarantee? Perhaps we should distinguish between "safe" optimizations, like constant folding, which can't change behaviour, and "unsafe" optimizations which can go wrong under (presumably) rare circumstances. The compiler can continue to apply whatever safe optimizations it likes, but unsafe optimizations must be explicitly asked for by the user. If subtle or not-so-subtle bugs occur, well, Python does allow people to shoot themselves in the foot.

There's precedent for this. Both the -O and -OO optimization switches potentially change behaviour. -O *should* be safe if code only uses asserts for assertions, but many people (especially beginners) use assert for input checking. If their code breaks under -O they have nobody to blame but themselves. Might we not say that -OO will optimize access to builtins, and if things break, the solution is not to use -OO?

[...]

> Now, *in practice* such manipulations are rare (with the possible exception of people replacing open() with something providing hooks for e.g. a virtual filesystem) and there is probably some benefit to be had. (I expect that the biggest benefit might well be from replacing len() with an opcode.) I have in the past proposed to change the official semantics of the language subtly to allow such optimizations (i.e. recognizing builtins and replacing them with dedicated opcodes). There should also be a simple way to disable them, e.g. by setting "len = len" at the top of a module, one would be signalling that len() is not to be replaced by an opcode. But it remains messy and nobody has really gotten very far with implementing this. It is certainly not "low-hanging fruit" to do it properly.

Here's another thought... suppose (say) "builtin" became a reserved word.
builtin.range (for example) would always refer to the built-in range, and could be optimized by the compiler. It wouldn't do much for the general case of wanting to optimize non-built-in globals, but this could be optimized safely:

def f():
    for i in builtin.range(10): builtin.print(i)

while this would keep the current semantics:

def f():
    for i in range(10): print(i)

--
Steven

From cesare.di.mauro at gmail.com Sat Jan 1 10:32:30 2011
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Sat, 1 Jan 2011 10:32:30 +0100
Subject: [Python-ideas] Optimizing builtins
In-Reply-To: References: Message-ID:

2010/12/31 Guido van Rossum
> [Changed subject *and* list]
>
>> 2010/12/31 Maciej Fijalkowski
>>> How do you know that range is a builtin you're thinking about and not some other object?
>
> On Fri, Dec 31, 2010 at 7:02 AM, Cesare Di Mauro wrote:
>> By a special opcode which could do this work. ]:-)
>
> That can't be the answer, because then the question would become "how does the compiler know it can use the special opcode". This particular issue (generating special opcodes for certain builtins) has actually been discussed many times before. Alas, given Python's extremely dynamic promises it is very hard to do it in a way that is *guaranteed* not to change the semantics. For example, I could have replaced builtins['range'] with something else; or I could have inserted a variable named 'range' into the module's __dict__. (Note that I am not talking about just creating a global variable named 'range' in the module; those the compiler could recognize. I am talking about interceptions that a compiler cannot see, assuming it compiles each module independently, i.e. without whole-program optimizations.)

Yes, I know it, but the special opcode which I was talking about has a very different usage.
The primary goal was to speed up for loops, generating specialized code when the proper range builtin is found at runtime, and it's convenient to have such optimized code. As you stated, the compiler doesn't know if range is a builtin until runtime (at the precise moment of the for execution), so it'll generate two different code paths. The function's bytecode will look like this:

 0 SETUP_LOOP 62
 2 JUMP_IF_BUILTIN_OR_LOAD_GLOBAL 'range', 40
   # Usual, slow, code starts here
40 LOAD_CONSTS (4, 3, 2, 1, 0)    # Loads the tuple on the stack
44 LOAD_FAST_MORE_TIMES x, 5      # Loads x 5 times on the stack
46 LOAD_CONSTS (4, 3, 2, 1, 0)    # Loads the tuple on the stack
48 STACK_ZIP 3, 5                 # "zips" 3 sequences of 5 elements each on the stack
52 STORE_SUBSCR
54 STORE_SUBSCR
56 STORE_SUBSCR
58 STORE_SUBSCR
60 STORE_SUBSCR
62 POP_BLOCK
64 RETURN_CONST 'None'

It's just an example; the code can be different based on the compiler optimizations and opcodes available in the virtual machine. The most important thing is that the semantics will be preserved (I never intended to drop them! ;)

> Now, *in practice* such manipulations are rare (with the possible exception of people replacing open() with something providing hooks for e.g. a virtual filesystem) and there is probably some benefit to be had. (I expect that the biggest benefit might well be from replacing len() with an opcode.) I have in the past proposed to change the official semantics of the language subtly to allow such optimizations (i.e. recognizing builtins and replacing them with dedicated opcodes). There should also be a simple way to disable them, e.g. by setting "len = len" at the top of a module, one would be signalling that len() is not to be replaced by an opcode. But it remains messy and nobody has really gotten very far with implementing this. It is certainly not "low-hanging fruit" to do it properly.
> I should also refer people interested in this subject to at least three PEPs that were written about this topic: PEP 266, PEP 267 and PEP 280. All three have been deferred, since nobody was bold enough to implement at least one of them well enough to be able to tell if it was even worth the trouble.

I read them, and they are interesting, but my case is quite different.

> I haven't read either of those in a long time, and they may well be outdated by current JIT technology. I just want to warn folks that it's not such a simple matter to replace "for i in range(....):" with a special opcode.

Maybe trying to optimize the current non-JITed Python implementation is a dead end. JITs are evolving so much that all the things we have discussed here are already taken into account.

> (FWIW, optimizing "x[i] = i" would be much simpler -- I don't really care about the argument that a debugger might interfere. But again, apart from the simplest cases, it requires a sophisticated parser to determine that it is really safe to do so.)
>
> --
> --Guido van Rossum (python.org/~guido)

It depends strictly on the goals we want to reach. A more advanced parser with a simple type-inference system can be implemented without requiring a complete parser rebuild, and can give good (albeit not optimal) results. At least list, dictionary, tuple, and set operations, which are very common in Python, can benefit; something for ints, doubles and complexes can be done, too. But looking at the JITs, it may just be wasted time...

Cesare

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cesare.di.mauro at gmail.com Sat Jan 1 10:52:37 2011
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Sat, 1 Jan 2011 10:52:37 +0100
Subject: [Python-ideas] Optimizing builtins
In-Reply-To: <4D1E88F6.9000701@pearwood.info>
References: <4D1E88F6.9000701@pearwood.info>
Message-ID:

2011/1/1 Steven D'Aprano
> I wonder whether we need to make that guarantee? Perhaps we should distinguish between "safe" optimizations, like constant folding which can't change behaviour, and "unsafe" optimizations which can go wrong under (presumably) rare circumstances. The compiler can continue to apply whatever safe optimizations it likes, but unsafe optimizations must be explicitly asked for by the user. If subtle or not subtle bugs occur, well, Python does allow people to shoot themselves in the foot.

Do we consider local variable removal (due to internal optimizations) a safe or unsafe operation?

Do we consider local variable values "untouchable"? Think about a locals() call that returns a list for a variable; lists are mutable objects, so they can be changed by the caller, but the internally generated bytecode can work on a "private" (on-stack) copy which doesn't "see" the changes made due to the locals() call.

Also, there's tracing to consider. When tracing is enabled, the "handler" cannot find some variables due to some optimizations. Another funny thing that can happen is that if I "group together" some assignment operations into a single "multiassignment" one (it's another optimization I've been thinking about for a long time) and you are tracing it, only one tracing event will be generated instead of n.

Are such optimizations "legal" / "safe"? For me the answer is yes, because I think that they must be implementation-specific.

>> Now, *in practice* such manipulations are rare (with the possible exception of people replacing open() with something providing hooks for e.g.
>> a virtual filesystem) and there is probably some benefit to be had. (I expect that the biggest benefit might well be from replacing len() with an opcode.) I have in the past proposed to change the official semantics of the language subtly to allow such optimizations (i.e. recognizing builtins and replacing them with dedicated opcodes). There should also be a simple way to disable them, e.g. by setting "len = len" at the top of a module, one would be signalling that len() is not to be replaced by an opcode. But it remains messy and nobody has really gotten very far with implementing this. It is certainly not "low-hanging fruit" to do it properly.

> Here's another thought... suppose (say) "builtin" became a reserved word. builtin.range (for example) would always refer to the built-in range, and could be optimized by the compiler. It wouldn't do much for the general case of wanting to optimize non-built-in globals, but this could be optimized safely:
>
> def f():
>     for i in builtin.range(10): builtin.print(i)
>
> while this would keep the current semantics:
>
> def f():
>     for i in range(10): print(i)
>
> --
> Steven

I think that it's not needed. Optimizations must stay behind the scenes. We can speed up code which makes use of builtins without resorting to language changes. JITs already do this, but some approaches are possible even on non-JITed VMs. However, they require a longer parse / compile time, which can be undesirable.

Cesare

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From mrts.pydev at gmail.com Sat Jan 1 13:57:12 2011
From: mrts.pydev at gmail.com (=?ISO-8859-1?Q?Mart_S=F5mermaa?=)
Date: Sat, 1 Jan 2011 14:57:12 +0200
Subject: [Python-ideas] object construction (was: Re: ML Style Pattern Matching for Python)
In-Reply-To: <20101219204424.3630e62e@o>
References: <201012180021.37232.eike.welk@gmx.net> <201012181223.46790.eike.welk@gmx.net> <4D0D59EF.3070900@pearwood.info> <201012191952.30430.eike.welk@gmx.net> <20101219204424.3630e62e@o>
Message-ID:

On Sun, Dec 19, 2010 at 9:44 PM, spir wrote:
> I find those loads of "self.x=x" in constructors sooo stupid --I want the machine to do it for me. __init__ should only define the essential part of obj construction; while the final constructor would do some mechanical job in addition.

Automating that is quite easy with keyword arguments:

>>> class Foo(object):
...     def __init__(self, **kwargs):
...         self.__dict__.update(kwargs)
...
>>> f = Foo(a=1, b=2)
>>> f.a
1
>>> f.b
2

If you want to play safe, filter out keys that start with '__'.

Best regards and a happy new year!
Mart Sõmermaa

From denis.spir at gmail.com Sat Jan 1 15:43:21 2011
From: denis.spir at gmail.com (spir)
Date: Sat, 1 Jan 2011 15:43:21 +0100
Subject: [Python-ideas] Optimizing builtins
In-Reply-To: <4D1E88F6.9000701@pearwood.info>
References: <4D1E88F6.9000701@pearwood.info>
Message-ID: <20110101154321.5aadde38@o>

On Sat, 01 Jan 2011 12:52:54 +1100 Steven D'Aprano wrote:
> > Now, *in practice* such manipulations are rare (with the possible exception of people replacing open() with something providing hooks for e.g. a virtual filesystem) and there is probably some benefit to be had. (I expect that the biggest benefit might well be from replacing len() with an opcode.) I have in the past proposed to change the official semantics of the language subtly to allow such optimizations (i.e. recognizing builtins and replacing them with dedicated opcodes).
> > There should also be a simple way to disable them, e.g. by setting "len = len" at the top of a module, one would be signalling that len() is not to be replaced by an opcode. But it remains messy and nobody has really gotten very far with implementing this. It is certainly not "low-hanging fruit" to do it properly.

> Here's another thought... suppose (say) "builtin" became a reserved word. builtin.range (for example) would always refer to the built-in range, and could be optimized by the compiler. It wouldn't do much for the general case of wanting to optimize non-built-in globals, but this could be optimized safely:
>
> def f():
>     for i in builtin.range(10): builtin.print(i)
>
> while this would keep the current semantics:
>
> def f():
>     for i in range(10): print(i)

I had a similar question in a different context (non-Python). The point was to prevent core semantic changes in a "pedagogic" mode, such as the meaning of an operator on a builtin type. E.g. have Real.sum 'untouchable' so that 1.1+2.2 returns what is expected. Instead of protected keywords, my thought went toward read-only (immutable?) containers, where 'container' is a very loose notion including types and scopes that hold them (and even individual objects that can be generated without a type).

Denis
--
vit esse estrany -- spir.wikidot.com

From guido at python.org Sat Jan 1 17:41:35 2011
From: guido at python.org (Guido van Rossum)
Date: Sat, 1 Jan 2011 08:41:35 -0800
Subject: [Python-ideas] Optimizing builtins
In-Reply-To: References: Message-ID:

On Sat, Jan 1, 2011 at 5:11 AM, Maciej Fijalkowski wrote:
> On Sat, Jan 1, 2011 at 11:32 AM, Cesare Di Mauro wrote:
>> Yes, I know it, but the special opcode which I was talking about has a very different usage.
>> The primary goal was to speed-up fors, generating specialized code when the proper range builtin is found at runtime, and it's convenient to have such optimized code.
>> As you stated, the compiler don't know if range is a builtin until runtime (at the precise moment of for execution), so it'll generate two different code paths. The function's bytecode will look like that:
>>
>>  0 SETUP_LOOP 62
>>  2 JUMP_IF_BUILTIN_OR_LOAD_GLOBAL 'range', 40
>>    # Usual, slow, code starts here
>> 40 LOAD_CONSTS (4, 3, 2, 1, 0)    # Loads the tuple on the stack
>> 44 LOAD_FAST_MORE_TIMES x, 5      # Loads x 5 times on the stack
>> 46 LOAD_CONSTS (4, 3, 2, 1, 0)    # Loads the tuple on the stack
>> 48 STACK_ZIP 3, 5                 # "zips" 3 sequences of 5 elements each on the stack
>> 52 STORE_SUBSCR
>> 54 STORE_SUBSCR
>> 56 STORE_SUBSCR
>> 58 STORE_SUBSCR
>> 60 STORE_SUBSCR
>> 62 POP_BLOCK
>> 64 RETURN_CONST 'None'
>>
>> It's just an example; cde can be different based on the compiler optimizations and opcodes available in the virtual machine. The most important thing is that the semantic will be preserved (I never intended to drop it! ;)
>
> The thing is, having a JIT, this all is completely trivial (as well as bunch of other stuff like avoiding allocating ints at all).

Right. That's a much saner solution than trying to generate bulky bytecode as Cesare proposed. The advantage of a JIT is also that it allows doing these optimizations only in those places where it matters.

In general I am not much in favor of trying to optimize Python's bytecode. I prefer the bytecode to be dead simple. This probably makes it an easy target for CS majors interested in code generation, and it probably is a great exercise trying to do something like that, but let's please not confuse that with actual speed improvements to Python -- those come from careful observation (& instrumentation) of real programs, not from looking at toy bytecode samples. (Most of the bytecode improvements that actually made a difference were done in the first 5 years of Python's existence.)
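[Editor's note: to make the guard-plus-two-code-paths idea concrete for readers following along, here is a rough sketch at the Python level. Everything below is hypothetical illustration -- a real implementation would be an opcode inside the eval loop, not Python source -- and `sum_range` is a made-up name, not anything from the thread.]

```python
import builtins

# Snapshot the builtin at "compile time"; the guard compares against it later.
_builtin_range = builtins.range

def sum_range(n):
    # Guard: take the specialized path only while "range" still means the
    # builtin, i.e. it is neither shadowed in this module's globals nor
    # replaced in the builtins module.
    if builtins.range is _builtin_range and \
            globals().get("range", _builtin_range) is _builtin_range:
        # "Fast" path: the name lookup is hoisted out of the loop.
        total = 0
        for i in _builtin_range(n):
            total += i
        return total
    # Slow path: full dynamic lookup, preserving normal Python semantics.
    total = 0
    for i in range(n):
        total += i
    return total

print(sum_range(10))  # 45 with the builtin range
```

Replacing `builtins.range` (or shadowing it in the module) makes the guard fail, so the slow path runs and the replacement is honoured -- which is exactly the semantics-preserving property being debated above.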
> Generating two different code paths has a tendency to lead to code explosion (even exponential if you're not careful enough), which has its own set of problems (including cache locality, because code executed is no longer a small continuous chunk of memory). What we (PyPy) do is to compile only the common path (using the JIT) and then have the unlikely path fall back to the interpreter. This generally solves all of the nasty problems you can possibly encounter.

Great observation!

--
--Guido van Rossum (python.org/~guido)

From guido at python.org Sat Jan 1 17:50:08 2011
From: guido at python.org (Guido van Rossum)
Date: Sat, 1 Jan 2011 08:50:08 -0800
Subject: [Python-ideas] Optimizing builtins
In-Reply-To: <4D1E88F6.9000701@pearwood.info>
References: <4D1E88F6.9000701@pearwood.info>
Message-ID:

On Fri, Dec 31, 2010 at 5:52 PM, Steven D'Aprano wrote:
> Guido van Rossum wrote:
>> That can't be the answer, because then the question would become "how does the compiler know it can use the special opcode". This particular issue (generating special opcodes for certain builtins) has actually been discussed many times before. Alas, given Python's extremely dynamic promises it is very hard to do it in a way that is *guaranteed* not to change the semantics.
>
> Just tossing ideas out here... pardon me if they've been discussed before, but I read the three PEPs you mentioned later (266, 267 and 280) and they didn't cover any of this.
>
> I wonder whether we need to make that guarantee? Perhaps we should distinguish between "safe" optimizations, like constant folding which can't change behaviour,

(Though notice that our historic track record indicates that they are very dangerous -- we've introduced subtle bugs several times in "trivial" constant folding optimizations.)

> and "unsafe" optimizations which can go wrong under (presumably) rare circumstances.
> The compiler can continue to apply whatever safe optimizations it likes, but unsafe optimizations must be explicitly asked for by the user. If subtle or not subtle bugs occur, well, Python does allow people to shoot themselves in the foot.
>
> There's precedent for this. Both -O and -OO optimization switches potentially change behaviour. -O *should* be safe if code only uses asserts for assertions, but many people (especially beginners) use assert for input checking. If their code breaks under -O they have nobody to blame but themselves. Might we not say that -OO will optimize access to builtins, and if things break, the solution is not to use -OO?

Maybe. But that means it will probably rarely be used -- realistically, who uses -O or -OO? I don't ever. Even so, there would have to be a way to turn the optimization off even under -OO for a particular module or function or code location, or for a particular builtin (again, open() comes to mind).

> Here's another thought... suppose (say) "builtin" became a reserved word. builtin.range (for example) would always refer to the built-in range, and could be optimized by the compiler. It wouldn't do much for the general case of wanting to optimize non-built-in globals, but this could be optimized safely:
>
> def f():
>     for i in builtin.range(10): builtin.print(i)
>
> while this would keep the current semantics:
>
> def f():
>     for i in range(10): print(i)

That defaults the wrong way. You want the optimization to work (if the compiler does not see explicit reasons to avoid it) except in rare cases where the compiler cannot know that you're dynamically modifying the environment (globals or builtins). Also I would very much worry that people would start putting this in everywhere out of a mistaken defensive attitude. (Like what has happened to certain micro-optimization patterns, which are being overused, making code less readable.)
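[Editor's note: for anyone skimming the thread, the dynamism being discussed is easy to demonstrate. Nothing stops code in one module from rebinding a builtin out from under every other module, which is why the compiler cannot simply hard-wire len at compile time. A small illustration (plain CPython, the lambda is a deliberately silly stand-in):]

```python
import builtins

def measure(seq):
    return len(seq)  # the compiler cannot know what len will mean at call time

print(measure([1, 2, 3]))        # uses the real builtin

# Any module, anywhere, can rebind the builtin for already-compiled code:
original_len = builtins.len
builtins.len = lambda seq: 42    # a deliberately silly replacement
try:
    print(measure([1, 2, 3]))    # now resolves to the replacement
finally:
    builtins.len = original_len  # restore the real builtin

print(measure([1, 2, 3]))        # back to normal
```

This is exactly the kind of "interception that a compiler cannot see" mentioned earlier in the thread, and why any builtin-folding optimization needs either a runtime guard or an explicit opt-out.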
--
--Guido van Rossum (python.org/~guido)

From guido at python.org Sat Jan 1 17:59:57 2011
From: guido at python.org (Guido van Rossum)
Date: Sat, 1 Jan 2011 08:59:57 -0800
Subject: [Python-ideas] Optimizing builtins
In-Reply-To: References: <4D1E88F6.9000701@pearwood.info>
Message-ID:

On Sat, Jan 1, 2011 at 1:52 AM, Cesare Di Mauro wrote:
> 2011/1/1 Steven D'Aprano
>> I wonder whether we need to make that guarantee? Perhaps we should distinguish between "safe" optimizations, like constant folding which can't change behaviour, and "unsafe" optimizations which can go wrong under (presumably) rare circumstances. The compiler can continue to apply whatever safe optimizations it likes, but unsafe optimizations must be explicitly asked for by the user. If subtle or not subtle bugs occur, well, Python does allow people to shoot themselves in the foot.
>
> Do we consider local variable removing (due to internal optimizations) a safe or unsafe operation?

I would consider it safe unless the function locals() is called directly in the function -- always assuming the compiler does not see obvious other evidence (like a local being used by a nested function). Locals are "special" in many ways already.

There should be a way to disable this (globally) in case you want to step through the code with a debugger though -- IDEs like WingIDE and PyCharm make stepping through code very easy to set up, and variable inspection is a big part of the process of debugging this way. It's probably fine if such optimizations are only enabled by -O.

Still, I wonder if it isn't much better to try to do this using a JIT instead of messing with the bytecode. You'll find that the current compiler just really doesn't have the datastructures needed to do these kinds of optimizations right.

> Do we consider local variable values "untouchable"?
> Think about a locals() call that return a list for a variable; lists are mutable objects, so they can be changed by the caller, but the internally generated bytecode can work on a "private" (on stack) copy which doesn't "see" the changes made due to the locals() call.

Are you sure? locals() makes only a shallow copy, so changes to the list's contents made via the list returned by locals() should be completely visible by the bytecode.

> Also, there's the tracing to consider. When trace is enabled, the "handler" cannot find some variables due to some optimizations.

Tracing is a special case of debugging.

> Another funny thing that can happen is that if I "group together" some assignment operations into a single, "multiassignment", one (it's another optimization I was thinking about from long time) and you are tracing it, only one tracing event will be generated instead of n.
> Are such optimizations "legal" / "safe"? For me the answer is yes, because I think that they must be implementation-specific.

I've traced through C code generated by gcc with an optimization setting. It can be a bit confusing to be jumping around in the optimized code, and it's definitely easier to trace through unoptimized code, but if you have the choice between tracing the (optimized) binary you have, or not tracing at all, I'll take what I can get. Still, when you're planning to trace/debug it's better to have a flag to disable it, and not using -O sounds like the right thing to me.

--
--Guido van Rossum (python.org/~guido)

From tjreedy at udel.edu Sat Jan 1 21:30:42 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 01 Jan 2011 15:30:42 -0500
Subject: [Python-ideas] Optimizing builtins
In-Reply-To: References: Message-ID:

On 1/1/2011 11:41 AM, Guido van Rossum wrote:
> In general I am not much in favor of trying to optimize Python's bytecode. I prefer the bytecode to be dead simple.
I think people constantly underestimate the virtue of Python and CPython simplicity. Projects that depend on a couple of genius ubergeeks die when the ubergeeks leave. The executable-pseudocode simplicity of the language makes it a favorite for scientific programming, spilling over into financial programming. The simplicity of the code allows competent students (and non-CS major adults) to become developers.

--
Terry Jan Reedy

From benjamin at python.org Sat Jan 1 21:35:28 2011
From: benjamin at python.org (Benjamin Peterson)
Date: Sat, 1 Jan 2011 20:35:28 +0000 (UTC)
Subject: [Python-ideas] Optimizing builtins
References: Message-ID:

Guido van Rossum writes:
> The compiler has no way to notice this when a.py is being compiled.

You could still optimize it if you insert a runtime "guard" before the range usage and see if it's been overridden.

From guido at python.org Sat Jan 1 21:37:05 2011
From: guido at python.org (Guido van Rossum)
Date: Sat, 1 Jan 2011 12:37:05 -0800
Subject: [Python-ideas] Optimizing builtins
In-Reply-To: References: Message-ID:

On Sat, Jan 1, 2011 at 12:30 PM, Terry Reedy wrote:
> On 1/1/2011 11:41 AM, Guido van Rossum wrote:
>> In general I am not much in favor of trying to optimize Python's bytecode. I prefer the bytecode to be dead simple.
>
> I think people constantly underestimate the virtue of Python and CPython simplicity. Projects that depend on a couple of genius ubergeeks die when the ubergeeks leave. The executable-pseudocode simplicity of the language makes it a favorite for scientific programming, spilling over into financial programming. The simplicity of the code allows competent students (and non-CS major adults) to become developers.

And, of course, the (relative) simplicity of the implementation will always draw CS students looking for compiler optimization projects (just as the simplicity of the language draws CS students looking to write a complete compiler).
But it's one thing to get a degree out of some clever optimization; it's another thing to actually make it stick in the context of CPython, with the concerns you mention (and others I only have in my guts :-).

--
--Guido van Rossum (python.org/~guido)

From guido at python.org Sat Jan 1 21:37:59 2011
From: guido at python.org (Guido van Rossum)
Date: Sat, 1 Jan 2011 12:37:59 -0800
Subject: [Python-ideas] Optimizing builtins
In-Reply-To: References: Message-ID:

On Sat, Jan 1, 2011 at 12:35 PM, Benjamin Peterson wrote:
> Guido van Rossum writes:
>> The compiler has no way to notice this when a.py is being compiled.
>
> You could still optimize it if you insert a runtime "guard" before the range usage and see if its been overridden.

Yeah, that was Cesare's idea. I think that's a great strategy for a JIT compiler, but not appropriate for bytecode (IMO).

--
--Guido van Rossum (python.org/~guido)

From tjreedy at udel.edu Sat Jan 1 23:16:16 2011
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 01 Jan 2011 17:16:16 -0500
Subject: [Python-ideas] Optimizing builtins
In-Reply-To: References: Message-ID:

On 1/1/2011 3:37 PM, Guido van Rossum wrote:
> And, of course, the (relative) simplicity of the implementation will always draw CS students looking for compiler optimization projects

And, ironically, slightly reduce the simplicity that attracted them. No one thinks that their straw will break the camel's back (or cause him to drop to his knees), and they are usually right. But when the camel sags, all added straws are equally responsible.

> (just as the simplicity of the language draws CS students looking to write a complete compiler). But it's one thing to get a degree out of some clever optimization; it's another thing to actually make it stick in the context of CPython, with the concerns you mention (and others I only have in my guts :-).

For one thing, you have your eye on the camel ;-). And your current job keeps you grounded in the needs of real code.
(In a current python-list discussion, someone demonstrated with timeit that in late 2.x, each iteration of 'while 1: pass' takes about a microsecond less than for 'while True: pass'. The reason for that, and the disappearance of the difference in 3.x is mildly interesting, but the practical import for any real code that does anything inside the loop is essentially 0.) -- Terry Jan Reedy From cesare.di.mauro at gmail.com Sun Jan 2 07:07:21 2011 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Sun, 2 Jan 2011 07:07:21 +0100 Subject: [Python-ideas] Optimizing builtins In-Reply-To: References: Message-ID: 2011/1/1 Guido van Rossum > Right. That's a much saner solution than trying to generate bulky > bytecode as Cesare proposed. The advantage of a JIT is also that it > allows doing these optimizations only in those places where it > matters. > > In general I am not much in favor of trying to optimize Python's > bytecode. I prefer the bytecode to be dead simple. If Python direction is to embrace some JIT technology, I fully agree with you: it is best to make VM & compiler simpler. Anyway, and as I already said before, mine were just examples of possible things that can happen with optimizations. > This probably makes > it an easy target for CS majors interested in code generation, and it > probably is a great exercise trying to do something like that, but > let's please not confuse that with actual speed improvements to Python > -- those come from careful observation (& instrumentation) of real > programs, not from looking at toy bytecode samples. (Most of the > bytecode improvements that actually made a difference were done in the > first 5 years of Python's existence.) > > --Guido van Rossum (python.org/~guido) > But research never stops. SETUP_WITH is just a recent example. Also, sometimes completely different ideas can bring some innovation. ;) Cesare -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cesare.di.mauro at gmail.com Sun Jan 2 07:22:36 2011 From: cesare.di.mauro at gmail.com (Cesare Di Mauro) Date: Sun, 2 Jan 2011 07:22:36 +0100 Subject: [Python-ideas] Optimizing builtins In-Reply-To: References: <4D1E88F6.9000701@pearwood.info> Message-ID: 2011/1/1 Guido van Rossum > On Sat, Jan 1, 2011 at 1:52 AM, Cesare Di Mauro > > Do we consider local variable removing (due to internal optimizations) a > > safe or unsafe operation? > > I would consider it safe unless the function locals() is called > directly in the function -- always assuming the compiler does not see > obvious other evidence (like a local being used by a nested function). > The problem here is that locals is a builtin function, not a keyword, so the compiler must resort to something like the "code fork" that I showed before, if we want to keep the correct language semantic. > Still, I wonder if it isn't much better to try to do this using a JIT > instead of messing with the bytecode. Ditto for me, if a good (and not resource hungry) JIT will come. > You'll find that the current > compiler just really doesn't have the datastructures needed to do > these kind of optimizations right. > Right. Not now, but something can be made if and only if it makes sense. A JIT can make it non-sense, of course. > Do we consider local variable values "untouchable"? Think about a locals() > > call that return a list for a variable; lists are mutable objects, so > they > > can be changed by the caller, but the internally generated bytecode can > work > > on a "private" (on stack) copy which doesn't "see" the changes made due > to > > the locals() call. > > Are you sure? locals() makes only a shallow copy, so changes to the > list's contents made via the list returned by locals() should be > completely visible by the bytecode. > > --Guido van Rossum (python.org/~guido) > Nice to know it. Reading from the doc ( http://docs.python.org/library/functions.html#locals ) it was not clear for me. Thanks. 
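The shallow-copy behaviour of `locals()` that Guido describes above is easy to demonstrate (an illustrative snippet, not part of the original exchange):

```python
def demo():
    items = [1, 2, 3]
    snapshot = locals()          # shallow copy of the local namespace
    snapshot["items"].append(4)  # same list object: the mutation is visible
    snapshot["items"] = "other"  # rebinding in the copy does not rebind the local
    return items

print(demo())  # [1, 2, 3, 4]
```

So mutations made through the returned mapping are seen by the bytecode, while rebinding a key only changes the snapshot.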
Cesare -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan_ml at behnel.de Sun Jan 2 10:10:51 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 02 Jan 2011 10:10:51 +0100 Subject: [Python-ideas] Optimizing builtins In-Reply-To: References: Message-ID: Benjamin Peterson, 01.01.2011 21:35: > Guido van Rossum writes: >> The compiler has no way to notice this when a.py is being compiled. > > You could still optimize it if you insert a runtime "guard" before the range > usage and see if its been overridden. The problem here is that you wouldn't save the lookup. So you'd still pay a high price to find out that the builtin has not been overridden. There can be substantial savings for builtins that can be optimised away or replaced by a tighter/adapted implementation. We do that a lot in Cython where builtins are (by default) considered static unless redefined inside of the module. An important example are generator expressions like "any(genexpr)". If the function was known to be builtin at compile time, CPython could generate much simpler byte code for these, dropping the need for a generator and its closure. But as long as you have to check for an override at each call, you end up with the duplicated code (optimised and fall-back version) and an increased entry overhead that may well kill the savings. Stefan From stefan_ml at behnel.de Sun Jan 2 10:38:06 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Sun, 02 Jan 2011 10:38:06 +0100 Subject: [Python-ideas] Optimizing builtins In-Reply-To: References: <4D1E88F6.9000701@pearwood.info> Message-ID: Guido van Rossum, 01.01.2011 17:50: > On Fri, Dec 31, 2010 at 5:52 PM, Steven D'Aprano wrote: >> and "unsafe" optimizations which can go wrong under >> (presumably) rare circumstances. The compiler can continue to apply whatever >> safe optimizations it likes, but unsafe optimizations must be explicitly >> asked for by the user. 
>> If subtle or not subtle bugs occur, well, Python does
>> allow people to shoot themselves in the foot.
>>
>> There's precedent for this. Both -O and -OO optimization switches
>> potentially change behaviour. -O *should* be safe if code only uses asserts
>> for assertions, but many people (especially beginners) use assert for input
>> checking. If their code breaks under -O they have nobody to blame but
>> themselves. Might we not say that -OO will optimize access to builtins, and
>> if things break, the solution is not to use -OO?
>
> Maybe. But that means it will probably rarely be used --
> realistically, who uses -O or -OO? I don't ever. Even so, there would
> have to be a way to turn the optimization off even under -OO for a
> particular module or function or code location, or for a particular
> builtin (again, open() comes to mind).

If this ever happens, -O and -OO will no longer be expressive enough
(IMHO, -OO currently isn't anyway). There would be a need to support
options like "-Ostatic-builtins" and the like. The problem then is how
to keep users from applying a particular optimisation to a particular
module. New settings in distutils could help to enable optimisations
and maybe even to explicitly forbid optimisations, but life would
certainly become more error prone for distributors and users. It's hard
to keep track of the amount of bug reports and help requests we get
from (mostly new) Cython users about a missing "-fno-strict-aliasing"
when compiled modules don't work in Python 2. I expect the same to come
up when users start to install Python modules with all sorts of great
CPython optimisations. Test suites may well fail to catch the one bug
that an optimisation triggers.

>> Here's another thought... suppose (say) "builtin" became a reserved word.
>> builtin.range (for example) would always refer to the built-in range, and
>> could be optimized by the compiler.
>> It wouldn't do much for the general case
>> of wanting to optimize non-built-in globals, but this could be optimized
>> safely:
>>
>> def f():
>>     for i in builtin.range(10): builtin.print(i)
>>
>> while this would keep the current semantics:
>>
>> def f():
>>     for i in range(10): print(i)
>
> That defaults the wrong way.

My impression exactly, so I'm -1. But the trade-off behind this is:
complicating new code versus breaking old code (Python 3 classic).

Stefan

From cmjohnson.mailinglist at gmail.com Sun Jan 2 10:46:53 2011
From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson)
Date: Sat, 1 Jan 2011 23:46:53 -1000
Subject: [Python-ideas] Optimizing builtins
In-Reply-To: 
References: <4D1E88F6.9000701@pearwood.info>
Message-ID: 

On Sat, Jan 1, 2011 at 11:38 PM, Stefan Behnel wrote:
> If this ever happens, -O and -OO will no longer be expressive enough (IMHO,
> -OO currently isn't anyway). There would be a need to support options like
> "-Ostatic-builtins" and the like. The problem then is how to keep users from
> applying a particular optimisation to a particular module. New settings in
> distutils could help to enable optimisations and maybe even to explicitly
> forbid optimisations, but life would certainly become more error prone for
> distributors and users. It's hard to keep track of the amount of bug reports
> and help requests we get from (mostly new) Cython users about a missing
> "-fno-strict-aliasing" when compiled modules don't work in Python 2. I
> expect the same to come up when users start to install Python modules with
> all sorts of great CPython optimisations. Test suites may well fail to catch
> the one bug that an optimisation triggers.

If a flag wouldn't work, what about a pragma? Pragmas smell a bit
unpythonic to me, but we did have a pragma for source encoding and
unicode literals in Python 2, so it's not unprecedented.
How much would it solve efficiency-wise if you could just write at the top of a particular module ##ORIGINAL BUILTINS ONLY PLEASE ? Or is this the first step on the dark path to Perl 6? -- Carl Johnson From guido at python.org Sun Jan 2 17:10:44 2011 From: guido at python.org (Guido van Rossum) Date: Sun, 2 Jan 2011 08:10:44 -0800 Subject: [Python-ideas] Optimizing builtins In-Reply-To: References: <4D1E88F6.9000701@pearwood.info> Message-ID: > If a flag wouldn't work, what about a pragma? Pragma smell a bit > unpythonic to me, but we did have a pragma for source encoding and > unicode literals in Python 2, so it's not unprecedented. How much > would it solve efficiency-wise if you could just write at the top of a > particular module ##ORIGINAL BUILTINS ONLY PLEASE ? Or is this the > first step on the dark path to Perl 6? Again, this would encourage people to put such junk in every module they write, so it would lose its value. At this point in the thread I am tempted to propose an optimization moratorium, just to stop the flood of poorly-thought-through proposals. If you really want to make Python faster, don't waste your time in this thread. Go contribute to PyPy or Unladen Swallow. Or go fix the GIL, so we can use multiple cores. -- --Guido van Rossum (python.org/~guido) From fuzzyman at voidspace.org.uk Mon Jan 3 15:33:06 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 03 Jan 2011 14:33:06 +0000 Subject: [Python-ideas] Optimizing builtins In-Reply-To: References: Message-ID: <4D21DE22.3040409@voidspace.org.uk> On 31/12/2010 21:51, Guido van Rossum wrote: > On Fri, Dec 31, 2010 at 11:59 AM, Michael Foord > wrote: >> >> On 31 December 2010 18:49, Guido van Rossum wrote: >>> [Changed subject *and* list] >>> >>>> 2010/12/31 Maciej Fijalkowski >>>>> How do you know that range is a builtin you're thinking >>>>> about and not some other object? 
>>> On Fri, Dec 31, 2010 at 7:02 AM, Cesare Di Mauro >>> wrote: >>>> By a special opcode which could do this work. ]:-) >>> That can't be the answer, because then the question would become "how >>> does the compiler know it can use the special opcode". This particular >>> issue (generating special opcodes for certain builtins) has actually >>> been discussed many times before. Alas, given Python's extremely >>> dynamic promises it is very hard to do it in a way that is >>> *guaranteed* not to change the semantics. For example, I could have >>> replaced builtins['range'] with something else; or I could have >>> inserted a variable named 'range' into the module's __dict__. (Note >>> that I am not talking about just creating a global variable named >>> 'range' in the module; those the compiler could recognize. I am >>> talking about interceptions that a compiler cannot see, assuming it >>> compiles each module independently, i.e. without whole-program >>> optimizations.) >>> >>> Now, *in practice* such manipulations are rare >> Actually range is the one I've seen *most* overridden, not in order to >> replace functionality but because range is such a useful (or relevant) >> variable name in all sorts of circumstances... > No, you're misunderstanding. I was not referring to the overriding a > name using Python's regular syntax for defining names. If you set a > (global or local) variable named 'range', the compiler is perfectly > capable of noticing. E.g.: > > range = 42 > def foo(): > for i in range(10): print(i) > Right, in the same way the compiler notices local and global variable use and compiles different bytecode for lookups. It's just that accidentally overriding range is the source of my favourite "confusing Python error messages" story and I look for any opportunity to repeat it. A few years ago I worked for a company where most of the (very talented) developers were new to Python. 
They called me over to explain what "UnboundLocalError" meant and why they were getting it in what looked (to them) like perfectly valid code. The code looked something like: def something(start, stop): positions = range(start, stop) # more code here... range = process(positions) All the best, Michael Foord > While this will of course fail with a TypeError if you try to execute > it, a (hypothetical) optimizing compiler would have no trouble > noticing that the 'range' in the for-loop must refer to the global > variable of that name, not to the builtin of the same name. > > I was referring to an innocent module containing a use of the builtin > range function, e.g. > > # a.py > def f(): > for i in range(10): print(i) > > which is imported by another module which manipulates a's globals, for example: > > # b.py > import a > a.range = 42 > a.f() > > The compiler has no way to notice this when a.py is being compiled. > > Variants of "hiding" a mutation like this include: > > a.__dict__['range'] = 42 > > or > > import builtins > builtins.range = 42 > > and of course for more fun you can make it more dynamic (think > obfuscated code contests). > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From guido at python.org Mon Jan 3 19:09:07 2011 From: guido at python.org (Guido van Rossum) Date: Mon, 3 Jan 2011 10:09:07 -0800 Subject: [Python-ideas] Optimizing builtins In-Reply-To: <4D21DE22.3040409@voidspace.org.uk> References: <4D21DE22.3040409@voidspace.org.uk> Message-ID: On Mon, Jan 3, 2011 at 6:33 AM, Michael Foord wrote: > A few years ago I worked for a company where most of the (very talented) > developers were new to Python. They called me over to explain what > "UnboundLocalError" meant and why they were getting it in what looked (to > them) like perfectly valid code. 
The code looked something like:
>
> def something(start, stop):
>     positions = range(start, stop)
>
>     # more code here...
>
>     range = process(positions)

Yeah, and the really annoying thing for us old-timers is that this used
to work (in Python 1.0 or so :-). Once upon a time, looking up locals
was as dynamic as looking up globals and builtins is today. Still, I
think for optimizing builtins we can do slightly better.

-- 
--Guido van Rossum (python.org/~guido)

From rich at noir.com Mon Jan 3 22:09:14 2011
From: rich at noir.com (K. Richard Pixley)
Date: Mon, 03 Jan 2011 13:09:14 -0800
Subject: [Python-ideas] @classproperty, @abc.abstractclasspropery, etc.
Message-ID: <4D223AFA.1010802@noir.com>

There's a whole matrix of these and I'm wondering why the matrix is
currently sparse rather than implementing them all. Or rather, why we
can't stack them as:

class foo(object):
    @classmethod
    @property
    def bar(cls, ...):
        ...

Essentially the permutations are, I think:
{'unadorned'|abc.abstract}{'normal'|static|class}{method|property|non-callable attribute}.
concreteness | implicit first arg | type                   | name                                               | comments
-------------|--------------------|------------------------|----------------------------------------------------|------------
{unadorned}  | {unadorned}        | method                 | def foo():                                         | exists now
{unadorned}  | {unadorned}        | property               | @property                                          | exists now
{unadorned}  | {unadorned}        | non-callable attribute | x = 2                                              | exists now
{unadorned}  | static             | method                 | @staticmethod                                      | exists now
{unadorned}  | static             | property               | @staticproperty                                    | proposing
{unadorned}  | static             | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
{unadorned}  | class              | method                 | @classmethod                                       | exists now
{unadorned}  | class              | property               | @classproperty or @classmethod;@property           | proposing
{unadorned}  | class              | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
abc.abstract | {unadorned}        | method                 | @abc.abstractmethod                                | exists now
abc.abstract | {unadorned}        | property               | @abc.abstractproperty                              | exists now
abc.abstract | {unadorned}        | non-callable attribute | @abc.abstractattribute or @abc.abstract;@attribute | proposing
abc.abstract | static             | method                 | @abc.abstractstaticmethod                          | exists now
abc.abstract | static             | property               | @abc.abstractstaticproperty                        | proposing
abc.abstract | static             | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary
abc.abstract | class              | method                 | @abc.abstractclassmethod                           | exists now
abc.abstract | class              | property               | @abc.abstractclassproperty                         | proposing
abc.abstract | class              | non-callable attribute | {degenerate case - variables don't have arguments} | unnecessary

I think the meanings of the new ones are pretty straightforward, but in
case they are not...

@staticproperty - like @property only without an implicit first
argument. Allows the property to be called directly from the class
without requiring a throw-away instance.

@classproperty - like @property, only the implicit first argument to
the method is the class. Allows the property to be called directly from
the class without requiring a throw-away instance.
@abc.abstractattribute - a simple, non-callable variable that must be
overridden in subclasses

@abc.abstractstaticproperty - like @abc.abstractproperty only for
@staticproperty

@abc.abstractclassproperty - like @abc.abstractproperty only for
@classproperty

--rich
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dirkjan at ochtman.nl Mon Jan 3 22:22:14 2011
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Mon, 3 Jan 2011 22:22:14 +0100
Subject: [Python-ideas] @classproperty, @abc.abstractclasspropery, etc.
In-Reply-To: <4D223AFA.1010802@noir.com>
References: <4D223AFA.1010802@noir.com>
Message-ID: 

On Mon, Jan 3, 2011 at 22:09, K. Richard Pixley wrote:
> I think the meanings of the new ones are pretty straightforward, but in case they are not...
>
> @staticproperty - like @property only without an implicit first argument. Allows the property to be called directly from the class without requiring a throw-away instance.
>
> @classproperty - like @property, only the implicit first argument to the method is the class. Allows the property to be called directly from the class without requiring a throw-away instance.
>
> @abc.abstractattribute - a simple, non-callable variable that must be overridden in subclasses
>
> @abc.abstractstaticproperty - like @abc.abstractproperty only for @staticproperty
>
> @abc.abstractclassproperty - like @abc.abstractproperty only for @classproperty

Do you have actual use cases for these?

Cheers,

Dirkjan

From rich at noir.com Mon Jan 3 23:06:35 2011
From: rich at noir.com (K. Richard Pixley)
Date: Mon, 03 Jan 2011 14:06:35 -0800
Subject: [Python-ideas] @classproperty, @abc.abstractclasspropery, etc.
In-Reply-To: 
References: <4D223AFA.1010802@noir.com>
Message-ID: <4D22486B.1090607@noir.com>

On 20110103 13:22, Dirkjan Ochtman wrote:
> On Mon, Jan 3, 2011 at 22:09, K. Richard Pixley wrote:
>> I think the meanings of the new ones are pretty straightforward, but in case they are not...
>>
>> @staticproperty - like @property only without an implicit first argument. Allows the property to be called directly from the class without requiring a throw-away instance.
>>
>> @classproperty - like @property, only the implicit first argument to the method is the class. Allows the property to be called directly from the class without requiring a throw-away instance.
>>
>> @abc.abstractattribute - a simple, non-callable variable that must be overridden in subclasses
>>
>> @abc.abstractstaticproperty - like @abc.abstractproperty only for @staticproperty
>>
>> @abc.abstractclassproperty - like @abc.abstractproperty only for @classproperty
>
> Do you have actual use cases for these?

Yes. Here's a toy example for abstractclassproperty:

class InstanceKeeper(object):
    __metaclass__ = abc.ABCMeta

    @abc.abstractclassproperty
    def all_instances(cls):
        raise NotImplementedError

class InstancesByList(InstanceKeeper):
    instances = []

    @classproperty
    def all_instances(cls):
        return cls.instances

class InstancesByDict(InstanceKeeper):
    instances = {}

    @classproperty
    def all_instances(cls):
        return list(cls.instances)

class WhateversByList(InstancesByList):
    instances = []
    ...

class OthersByList(InstancesByList):
    instances = []
    ...

class StillMoreByDict(InstancesByDict):
    instances = {}
    ...

class MoreAgainByDict(InstancesByDict):
    instances = {}
    ...

I'm working on a library for reading and writing ELF format object
files. I have a bunch of classes representing various structs. And the
structs have, or point to, other structs. I'm using different
subclasses to describe different byte ordering, (endianness), and word
size, (32 vs 64 bit). Here are examples for the others.

class Codable(object):
    __metaclass__ = abc.ABCMeta

    @abc.abstractattribute
    coder = None

    @classproperty
    def size(cls):
        return cls.coder.size

class FileHeader(Codable):
    __metaclass__ = abc.ABCMeta

    @abc.abstractattribute
    sectionHeaderClass = None
    """ Used to create new instances. """

    sectionHeader = None

    @abc.abstractstaticproperty
    def word_size():
        raise NotImplementedError

    def __new__(...):
        """ factory function reading the first few bytes and returning
        an instance of one of the subclasses """
        ...

    def __init__(self, ...):
        ...
        self.sectionHeader = self.sectionHeaderClass(...)

class Bit64(object):
    @staticproperty
    def word_size():
        return 64

class Bit32(object):
    @staticproperty
    def word_size():
        return 32

class FileHeader64l(FileHeader, Bit64):
    coder = struct.Struct(...)
    sectionHeaderClass = SectionHeader64l

class FileHeader64b(FileHeader, Bit64):
    coder = struct.Struct(...)
    sectionHeaderClass = SectionHeader64b

class FileHeader32l(FileHeader, Bit32):
    coder = struct.Struct(...)
    sectionHeaderClass = SectionHeader32l

class FileHeader32b(FileHeader, Bit32):
    coder = struct.Struct(...)
    sectionHeaderClass = SectionHeader32b

class SectionHeader(Codable):
    __metaclass__ = abc.ABCMeta

    @abc.abstractattribute
    subsectionHeaderClass = None
    ...

class SectionHeader64l(SectionHeader, Bit64):
    coder = struct.Struct(...)
    ...

--rich

From rich at noir.com Mon Jan 3 23:18:42 2011
From: rich at noir.com (K. Richard Pixley)
Date: Mon, 03 Jan 2011 14:18:42 -0800
Subject: [Python-ideas] read-only attributes
Message-ID: <4D224B42.5070204@noir.com>

It seems to me that one of the more common reasons for using @property
is to create a read-only attribute. I wonder if it would make sense to
simply create a read-only decorator.
Compare:

class Foo(object):
    _size = 4

    @property
    def size(self):
        return self._size

against:

class Foo(object):
    @read-only
    size = 4

This gets more interesting if decorators nest:

class Foo(object):
    __metaclass__ = abc.ABCMeta

    @abstract
    @classattribute
    @read-only
    size = None

--rich

From solipsis at pitrou.net Mon Jan 3 23:31:11 2011
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 3 Jan 2011 23:31:11 +0100
Subject: [Python-ideas] read-only attributes
References: <4D224B42.5070204@noir.com>
Message-ID: <20110103233111.034660c3@pitrou.net>

On Mon, 03 Jan 2011 14:18:42 -0800
"K. Richard Pixley" wrote:
>
> class Foo(object):
>     @read-only
>     size = 4

What's wrong with:

>>> class Foo:
...     @property
...     def foo(self):
...         return 4
...
>>> Foo().foo
4
>>> Foo().foo = 5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: can't set attribute

?

From guido at python.org Tue Jan 4 01:56:05 2011
From: guido at python.org (Guido van Rossum)
Date: Mon, 3 Jan 2011 16:56:05 -0800
Subject: [Python-ideas] @classproperty, @abc.abstractclasspropery, etc.
In-Reply-To: <4D22486B.1090607@noir.com>
References: <4D223AFA.1010802@noir.com> <4D22486B.1090607@noir.com>
Message-ID: 

> On 20110103 13:22, Dirkjan Ochtman wrote:
>> Do you have actual use cases for these?

On Mon, Jan 3, 2011 at 2:06 PM, K. Richard Pixley wrote:
> Yes. Here's a toy example for abstractclassproperty:

Um, a toy example is pretty much the opposite of a use case. :-(

That said, I am sure there are use cases for static property and class
property -- I've run into them myself.

An example use case for class property: in App Engine, we have a Model
class (it's similar to Django's Model class). A model has a "kind"
which is a string (it's the equivalent of an SQL table name). The kind
is usually the class name but sometimes there's a need to override it.
However once the class is defined it should be considered read-only.
Currently our choices are to make this an instance property (but there are some situations where we don't have an instance, e.g. when creating a new instance using a class method); or to make it a class attribute (but this isn't read-only); or to make it a class method (which requires the user to write M.kind() instead of M.kind). If I had class properties I'd use one here. -- --Guido van Rossum (python.org/~guido) From cool-rr at cool-rr.com Tue Jan 4 03:28:25 2011 From: cool-rr at cool-rr.com (cool-RR) Date: Tue, 4 Jan 2011 04:28:25 +0200 Subject: [Python-ideas] An improved `ContextManager` Message-ID: Hello folks. Ever since Michael Foord talked about `ContextDecorator` in python-ideas I've been kicking around an idea for my own take on it. It's a `ContextManager` class which provides the same thing that Foord's `ContextDecorator` does, but also provides a few more goodies, chief of which being the `manage_context` method. I've been working on this for a few days and I think it's ready for review. It's well-tested and extensively documented. I started using it wherever I have context managers in GarlicSim. I'll be happy to get your opinions on my approach and any critiques you may have. If there are no problems with this approach, I'll probably release it with GarlicSim 0.6.1 and blog about it. Here is my `context_manager` module. Here are its tests . Following is the module's docstring which explains the module in more detail. Ram. Defines the `ContextManager` and `ContextManagerType` classes. These classes allow for greater freedom both when (a) defining context managers and when (b) using them. Inherit all your context managers from `ContextManager` (or decorate your generator functions with `ContextManagerType`) to enjoy all the benefits described below. Defining context managers ------------------------- There are 3 different ways in which context managers can be defined, and each has their own advantages and disadvantages over the others. 1. 
The classic way to define a context manager is to define a class with
`__enter__` and `__exit__` methods. This is allowed, and if you do this
you should still inherit from `ContextManager`. Example:

    class MyContextManager(ContextManager):
        def __enter__(self):
            pass # preparation
        def __exit__(self, type_=None, value=None, traceback=None):
            pass # cleanup

2. As a decorated generator, like so:

    @ContextManagerType
    def MyContextManager():
        try:
            yield
        finally:
            pass # clean-up

This usage is nothing new; it's also available when using the standard
library's `contextlib.contextmanager` decorator. One thing that is
allowed here that `contextlib` doesn't allow is to yield the context
manager itself by doing `yield SelfHook`.

3. The third and novel way is by defining a class with a
`manage_context` method, which is a generator. Example:

    class MyContextManager(ContextManager):
        def manage_context(self):
            do_some_preparation()
            try:
                with some_lock:
                    yield self
            finally:
                do_some_cleanup()

This approach is sometimes cleaner than defining `__enter__` and
`__exit__`; especially when using another context manager inside
`manage_context`. In our example we did `with some_lock` in our
`manage_context`, which is shorter and more idiomatic than calling
`some_lock.__enter__` in an `__enter__` method and `some_lock.__exit__`
in an `__exit__` method.

These were the different ways of *defining* a context manager. Now
let's see the different ways of *using* a context manager:

Using context managers
----------------------

There are 2 different ways in which context managers can be used:

1. The plain old honest-to-Guido `with` keyword:

    with MyContextManager() as my_context_manager:
        do_stuff()

2. As a decorator to a function:

    @MyContextManager()
    def do_stuff():
        pass # doing stuff

When the `do_stuff` function is called, the context manager will be used.
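For comparison, here is a runnable sketch of this decorate-a-function pattern built only on the standard library's `contextlib.ContextDecorator`, not on the `ContextManager` class described here (the `tracked` class and its `events` attribute are made-up illustration names):

```python
from contextlib import ContextDecorator

class tracked(ContextDecorator):
    """A context manager usable both in a `with` statement and as a decorator."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.events.append("exit")
        return False  # do not swallow exceptions

t = tracked()

@t
def do_stuff():
    t.events.append("body")

do_stuff()  # the call is wrapped in enter/exit
print(t.events)  # ['enter', 'body', 'exit']
```

Each call re-enters the same context manager instance, since `ContextDecorator._recreate_cm()` returns `self` by default.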
This functionality is also available in the standard library of Python 3.2+ by using `contextlib.ContextDecorator`, but here it is combined with all the other goodies given by `ContextManager`. That's it. Inherit all your context managers from `ContextManager` (or decorate your generator functions with `ContextManagerType`) to enjoy all these benefits. -- Sincerely, Ram Rachum -------------- next part -------------- An HTML attachment was scrubbed... URL: From scott+python-ideas at scottdial.com Tue Jan 4 03:13:02 2011 From: scott+python-ideas at scottdial.com (Scott Dial) Date: Mon, 03 Jan 2011 21:13:02 -0500 Subject: [Python-ideas] read-only attributes In-Reply-To: <20110103233111.034660c3@pitrou.net> References: <4D224B42.5070204@noir.com> <20110103233111.034660c3@pitrou.net> Message-ID: <4D22822E.8020009@scottdial.com> On 1/3/2011 5:31 PM, Antoine Pitrou wrote: > On Mon, 03 Jan 2011 14:18:42 -0800 > "K. Richard Pixley" wrote: >> >> class Foo(object): >> @read-only >> size = 4 > > What's wrong with: > >>>> class Foo: > ... @property > ... def foo(self): > ... return 4 > ... >>>> Foo().foo > 4 >>>> Foo().foo = 5 > Traceback (most recent call last): > File "", line 1, in > AttributeError: can't set attribute > > ? > s/4/Bar()/: >>> class Foo: ... @property ... def foo(self): ... return Bar() -- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From fuzzyman at voidspace.org.uk Tue Jan 4 12:00:16 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 4 Jan 2011 11:00:16 +0000 Subject: [Python-ideas] @classproperty, @abc.abstractclasspropery, etc. In-Reply-To: References: <4D223AFA.1010802@noir.com> <4D22486B.1090607@noir.com> Message-ID: On 4 January 2011 00:56, Guido van Rossum wrote: > > On 20110103 13:22, Dirkjan Ochtman wrote: > >> Do you have actual use cases for these? > > On Mon, Jan 3, 2011 at 2:06 PM, K. Richard Pixley wrote: > > Yes. 
Here's a toy example for abstractclassproperty:
>
> Um, a toy example is pretty much the opposite of a use case. :-(
>
> That said, I am sure there are use cases for static property and class
> property -- I've run into them myself.
>
> An example use case for class property: in App Engine, we have a Model
> class (it's similar to Django's Model class). A model has a "kind"
> which is a string (it's the equivalent of an SQL table name). The kind
> is usually the class name but sometimes there's a need to override it.
> However once the class is defined it should be considered read-only.
> Currently our choices are to make this an instance property (but there
> are some situations where we don't have an instance, e.g. when
> creating a new instance using a class method); or to make it a class
> attribute (but this isn't read-only); or to make it a class method
> (which requires the user to write M.kind() instead of M.kind). If I
> had class properties I'd use one here.

A class property that can be fetched is very easy to implement. Because
of asymmetry in the descriptor protocol I don't think you can create a
class property with behaviour on set though (unless you use a metaclass
I guess).

class classproperty(object):
    def __init__(self, function):
        self._function = function

    def __get__(self, instance, owner):
        return self._function(owner)

All the best,

Michael

> --
> --Guido van Rossum (python.org/~guido )
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

-- 
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing
http://www.sqlite.org/different.html

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From ncoghlan at gmail.com Tue Jan 4 15:16:55 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 5 Jan 2011 00:16:55 +1000 Subject: [Python-ideas] @classproperty, @abc.abstractclasspropery, etc. In-Reply-To: <4D223AFA.1010802@noir.com> References: <4D223AFA.1010802@noir.com> Message-ID: On Tue, Jan 4, 2011 at 7:09 AM, K. Richard Pixley wrote: > I think the meanings of the new ones are pretty straightforward, but in > case they are not... > > @staticproperty - like @property only without an implicit first argument. > Allows the property to be called directly from the class without requiring a > throw-away instance. > > @classproperty - like @property, only the implicit first argument to the > method is the class. Allows the property to be called directly from the > class without requiring a throw-away instance. > As Michael mentions later in the thread, these can't really work due to the asymmetry in the descriptor protocol: if you retrieve a descriptor object directly from a class, the interpreter will consult the __get__ method of that descriptor, but if you set or delete it through the class, it will just perform the set or delete - the descriptor has no say in the matter, even if it defines __set__ or __delete__ methods. (See the example interpreter session at http://pastebin.com/1M7KYB9d). The only way to get static or class properties to work correctly is to define them on the metaclass, in which case you can just use the existing property descriptor (although you have to then jump through additional hoops to make access via instances work properly - off the top of my head, I'm actually not sure how to make that happen). @abc.abstractattribute - a simple, non-callable variable that must be > overridden in subclasses > You can't decorate attributes, only functions. > @abc.abstractstaticproperty - like @abc.abstractproperty only for > @staticproperty > > @abc.abstractclassproperty - like @abc.abstractproperty only for > @classproperty > See above. 
These don't exist because staticproperty and classproperty don't work. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikegraham at gmail.com Tue Jan 4 16:23:52 2011 From: mikegraham at gmail.com (Mike Graham) Date: Tue, 4 Jan 2011 10:23:52 -0500 Subject: [Python-ideas] @classproperty, @abc.abstractclasspropery, etc. In-Reply-To: <4D223AFA.1010802@noir.com> References: <4D223AFA.1010802@noir.com> Message-ID: On Mon, Jan 3, 2011 at 4:09 PM, K. Richard Pixley wrote: > Essentially the permutation are, I think: > {'unadorned'|abc.abstract}{'normal'|static|class}{method|property|non-callable > attribute}. > > At the abstract level, a property and a normal, non-callable attribute are the same thing. Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: From mikegraham at gmail.com Tue Jan 4 16:31:21 2011 From: mikegraham at gmail.com (Mike Graham) Date: Tue, 4 Jan 2011 10:31:21 -0500 Subject: [Python-ideas] @classproperty, @abc.abstractclasspropery, etc. In-Reply-To: References: <4D223AFA.1010802@noir.com> <4D22486B.1090607@noir.com> Message-ID: On Mon, Jan 3, 2011 at 7:56 PM, Guido van Rossum wrote: > That said, I am sure there are use cases for static property and class > property -- I've run into them myself. > > An example use case for class property: in App Engine, we have a Model > class (it's similar to Django's Model class). A model has a "kind" > which is a string (it's the equivalent of an SQL table name). The kind > is usually the class name but sometimes there's a need to override it. > However once the class is defined it should be considered read-only. > Currently our choices are to make this an instance property (but there > are some situations where we don't have an instance, e.g. 
when > creating a new instance using a class method); or to make it a class > attribute (but this isn't read-only); or to make it a class method > (which requires the user to write M.kind() instead of M.kind). If I > had class properties I'd use one here. > > -- > --Guido van Rossum (python.org/~guido) This attitude seems to go against the we're-all-adults-here attitude that Python, for better or worse, really wants us to take. If we want to turn "there's no good reason to do X; it's pointless and you'd have to be insane to try, and I even documented that you can't do it" into "it's programmatically enforced that you can't do X", it seems like we should be migrating to a language that enforces this with nicer syntax, more fidelity, and less overhead. Mike From fuzzyman at voidspace.org.uk Tue Jan 4 18:27:26 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Tue, 4 Jan 2011 17:27:26 +0000 Subject: [Python-ideas] @classproperty, @abc.abstractclasspropery, etc. In-Reply-To: References: <4D223AFA.1010802@noir.com> <4D22486B.1090607@noir.com> Message-ID: On 4 January 2011 15:31, Mike Graham wrote: > On Mon, Jan 3, 2011 at 7:56 PM, Guido van Rossum wrote: > > That said, I am sure there are use cases for static property and class > > property -- I've run into them myself. > > > > An example use case for class property: in App Engine, we have a Model > > class (it's similar to Django's Model class). A model has a "kind" > > which is a string (it's the equivalent of an SQL table name). The kind > > is usually the class name but sometimes there's a need to override it. > > However once the class is defined it should be considered read-only. > > Currently our choices are to make this an instance property (but there > > are some situations where we don't have an instance, e.g.
when > > creating a new instance using a class method); or to make it a class > > attribute (but this isn't read-only); or to make it a class method > > (which requires the user to write M.kind() instead of M.kind). If I > > had class properties I'd use one here. > > > I'm not entirely sure what you're referring to here - but if you're referring to the desire to make an attribute read-only then there is a different principle at work. If setting something is a programmer error, then it is better that the error become an exception at the point the error is made rather than become a different exception somewhere else later on. Michael Foord > > -- > > --Guido van Rossum (python.org/~guido ) > > This attitude seems to go against the we're-all-adults-here attitude > that Python, for better or worse, really wants us to take. If we want > to turn "there's no good reason to do X; it's pointless and you'd have > to be insane to try, and I even documented that you can't do it" into > "it's programmability enforced that you can't do X", it seems like we > should be migrating to a language that enforces this with nicer > syntax, more fidelity, and less overhead. > > Mike > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From rich at noir.com Tue Jan 4 18:44:30 2011 From: rich at noir.com (K. Richard Pixley) Date: Tue, 04 Jan 2011 09:44:30 -0800 Subject: [Python-ideas] @classproperty, @abc.abstractclasspropery, etc. 
In-Reply-To: References: <4D223AFA.1010802@noir.com> Message-ID: <4D235C7E.7030401@noir.com> On 1/4/11 07:23 , Mike Graham wrote: > On Mon, Jan 3, 2011 at 4:09 PM, K. Richard Pixley > wrote: > > Essentially the permutation are, I think: > {'unadorned'|abc.abstract}{'normal'|static|class}{method|property|non-callable > attribute}. > > > At the abstract level, a property and a normal, non-callable attribute > are the same thing. They are from the instantiation perspective but not from the subclassing perspective. From the subclassing perspective, it's the difference between: class Foo(object): @property def bar(self): ... and: class Foo(object): bar = ... If an abstract property were to be answered by a simple assignment, then the "read-only" trait would be lost. --rich -------------- next part -------------- An HTML attachment was scrubbed... URL: From rich at noir.com Tue Jan 4 18:50:42 2011 From: rich at noir.com (K. Richard Pixley) Date: Tue, 04 Jan 2011 09:50:42 -0800 Subject: [Python-ideas] @classproperty, @abc.abstractclasspropery, etc. In-Reply-To: References: <4D223AFA.1010802@noir.com> <4D22486B.1090607@noir.com> Message-ID: <4D235DF2.5060000@noir.com> On 1/4/11 07:31 , Mike Graham wrote: > On Mon, Jan 3, 2011 at 7:56 PM, Guido van Rossum wrote: >> That said, I am sure there are use cases for static property and class >> property -- I've run into them myself. >> >> An example use case for class property: in App Engine, we have a Model >> class (it's similar to Django's Model class). A model has a "kind" >> which is a string (it's the equivalent of an SQL table name). The kind >> is usually the class name but sometimes there's a need to override it. >> However once the class is defined it should be considered read-only. >> Currently our choices are to make this an instance property (but there >> are some situations where we don't have an instance, e.g. 
when >> creating a new instance using a class method); or to make it a class >> attribute (but this isn't read-only); or to make it a class method >> (which requires the user to write M.kind() instead of M.kind). If I >> had class properties I'd use one here. >> >> -- >> --Guido van Rossum (python.org/~guido) > This attitude seems to go against the we're-all-adults-here attitude > that Python, for better or worse, really wants us to take. If we want > to turn "there's no good reason to do X; it's pointless and you'd have > to be insane to try, and I even documented that you can't do it" into > "it's programmability enforced that you can't do X", it seems like we > should be migrating to a language that enforces this with nicer > syntax, more fidelity, and less overhead. It's not the restriction that I'm looking for - it's the expressive grace. These concepts are pretty straightforward given the beginnings of them that we have now. Filling out the matrix is a pretty obvious concept. The idea that while the concepts are available to anyone, straightforward, and recur, but can only be implemented by someone with extremely current and advanced knowledge of the interpreter, resulting in code which is less transparent rather than more, is the restrictive idea. --rich From guido at python.org Tue Jan 4 23:52:39 2011 From: guido at python.org (Guido van Rossum) Date: Tue, 4 Jan 2011 14:52:39 -0800 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> Message-ID: Hmm... I starred this and am finally dug out enough to comment. Would it be sufficient if the __module__ attribute of classes and functions got set to the "canonical" name rather than the "physical" name? You can currently get a crude version of this by simply assigning to __name__ at the top of the module. 
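For concreteness, that "crude version" can be sketched by simulating a module body with exec() — the module names below are invented, and real code would simply put the assignment at the top of a .py file:

```python
# Class and function objects pick up __module__ from the value of
# __name__ in the enclosing module globals at definition time, so
# overriding __name__ early changes what everything defined afterwards
# reports as its home module.
import types

body = """
__name__ = 'mypkg.api'        # pretend "canonical" name (invented)

class Widget:
    pass

def make():
    return Widget()
"""

mod = types.ModuleType('mypkg._impl')   # the "physical" name (invented)
exec(body, mod.__dict__)

print(mod.Widget.__module__)   # -> mypkg.api
print(mod.make.__module__)     # -> mypkg.api
```

Both the class and the function report the overridden name, because each reads __name__ out of the module globals at the moment it is created.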
That sounds like it would be too confusing, however, so perhaps we could make it so that, when the __module__ attribute is initialized, it first looks for __canonical__ and then for __name__? This may still be too crude though -- I looked at the one example I could think of where this might be useful, the unittest package, and realized that it would set __module__ to 'unittest' even for classes that are not actually re-exported via the unittest namespace. So maybe it would be better in that case to just patch the __module__ attribute of all the public classes in unittest/__init__.py? OTOH for things named __main__, setting __canonical__ (automatically, by -m or whatever other mechanism starts execution, like "python " might actually work. On the third hand, maybe you've finally hit upon a reason why the "if __name__ == '__main__': main()" idiom is bad... --Guido On Thu, Dec 30, 2010 at 6:52 PM, Nick Coghlan wrote: > On Thu, Dec 30, 2010 at 11:48 AM, Ron Adam wrote: >> This sounds like two different separate issues to me. >> >> One is the leaking-out of lower level details. >> >> The other is abstracting a framework with the minimal amount of details >> needed. > > Yeah, sort of. Really, the core issue is that some objects live in two places: > - where they came from right now, in the current interpreter > - where they should be retrieved from "officially" (e.g. since another > interpreter may not provide an accelerated version, or because the > appropriate submodule may be selected at runtime based on the current > platform) > > There's currently no systematic way of flagging objects or modules > where the latter location differs from the former, so the components > that leak the low level details (such as pickling and pydoc) have no > way to avoid it. Once a system is in place to identify such objects > (or perhaps just the affected modules), then the places that leak that > information can be updated to deal with the situation appropriately > (e.g.
pickling would likely just use the official names, while pydoc > would display both, indicating which one was the 'official' location, > and which one reflected the current interpreter behaviour). > > So it's really one core problem (non-portable module details), which > then leads to an assortment of smaller problems when other parts of > the standard library are forced to rely on those non-portable details > because that's the only information available. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Wed Jan 5 02:55:04 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 5 Jan 2011 11:55:04 +1000 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> Message-ID: On Wed, Jan 5, 2011 at 8:52 AM, Guido van Rossum wrote: > Hmm... I starred this and am finally dug out enough to comment. > > Would it be sufficient if the __module__ attribute of classes and > functions got set to the "canonical" name rather than the "physical" > name? > > You can currently get a crude version of this by simply assigning to > __name__ at the top of the module. > > That sounds like it would be too confusing, however, so perhaps we > could make it so that, when the __module__ attribute is initialized, > it first looks for __canonical__ and then for __name__? > > This may still be too crude though -- I looked at the one example I > could think of where this might be useful, the unittest package, and > realized that it would set __module__ to 'unittest' even for classes > that are not actually re-exported via the unittest namespace.
> > So maybe it would be better in that case to just patch the __module__ > attribute of all the public classes in unittest/__import__.py? I did think about that - for classes, it would probably be sufficient, but for functions the fact that we'd be breaking the identity that "f.__globals__ is sys.modules[f.__module__]" scares me. Then again, the fact that "f.__module__ != f.__globals__['__name__']" would provide exactly the indicator of "two names" that I am talking about (at least where functions are concerned) - things like pydoc and the inspect module could definitely be updated to check both module names. On the gripping hand, there would still be problems with things like methods and nested classes and functions (unless tools were provided to recurse down through a class to update the subcomponents as well as the class itself). So perhaps the granularity on my initial suggestion wasn't fine enough - if the "__canonical__" idea was extended to all objects with a __module__ attribute, then objects could either be relocated at creation time (by setting __canonical__ in the module globals), or after the fact by assigning to the __canonical__ attribute on the object. > OTOH for things named __main__, setting __canonical__ (automatically, > by -m or whatever other mechanism starts execution, like "python > " might actually work. Yes, although a related modification is needed in those cases (to actually insert the module being executed into sys.modules under its module name as well as under __main__). > On the third hand, maybe you've finally hit upon a reason why the "if > __name__ == '__main__': main()" idiom is bad... I can't take credit for that particular observation - I've certainly heard others complain about that in the context of pickling objects over the years. It is one of the main things that got me thinking along these lines in the first place. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From ncoghlan at gmail.com Wed Jan 5 03:00:05 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 5 Jan 2011 12:00:05 +1000 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: <4D23CDD0.6080601@ronadam.com> References: <4D1BE506.2070806@ronadam.com> <4D23CDD0.6080601@ronadam.com> Message-ID: On Wed, Jan 5, 2011 at 11:48 AM, Ron Adam wrote: > (This is probably something that was suggested more than a few times > before.) > > Would it help if global name space acquired a __main__ name? Then the > standard if line becomes only a slightly different "if __name__ == __main__: > main()". I think that would make more sense to beginners also and it is a > bit less magical. > > For now, both ways could work, __main__ would be "__main__" or None, but > down the road, (long enough to be sure everyone knows to drop the quotes), > both __main__ and __name__ could be switched to the actual module name so > that __name__ and __module__ attributes would always be correct. If we decided to actually change the way the main module was executed, the most likely result would be to resurrect PEP 299. Changing that particular idiom is probably a Py4k scale of change though :P Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rrr at ronadam.com Wed Jan 5 02:48:00 2011 From: rrr at ronadam.com (Ron Adam) Date: Tue, 04 Jan 2011 19:48:00 -0600 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> Message-ID: <4D23CDD0.6080601@ronadam.com> On 01/04/2011 04:52 PM, Guido van Rossum wrote: > Hmm... I starred this and am finally dug out enough to comment. > > Would it be sufficient if the __module__ attribute of classes and > functions got set to the "canonical" name rather than the "physical" > name? > > You can currently get a crude version of this by simply assigning to > __name__ at the top of the module.
> > That sounds like it would be too confusing, however, so perhaps we > could make it so that, when the __module__ attribute is initialized, > it first looks for __canonical__ and then for __name__? > > This may still be too crude though -- I looked at the one example I > could think of where this might be useful, the unittest package, and > realized that it would set __module__ to 'unittest' even for classes > that are not actually re-exported via the unittest namespace. > > So maybe it would be better in that case to just patch the __module__ > attribute of all the public classes in unittest/__import__.py? > > OTOH for things named __main__, setting __canonical__ (automatically, > by -m or whatever other mechanism starts execution, like "python > " might actually work. > > On the third hand, maybe you've finally hit upon a reason why the "if > __name__ == '__main__': main()" idiom is bad... (This is probably something that was suggested more than a few times before.) Would it help if global name space acquired a __main__ name? Then the standard if line becomes only a slightly different "if __name__ == __main__: main()". I think that would make more sense to beginners also and it is a bit less magical. For now, both ways could work, __main__ would be "__main__" or None, but down the road, (long enough to be sure everyone knows to drop the quotes), both __main__ and __name__ could be switched to the actual module name so that __name__ and __module__ attributes would always be correct. 
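A rough simulation of the idea — the run_module helper and its seeding of a boolean __main__ into the module globals are invented here, standing in for what the interpreter itself would have to do at start-up:

```python
import types

def run_module(source, name, as_main):
    # Stand-in for interpreter start-up: seed the module globals with
    # a boolean __main__ flag before running the module body.
    mod = types.ModuleType(name)
    mod.__main__ = as_main
    exec(source, mod.__dict__)
    return mod

source = """
started = __main__      # plain name lookup, no string comparison
"""

print(run_module(source, "demo", as_main=True).started)   # -> True
print(run_module(source, "demo", as_main=False).started)  # -> False
```

The module body just reads the flag as an ordinary global, which is the "less magical" spelling being proposed.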
Cheers, Ron From rrr at ronadam.com Wed Jan 5 05:18:16 2011 From: rrr at ronadam.com (Ron Adam) Date: Tue, 04 Jan 2011 22:18:16 -0600 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> <4D23CDD0.6080601@ronadam.com> Message-ID: <4D23F108.6060704@ronadam.com> On 01/04/2011 08:00 PM, Nick Coghlan wrote: > On Wed, Jan 5, 2011 at 11:48 AM, Ron Adam wrote: >> (This is probably something that was suggested more than a few times >> before.) >> >> Would it help if global name space acquired a __main__ name? Then the >> standard if line becomes only a slightly different "if __name__ == __main__: >> main()". I think that would make more sense to beginners also and it is a >> bit less magical. >> >> For now, both ways could work, __main__ would be "__main__" or None, but >> down the road, (long enough to be sure everyone knows to drop the quotes), >> both __main__ and __name__ could be switched to the actual module name so >> that __name__ and __module__ attributes would always be correct. > > If we decided to actually change the way the main module was executed, > the most likely result would be to resurrect PEP 299. Changing that > particular idiom is probably a Py4k scale of change though :P Well, changing it in the way PEP 299 suggests is probably even a Py5k change. Which is why I didn't suggest that. ;-) Also PEP 299 main motivation is different than what is being discussed here. Cheers, Ron From guido at python.org Wed Jan 5 05:47:01 2011 From: guido at python.org (Guido van Rossum) Date: Tue, 4 Jan 2011 20:47:01 -0800 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> Message-ID: On Tue, Jan 4, 2011 at 5:55 PM, Nick Coghlan wrote: > On Wed, Jan 5, 2011 at 8:52 AM, Guido van Rossum wrote: >> Hmm... I starred this and am finally dug out enough to comment. 
>> >> Would it be sufficient if the __module__ attribute of classes and >> functions got set to the "canonical" name rather than the "physical" >> name? >> >> You can currently get a crude version of this by simply assigning to >> __name__ at the top of the module. >> >> That sounds like it would be too confusing, however, so perhaps we >> could make it so that, when the __module__ attribute is initialized, >> it first looks for __canonical__ and then for __name__? >> >> This may still be too crude though -- I looked at the one example I >> could think of where this might be useful, the unittest package, and >> realized that it would set __module__ to 'unittest' even for classes >> that are not actually re-exported via the unittest namespace. >> >> So maybe it would be better in that case to just patch the __module__ >> attribute of all the public classes in unittest/__import__.py? > > I did think about that - for classes, it would probably be sufficient, > but for functions the fact that we'd be breaking the identity that > "f.__globals__ is sys.modules[f.__module__]" scares me. Really? Why? Who would ever depend on that? (You also probably meant sys.modules[...].__dict__ -- f.__globals__ is a dict, not a module object.) Note that for classes you'd have the same issue, since each method references the module globals in its f.__globals__. > Then again, > the fact that "f.__module__ != f.__globals__['__name__']" would > provide exactly the indicator of "two names" that I am talking about > (at least where functions are concerned) - things like pydoc and the > inspect module could definitely be updated to check both module names. I think the more important question to answer first would be what you'd want pydoc and inspect to do. > On the gripping hand, there would still be problems with things like > methods and nested classes and functions (unless tools were provided > to recurse down through a class to update the subcomponents as well as > the class itself). 
Well, method references (even unbound) are not picklable anyway. > So perhaps the granularity on my initial suggestion wasn't fine enough > - if the "__canonical__" idea was extended to all objects with a > __module__ attribute, then objects could either be relocated at > creation time (by setting __canonical__ in the module globals), or > after the fact by assigning to the __canonical__ attribute on the > object. BTW, I think we need to come up with a better word than __canonical__. In general I don't like using adjectives as attribute names. >> OTOH for things named __main__, setting __canonical__ (automatically, >> by -m or whatever other mechanism starts execution, like "python >> " might actually work. > > Yes, although a related modification is needed in those cases (to > actual insert the module being executed into sys.modules under its > module name as well as under __main__). That's the easy part. The hard part is to make the "real name" (i.e. not __main__) the name used by the classes and functions it defines, without breaking the "if __name__ == '__main__': main()" idiom... >> On the third hand, maybe you've finally hit upon a reason why the "if >> __name__ == '__main__': main()" idiom is bad... > > I can't take credit for that particular observation - I've certainly > heard others complain about that in the context of pickling objects > over the years. It is one of the main things that got me thinking > along these lines in the first place. Why didn't you say so in the first place? :-) I think it's easier to come up with a solution for just this case; the issue with e.g. unittest doesn't seem quite as hard (after all, "unittest.case" will always exist). We could just call it __real_name__ and use that in preference over __name__ for all __module__ attributes whenever it's set. (Or we could always set both...) 
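As background for why pickling keeps coming up in this thread: pickle stores a class by its module name and qualified name and re-imports it on load, so whatever name ends up in __module__ has to be genuinely importable. A small demonstration (both module names below are made up):

```python
import pickle
import sys
import types

# Give the class an importable "canonical" home (name invented).
mod = types.ModuleType("fake_canonical")

class Widget:
    pass

Widget.__module__ = "fake_canonical"
Widget.__qualname__ = "Widget"
mod.Widget = Widget
sys.modules["fake_canonical"] = mod

# Round-trips fine, and the reloaded object reports the canonical name.
obj = pickle.loads(pickle.dumps(Widget()))
print(type(obj).__module__)              # -> fake_canonical

# A canonical name that can't be imported breaks pickling outright.
class Gadget:
    pass

Gadget.__module__ = "no.such.module"     # invented, not importable

try:
    pickle.dumps(Gadget())
except pickle.PicklingError as exc:
    print("refused:", exc)
```

The same lookup-by-name mechanism is what makes classes defined in __main__ unpicklable from any other process.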
-- --Guido van Rossum (python.org/~guido) From jimjjewett at gmail.com Wed Jan 5 06:58:32 2011 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 5 Jan 2011 00:58:32 -0500 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> Message-ID: On Tue, Jan 4, 2011 at 5:52 PM, Guido van Rossum wrote: > Would it be sufficient if the __module__ attribute of classes and > functions got set to the "canonical" name rather than the "physical" > name? Not unless it were documented as an acceptable practice supported by the introspection libraries, with examples pointing to stdlib usage in places like elementTree. Even then it may not work out, but that is the rest of the thread; I just wanted to emphasize that this is a case where "yup, it works" isn't good enough, because of confusion over specification vs implementation vs accidentally worked this time. -jJ From tjreedy at udel.edu Wed Jan 5 11:07:58 2011 From: tjreedy at udel.edu (Terry Reedy) Date: Wed, 05 Jan 2011 05:07:58 -0500 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> Message-ID: On 1/4/2011 5:52 PM, Guido van Rossum wrote: Nick's concern does not affect me, > On the third hand, maybe you've finally hit upon a reason why the "if > __name__ == '__main__': main()" idiom is bad... but I use this all the time. A suggested alternative and possible eventual replacement: give *every* module an attribute __main__ set to either True or False. Then the idiom would be much simpler and easier to learn and write: 'if __main__: ...'. If there were no other use of the fake '__main__' name, the simple and unconditional replacement would be much less disruptive than, say, the int division change. 
But the first 10 pages of codesearch on '__main__' shows things like django/test/_doctest.py - 107 identical elif module.__name__ == '__main__': 1850: m = sys.modules.get('__main__') another sys.modules.get(), a sys.modules(), and Formulator/tests/framework.py - many identical 57: if p0 and __name__ == '__main__': 58: os.chdir(p0) The variant conditionals are easy to patch (by hand). The sys.modules lookup suggests that the main module should continue to be keyed under '__main__', even if also keyed under its 'real' name. [Keying modules under a canonical name would eliminate duplicate import bugs, but that is another issue.] -- Terry Jan Reedy From ncoghlan at gmail.com Wed Jan 5 13:15:28 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Wed, 5 Jan 2011 22:15:28 +1000 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> Message-ID: On Wed, Jan 5, 2011 at 2:47 PM, Guido van Rossum wrote: > On Tue, Jan 4, 2011 at 5:55 PM, Nick Coghlan wrote: >> I can't take credit for that particular observation - I've certainly >> heard others complain about that in the context of pickling objects >> over the years. It is one of the main things that got me thinking >> along these lines in the first place. > > Why didn't you say so in the first place? :-) Well, I did put that "half-baked" disclaimer in for a reason... I'm still trying to figure out exactly what I think the real problem here is, so my expression of it is probably as clear as mud :) > I think it's easier to come up with a solution for just this case; the > issue with e.g. unittest doesn't seem quite as hard (after all, > "unittest.case" will always exist). Perhaps it would focus the discussion if we picked one or two modules (in addition to __main__) as example cases. functools comes in two pieces - partial and reduce are implemented in C in the _functools module, everything else is implemented in Python in functools itself. 
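That split follows the usual "accelerator module" shape — roughly like the following (the reduce body is an illustrative simplification, not the actual functools source):

```python
# Pure-Python fallback first; the C extension silently overrides it
# when present (as _functools does in CPython).
def reduce(function, sequence, initial=None):
    # Simplified: the real reduce uses a sentinel object so that None
    # can be a legitimate initial value.
    it = iter(sequence)
    value = next(it) if initial is None else initial
    for element in it:
        value = function(value, element)
    return value

try:
    from _functools import reduce  # C implementation wins if available
except ImportError:
    pass                           # no accelerator on this interpreter

print(reduce(lambda a, b: a + b, [1, 2, 3, 4]))   # -> 10
```

Callers only ever see one name, which is exactly why the "physical" origin of the object leaks out through __module__ and inspect rather than through the API itself.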
datetime, on the other hand, is a case of a pure acceleration module - if _datetime is available, it is expected to completely implement the datetime API. _functools.partial and the classes in datetime all adopt the strategy of lying about their original location in __module__. This is probably the best available choice, as it makes pickling do the right thing. The main downside with this approach is the way it confuses things like inspect.getsource (for datetime, it reports the pure Python versions as the source code for the C accelerated versions, for functools.partial it gives a technically accurate, but potentially misleading error message. If inspect could easily *tell* that the accelerated versions were in use, then it could handle the situation a bit more gracefully). To eliminate that issue, what if, whenever we're setting a __module__ attribute (e.g. during class creation), we also set a "__real_module__" attribute? Then code could happily adjust __module__ to point to the official location (as it already does), but tools like inspect wouldn't be fooled regarding the state of the *current* interpreter. Most of the time, __module__ and __real_module__ will point to the same place, but the cases where they're different will be handled far more gracefully. (I suspect that is significantly easier said than done though - I expect it would be a very manual process getting an extension module to do this correctly) > We could just call it __real_name__ and use that in preference over > __name__ for all __module__ attributes whenever it's set. (Or we could > always set both...) The stuff I wrote above applies to pretty much everything *except* the __main__ module. For the __main__ module, I'm inclined to revisit Brett's idea from PEP 3122: put the real name of the __main__ module in a sys.main attribute. However, unlike that PEP, we would continue to set __name__ to "__main__" in the main module. 
The new attribute would be a transition step allowing manual reversal of the name mangling:

# Near top of module
if __name__ == "__main__":
    running_as_main = True
    import sys
    __name__ = sys.main

# Rest of module

# Near end of module
if running_as_main:
    # Actually do "main" type stuff.

Alternatively, we could just do nothing about the problem with __main__ and continue to encourage people to separate their "main" modules from the modules that define classes. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From fuzzyman at voidspace.org.uk Wed Jan 5 14:28:30 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 5 Jan 2011 13:28:30 +0000 Subject: [Python-ideas] python -c "..." should output result like the interpreter In-Reply-To: References: Message-ID: On 29 December 2010 15:45, Michael Foord wrote: > > > On 29 December 2010 15:18, Georg Brandl wrote: > >> Am 29.12.2010 15:46, schrieb Michael Foord: >> >> > I like the idea, but that's a fairly big semantic change. What about >> > adding an -e option that takes an expression, and prints its value? >> So >> > you'd have >> > >> > python -e "12 / 4.1" >> > >> > (AFAICT, -e is unused at present). >> > >> > That would be great. I did worry that changing the output would be >> backwards >> > incompatible with code that shells out to Python using "-c", so a >> different >> > command line option would be great. So long as it works with multiple >> statements >> > (semi-colon separated) like the current "-c" behaviour.
>> Hey, what about this little module:
>>
>>     import sys
>>     for x in sys.argv[1:]:
>>         exec compile(x, '', 'single')
>>
>> Then:
>>
>>     $ python -me '1+1; 2+2'
>>     2
>>     4
>>
> > So now you can `pip install e` and then `python -me`... > Just as a follow up, for which we should still be blaming Georg, you can now do `pip install oo` followed by `python -moo`. (Requires pygame - tested on Linux and Mac, should be cross platform.)
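(Under Python 3, where exec is a function, Georg's little module would look something like this - a rough rendering only; the run_snippets helper name is invented here and is not part of any real package:)

```python
# e.py - Python 3 rendering of Georg's snippet-runner sketch
import sys

def run_snippets(snippets):
    """Execute each snippet the way the interactive interpreter would."""
    for src in snippets:
        # 'single' mode makes bare expressions echo their repr,
        # matching interactive interpreter behaviour
        exec(compile(src, "<cmdline>", "single"))

if __name__ == "__main__":
    run_snippets(sys.argv[1:])
```

With that on sys.path, `python -me '1+1; 2+2'` would print 2 and 4, as in the Python 2 example.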
All the best, Michael Foord > > Michael > > >> >> Georg >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> > > > > -- > > http://www.voidspace.org.uk/ > > May you do good and not evil > May you find forgiveness for yourself and forgive others > > May you share freely, never taking more than you give. > -- the sqlite blessing http://www.sqlite.org/different.html > > > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From fuzzyman at voidspace.org.uk Wed Jan 5 14:42:00 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 5 Jan 2011 13:42:00 +0000 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> Message-ID: On 4 January 2011 22:52, Guido van Rossum wrote: > Hmm... I starred this and am finally dug out enough to comment. > > Would it be sufficient if the __module__ attribute of classes and > functions got set to the "canonical" name rather than the "physical" > name? > > You can currently get a crude version of this by simply assigning to > __name__ at the top of the module. > > That sounds like it would be too confusing, however, so perhaps we > could make it so that, when the __module__ attribute is initialized, > it first looks for __canonical__ and then for __name__? > > This may still be too crude though -- I looked at the one example I > could think of where this might be useful, the unittest package, and > realized that it would set __module__ to 'unittest' even for classes > that are not actually re-exported via the unittest namespace. 
> > So maybe it would be better in that case to just patch the __module__ > attribute of all the public classes in unittest/__init__.py? > > So should I do this in unittest for Python 2.7 / 3.2? The problem this *would* solve is that pickled unittest objects from 2.7 / 3.2 can't be unpickled on earlier versions of Python. I don't know how *real* a problem it is or whether it is worth losing / faking the __module__ information on these classes to solve it. Sure it's a problem that is likely to bite *someone* at some point, but not very many people. If someone is using __module__ information to find source code (or anything else) for a class then changing __module__ will break that, so I'm not convinced it's a worthwhile tradeoff. All the best, Michael > OTOH for things named __main__, setting __canonical__ (automatically, > by -m or whatever other mechanism starts execution, like "python > " might actually work. > > On the third hand, maybe you've finally hit upon a reason why the "if > __name__ == '__main__': main()" idiom is bad... > > --Guido > > On Thu, Dec 30, 2010 at 6:52 PM, Nick Coghlan wrote: > > On Thu, Dec 30, 2010 at 11:48 AM, Ron Adam wrote: > >> This sounds like two different separate issues to me. > >> > >> One is the leaking-out of lower level details. > >> > >> The other is abstracting a framework with the minimal amount of details > >> needed. > > > > Yeah, sort of. Really, the core issue is that some objects live in two > places: > > - where they came from right now, in the current interpreter > > - where they should be retrieved from "officially" (e.g.
since another > > interpreter may not provide an accelerated version, or because the > > appropriate submodule may be selected at runtime based on the current > > platform) > > > > There's currently no systematic way of flagging objects or modules > > where the latter location differs from the former, so the components > > that leak the low level details (such as pickling and pydoc) have no > > way to avoid it. Once a system is in place to identify such objects > > (or perhaps just the affected modules), then the places that leak that > > information can be updated to deal with the situation appropriately > > (e.g. pickling would likely just use the official names, while pydoc > > would display both, indicating which one was the 'official' location, > > and which one reflected the current interpreter behaviour). > > > > So it's really one core problem (non-portable module details), which > > then leads to an assortment of smaller problems when other parts of > > the standard library are forced to rely on those non-portable details > > because that's the only information available. > > > > Cheers, > > Nick. > > > > -- > > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > > > > > > -- > --Guido van Rossum (python.org/~guido ) > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Wed Jan 5 16:57:12 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Jan 2011 01:57:12 +1000 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> Message-ID: On Wed, Jan 5, 2011 at 11:42 PM, Michael Foord wrote: > So should I do this in unittest for Python 2.7 / 3.2? > > The problem this *would* solve is that pickled unittest objects from 2.7 / > 3.2 can't be unpickled on earlier versions of Python. > > I don't know how *real* a problem it is or whether it is worth losing / > faking the __module__ information on these classes to solve it. Sure it's a > problem that is likely to bite *someone* at some point, but not very many > people. If someone is using __module__ information to find source code (or > anything else) for a class then changing __module__ will break that, so I'm > not convinced it's a worthwhile tradeoff. The two examples I looked at (functools and datetime) favoured hiding the implementation details at the cost of causing introspection problems. Despite my comments in the opening post of the thread, I think that is the better trade-off to make. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rrr at ronadam.com Wed Jan 5 18:32:39 2011 From: rrr at ronadam.com (Ron Adam) Date: Wed, 05 Jan 2011 11:32:39 -0600 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> Message-ID: <4D24AB37.7050900@ronadam.com> On 01/05/2011 06:15 AM, Nick Coghlan wrote: > Perhaps it would focus the discussion if we picked one or two modules > (in addition to __main__) as example cases. > > functools comes in two pieces - partial and reduce are implemented in > C in the _functools module, everything else is implemented in Python > in functools itself.
> datetime, on the other hand, is a case of a pure acceleration module - > if _datetime is available, it is expected to completely implement the > datetime API. > > _functools.partial and the classes in datetime all adopt the strategy > of lying about their original location in __module__. This is probably > the best available choice, as it makes pickling do the right thing. > > The main downside with this approach is the way it confuses things > like inspect.getsource (for datetime, it reports the pure Python > versions as the source code for the C accelerated versions, for > functools.partial it gives a technically accurate, but potentially > misleading error message. If inspect could easily *tell* that the > accelerated versions were in use, then it could handle the situation a > bit more gracefully). It seems Python tries pretty hard to hide external calls, (the cause of the confusion you mention above). It makes me wonder why python doesn't have an extern type (or types). Then instead of them being a source of confusion, they would be recognisable for what they are. They could have extra attributes to enable pickle and other tools to work in a nice way. Ron From fuzzyman at voidspace.org.uk Wed Jan 5 18:45:46 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Wed, 5 Jan 2011 17:45:46 +0000 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> Message-ID: On 5 January 2011 15:57, Nick Coghlan wrote: > On Wed, Jan 5, 2011 at 11:42 PM, Michael Foord > wrote: > > So should I do this in unittest for Python 2.7 / 3.2? > > > > The problem this *would* solve is that pickled unittest objects from 2.7 > / > > 3.2 can't be unpickled on earlier versions of Python. > > > > I don't know how *real* a problem it is or whether it is worth losing / > > faking the __module__ information on these classes to solve it. 
Sure it's > a > > problem that is likely to bite *someone* at some point, but not very many > > people. If someone is using __module__ information to find source code > (or > > anything else) for a class then changing __module__ will break that, so > I'm > > not convinced it's a worthwhile tradeoff. > > The two examples I looked at (functools and datetime) favoured hiding > the implementation details at the cost of causing introspection > problems. Despite my comments in the opening post of the thread, I > think that is the better trade-off to make. > Both of those are because of underlying C implementations where introspection problems would be the default anyway, which isn't quite the same situation for unittest. Michael > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Jan 5 20:10:18 2011 From: guido at python.org (Guido van Rossum) Date: Wed, 5 Jan 2011 11:10:18 -0800 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> Message-ID: I'm going to have to leave this thread to you all, my main goal was to tease out a better problem description. I think that's been taken care of now. The solution will then follow.
-- --Guido van Rossum (python.org/~guido) From ncoghlan at gmail.com Thu Jan 6 02:52:54 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 6 Jan 2011 11:52:54 +1000 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> Message-ID: On Thu, Jan 6, 2011 at 3:45 AM, Michael Foord wrote: > On 5 January 2011 15:57, Nick Coghlan wrote: >> The two examples I looked at (functools and datetime) favoured hiding >> the implementation details at the cost of causing introspection >> problems. Despite my comments in the opening post of the thread, I >> think that is the better trade-off to make. > > Both of those are because of underlying C implementations where > introspection problems would be the default anyway, which isn't quite the > same situation for unittest. True, but it means the precedent of using __module__ to refer to the official location rather than the actual location has already been set. That suggests to me our best way forward is to bless that as a recommended practice, then find a way to deal with the negative impact it currently has on introspection (such as a "__real_module__" attribute, as I suggested in another post). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From fuzzyman at voidspace.org.uk Thu Jan 6 13:21:15 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Thu, 06 Jan 2011 12:21:15 +0000 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> Message-ID: <4D25B3BB.8070800@voidspace.org.uk> On 06/01/2011 01:52, Nick Coghlan wrote: > On Thu, Jan 6, 2011 at 3:45 AM, Michael Foord wrote: >> On 5 January 2011 15:57, Nick Coghlan wrote: >>> The two examples I looked at (functools and datetime) favoured hiding >>> the implementation details at the cost of causing introspection >>> problems.
Despite my comments in the opening post of the thread, I >>> think that is the better trade-off to make. >> Both of those are because of underlying C implementations where >> introspection problems would be the default anyway, which isn't quite the >> same situation for unittest. > True, but it means the precedent of using __module__ to refer to the > official location rather than the actual location has already > been set. That suggests to me our best way forward is to bless that as > a recommended practice, then find a way to deal with the negative > impact it currently has on introspection (such as a "__real_module__" > attribute, as I suggested in another post). Well, I would say set __module__ to the official location *when* we have "__real_module__" (or whatever) in place. Changing __module__ breaks inspect.getsource:

>>> import inspect
>>> from unittest import TestCase
>>> TestCase.__module__
'unittest.case'
>>> TestCase.__module__ = 'unittest'
>>> inspect.getsource(TestCase)
Traceback (most recent call last):
  ...
IOError: could not find class definition

As the only problem this solves is a theoretical one (so far for unittest anyway) I'm not keen to do this until the introspection issue is resolved. Once this is resolved I'm fine with it. All the best, Michael > Cheers, > Nick. > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give.
-- the sqlite blessing http://www.sqlite.org/different.html From rrr at ronadam.com Fri Jan 7 03:38:07 2011 From: rrr at ronadam.com (Ron Adam) Date: Thu, 06 Jan 2011 20:38:07 -0600 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> Message-ID: <4D267C8F.3050601@ronadam.com> On 01/05/2011 07:52 PM, Nick Coghlan wrote: > On Thu, Jan 6, 2011 at 3:45 AM, Michael Foord wrote: >> On 5 January 2011 15:57, Nick Coghlan wrote: >>> The two examples I looked at (functools and datetime) favoured hiding >>> the implementation details at the cost of causing introspection >>> problems. Despite my comments in the opening post of the thread, I >>> think that is the better trade-off to make. >> >> Both of those are because of underlying C implementations where >> introspection problems would be the default anyway, which isn't quite the >> same situation for unittest. > > True, but it means the precedent of using __module__ to refer to the > official location rather than the actual location has already > been set. That suggests to me our best way forward is to bless that as > a recommended practice, then find a way to deal with the negative > impact it currently has on introspection (such as a "__real_module__" > attribute, as I suggested in another post). You could add a private dictionary to sys, that is updated along with sys.modules, which maps module names to real names. And have a function in inspect to retrieve the real name for an object. That sounds like it would do pretty much what you need and doesn't add a top level builtin or global, or change "if __name__ == '__main__': main()".
Cheers, Ron From ncoghlan at gmail.com Fri Jan 7 04:28:46 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 7 Jan 2011 13:28:46 +1000 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: <4D267C8F.3050601@ronadam.com> References: <4D1BE506.2070806@ronadam.com> <4D267C8F.3050601@ronadam.com> Message-ID: On Fri, Jan 7, 2011 at 12:38 PM, Ron Adam wrote: > You could add a private dictionary to sys, that is updated along with > sys.modules, which maps module names to real names. And have a function in > inspect to retrieve the real name for an object. > > That sounds like it would do pretty much what you need and doesn't add a top > level builtin or global, or change "if __name__ == '__main__': main()". My original suggestion was along those lines, but I've come to the conclusion that it isn't sufficiently granular - when existing code tinkers with "__module__" it tends to do it at the object level rather than by modifying __name__ in the module globals. To turn this into a concrete proposal, here is what I am thinking of specifying in a PEP for 3.3:

1. Implicit configuration of __module__ attributes is updated to check for a definition of "__import_name__" at the module level. If found, then this is used as the value for the __module__ attribute. Otherwise, __module__ is set to __name__ as usual.

2. Any code that currently sets a __module__ attribute (i.e. function and class definitions) will also set an __impl_module__ attribute. This attribute will always be set to the value of __name__.

3. Update and/or augment the relevant C APIs to make it easy to do this for affected extension modules

4. Update inspect.getsource() (and possibly some other introspection functions) to look at __impl_module__ rather than __module__

5. Update all acceleration (such as _datetime) and "implementation packages" (such as unittest) to set __module__ and __impl_module__ appropriately on exported objects

6.
Update the __main__ execution logic (including both the builtin logic and runpy) to insert the __main__ module into sys.modules as both "__main__" and the module's real name (i.e. the name that would result in a second copy of the module ending up in sys.modules if you imported it)

7. Update the __main__ execution logic to set __import_name__ to the actual name of the module.

So we end up with two new magic attributes:

__import_name__: optional module level attribute that indicates a preferred alternative to __name__ for accessing the module contents. Alters the value of __module__ for classes and functions defined in the module. Implicitly set for the __main__ module.

__impl_module__: implicitly set on objects with a __module__ attribute to allow __module__ to be altered to refer to an object's preferred import location without losing the actual implementation location of the object

Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mmanns at gmx.net Fri Jan 7 11:24:35 2011 From: mmanns at gmx.net (Martin Manns) Date: Fri, 7 Jan 2011 11:24:35 +0100 Subject: [Python-ideas] Add irange with large integer step support to itertools Message-ID: <20110107112435.2ae46c89@Knock> Hi I would like to propose an addition of an "irange" function to itertools. This addition could reduce testing effort when developing applications in which large integers show up. Both xrange (Python 2.x) and range (Python 3.x) have limited support for large integer step values, for example:

Python 3.1.3 (r313:86834, Nov 28 2010, 10:01:07)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> range(10**10000, 10**10000+10**1000, 10**900)[5]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C ssize_t

The code below is untested and for clarification only.
It has been taken and modified from [issue7721] http://bugs.python.org/issue7721 With irange, no OverflowError is thrown:

>>> from itertools import islice
>>> from irange import irange
>>> def nth(iterable, n, default=None):
...     "Returns the nth item or a default value"
...     return next(islice(iterable, n, None), default)
...
>>> nth(irange(10**10000, 10**10000+10**1000, 10**900), 5)
100000000000000...

## Code snippet (untested)

from itertools import count, takewhile

def irange(start, stop=None, step=1):
    """Range for long integers

    Usage: irange([start], stop, [step])

    Parameters
    ----------
    start: Integer
    stop: Integer
    step: Integer, defaults to 1

    """

    if start is None:
        raise TypeError("range() integer argument expected, got NoneType")

    if stop is None:
        stop = start
        start = 0

    if step is None:
        step = 1

    if step > 0:
        if stop < start:
            return (_ for _ in [])
        cond = lambda x: x < stop
    elif step < 0:
        if stop > start:
            return (_ for _ in [])
        cond = lambda x: x > stop
    else:
        raise ValueError("irange() step argument must not be zero")

    return takewhile(cond, (start + i * step for i in count()))

## End code snippet

Does such an addition make sense in your eyes? Regards Martin From rrr at ronadam.com Sat Jan 8 10:06:31 2011 From: rrr at ronadam.com (Ron Adam) Date: Sat, 08 Jan 2011 03:06:31 -0600 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> <4D267C8F.3050601@ronadam.com> Message-ID: <4D282917.3020606@ronadam.com> On 01/06/2011 09:28 PM, Nick Coghlan wrote: > On Fri, Jan 7, 2011 at 12:38 PM, Ron Adam wrote: >> You could add a private dictionary to sys, that is updated along with >> sys.modules, which maps module names to real names. And have a function in >> inspect to retrieve the real name for an object. >> >> That sounds like it would do pretty much what you need and doesn't add a top >> level builtin or global, or change "if __name__ == '__main__': main()".
> > My original suggestion was along those lines, but I've come to the > conclusion that it isn't sufficiently granular - when existing code > tinkers with "__module__" it tends to do it at the object level rather > than by modifying __name__ in the module globals. What do you mean by *tinkers with "__module__"* ? Do you have an example where/when that is needed? > To turn this into a concrete proposal, here is what I am thinking of > specifying in a PEP for 3.3: > > 1. Implicit configuration of __module__ attributes is updated to check > for a definition of "__import_name__" at the module level. If found, > then this is used as the value for the __module__ attribute. > Otherwise, __module__ is set to __name__ as usual. If __import_name__ is going to match __module__ everywhere else, why not just call it __module__ everywhere? Would __package__ be changed in any way? > 2. Any code that currently sets a __module__ attribute (i.e. function > and class definitions) will also set an __impl_module__ attribute. > This attribute will always be set to the value of __name__. So we will have: __package__, __module__, __import_name__, __impl_module__, and if you also include __file__ and __path__, that makes six different attributes for describing where something came from. I don't know about you, but this bothers me a bit. :-/ How about reconsidering going the other direction: 1. Add __module__ to module level name space. +1 2. Add a module registry that uses the __module__ attribute to get a module_location_info object, which would have all the useful location info in it. (including the real name of "__main__") If __name__ and __module__ are not changed, programs that use those won't break. Also consider having virtual modules, where objects in it may have come from different *other* locations. A virtual module would need a way to keep track of that. (I'm not sure this is a good idea.) Does this fit some of the problems you are thinking of where the granularity may matter?
It would take two functions to do this. One to create the virtual module, and another to pre-load its initial objects. For those objects, the loader would set obj.__module__ to the virtual module name, and also set obj.__original_module__ to the original module name. These would only be seen on objects in virtual modules. A lookup on obj.__module__ will tell you it's in a virtual module. Then a lookup with obj.__original_module__ would give you the actual location info it came from. By doing it that way, most people will never need to know how these things work or even see them. ie... It's advanced/expert Python foo. ;-) Anyway, I hope this gives you some ideas, I know you can figure out the details much better than I can. Cheers, Ron From rrr at ronadam.com Sun Jan 9 02:20:22 2011 From: rrr at ronadam.com (Ron Adam) Date: Sat, 08 Jan 2011 19:20:22 -0600 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: <4D282917.3020606@ronadam.com> References: <4D1BE506.2070806@ronadam.com> <4D267C8F.3050601@ronadam.com> <4D282917.3020606@ronadam.com> Message-ID: <4D290D56.8020406@ronadam.com> On 01/08/2011 03:06 AM, Ron Adam wrote: > So we will have: __package__, __module__, __import_name__, __impl_module__, > and if you also include __file__ and __path__, that makes six different > attributes for describing where something came from. And also add __cached__ to that list.
:-/ From ncoghlan at gmail.com Sun Jan 9 07:39:24 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 9 Jan 2011 16:39:24 +1000 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: <4D282917.3020606@ronadam.com> References: <4D1BE506.2070806@ronadam.com> <4D267C8F.3050601@ronadam.com> <4D282917.3020606@ronadam.com> Message-ID: On Sat, Jan 8, 2011 at 7:06 PM, Ron Adam wrote: > On 01/06/2011 09:28 PM, Nick Coghlan wrote: >> My original suggestion was along those lines, but I've come to the >> conclusion that it isn't sufficiently granular - when existing code >> tinkers with "__module__" it tends to do it at the object level rather >> than by modifying __name__ in the module globals. > > What do you mean by *tinkers with "__module__"* ? > > Do you have an example where/when that is needed?

>>> from inspect import getsource
>>> from functools import partial
>>> partial.__module__
'functools'
>>> getsource(partial)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.6/inspect.py", line 689, in getsource
    lines, lnum = getsourcelines(object)
  File "/usr/lib/python2.6/inspect.py", line 678, in getsourcelines
    lines, lnum = findsource(object)
  File "/usr/lib/python2.6/inspect.py", line 552, in findsource
    raise IOError('could not find class definition')
IOError: could not find class definition

partial is actually implemented in C in the _functools module, hence the failure of the getsource call. However, it officially lives in functools for pickling purposes (other implementations aren't obliged to provide _functools at all), so __module__ is adjusted appropriately. The other examples I have been using are the _datetime C acceleration module and the unittest pseudo-package. >> 1. Implicit configuration of __module__ attributes is updated to check >> for a definition of "__import_name__" at the module level. If found, >> then this is used as the value for the __module__ attribute.
>> Otherwise, __module__ is set to __name__ as usual. > If __import_name__ is going to match __module__ everywhere else, why not > just call it __module__ everywhere? Because the module level attributes for identifying the module don't serve the same purpose as the attributes identifying where functions and classes are defined. That said, calling it "__module__" would probably work, and make the naming logic a bit more intuitive. The precedent for that attribute name to refer to a string rather than a module object was set a long time ago, after all. > Would __package__ be changed in any way? To look for __module__ before checking __name__? No, since doing that would make it unnecessarily difficult to use relative imports inside pseudo-packages. >> 2. Any code that currently sets a __module__ attribute (i.e. function >> and class definitions) will also set an __impl_module__ attribute. >> This attribute will always be set to the value of __name__. > So we will have: __package__, __module__, __import_name__, __impl_module__, > and if you also include __file__ and __path__, that makes six different > attributes for describing where something came from. > I don't know about you, but this bothers me a bit. :-/ It bothers me a lot, since I probably could have avoided at least some of it by expanding the scope of PEP 366. However, it does help to split them out into the different contexts and look at how each of them is used, since it makes it clear that there are a lot of attributes because there is a fair bit of information that is used in different ways.
Module level attributes relating to location in the external environment:

__file__: typically refers to a source file, but is not required to (see PEP 302)
__path__: package attribute used to identify the directory (or directories) searched for submodules
__loader__: PEP 302 loader reference (may not exist for ordinary filesystem imports)
__cached__: if it exists, refers to a compiled bytecode file (see PEP 3149)

It is important to understand that ever since PEP 302, *there is no loader independent mapping* between any of these external environment related attributes and the module namespace. Some Python standard library code (e.g. multiprocessing) currently assumes such a mapping exists, and it is broken on Windows right now as a direct result of that incorrect assumption (other code explicitly disclaims support for PEP 302 loaded modules and only works with actual files and directories).

Module level attributes relating to location within the module namespace:

__name__: actual name of the current module in the current interpreter instance. Best choice for introspection of the current interpreter.
__module__ (*new*): "official" portable name for module contents (components should never include leading underscores). Best choice for information that should be portable to other interpreters (e.g. for pickling and other serialisation formats).
__package__: optional attribute used specifically to control handling of relative imports. May be explicitly set (e.g. by runpy), otherwise implicitly set to "__name__.rpartition('.')[0]" by the first relative import.

Most of the time, __name__ is consistent across all 3 use cases, in which case __package__ and __import_name__ are redundant. However, when __name__ is wrong for some reason (e.g.
including an implementation detail, or adjusted to "__main__" for execution as a script), then __package__ allows relative imports to be fixed, while __import_name__ will allow pickling and other operations that should hide implementation details to be fixed.

Object level attributes relating to location of class and function definitions:

__module__ (*updated*): refers to __module__ from the originating module (if defined), and to __name__ otherwise
__impl_module__ (*new*): refers to __name__ from the originating module

Looking at that write-up, I do quite like the idea of reusing __module__ for the new module level attribute. > Also consider having virtual modules, where objects in it may have come from > different *other* locations. A virtual module would need a way to keep track > of that. (I'm not sure this is a good idea.) It's too late, code already does that. This is precisely the use case I am trying to fix (objects like functools.partial that deliberately lie in their __module__ attribute), so that this can be done *right* (i.e. without having to choose which use cases to support and which ones to break). The basic problem is that __module__ currently tries to serve two masters:

1. use cases like inspect.getsource, where we want to know where the object came from in the current interpreter
2. use cases like pickle, where we want the "official" portable location, with any implementation details (like the _functools module) hidden.

Currently, the default behaviour of the interpreter is to support use case 1 and break use case 2 if any objects are defined in a different module from where they claim to live (e.g. see the pickle compatibility breakage with the 3.2 unittest implementation layout changes). 
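The use case 2 breakage is straightforward to reproduce - pointing __module__ at a private module that pickle can't import is all it takes (the "_mywidgets" name below is made up for the demonstration):

```python
import pickle

class Widget:
    pass

# Simulate an accelerator class that reports its private
# implementation module instead of a public one:
Widget.__module__ = "_mywidgets"

# Pickling classes works by reference: pickle tries to import the
# module named by __module__, which fails here.
try:
    pickle.dumps(Widget)
    err = None
except pickle.PicklingError as exc:
    err = exc
    print("use case 2 broken:", exc)
```
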
The only tool currently available to module authors is to override __module__ (as functools.partial and the datetime acceleration module do), which is correct for use case 2, but breaks use case 1 (leading to misleading error messages in the C acceleration module case, and breaking otherwise valid introspection in the unittest case). My proposed changes will: a) make overriding __module__ significantly easier to do b) allow the introspection use cases access to the information they need so they can do the right thing when confronted with an overridden __module__ attribute > Does this fit some of the problems you are thinking of where the granularity may > matter? > > It would take two functions to do this. One to create the virtual module, > and another to pre-load its initial objects. For those objects, the loader > would set obj.__module__ to the virtual module name, and also set > obj.__original_module__ to the original module name. These would only be > seen on objects in virtual modules. A lookup on obj.__module__ will tell > you it's in a virtual module. Then a lookup with obj.__original_module__ > would give you the actual location info it came from. That adds a lot of complexity though - far simpler to define a new __impl_module__ attribute on every object, retroactively fixing introspection of existing code that adjusts __module__ to make pickling work properly across different versions and implementations. > By doing it that way, most people will never need to know how these things > work or even see them. i.e... It's advanced/expert Python foo. ;-) Most people will never need to care or worry about the difference between __module__ and __impl_module__ either - it will be hidden inside libraries like inspect, pydoc and pickle. > Anyway, I hope this gives you some ideas, I know you can figure out the > details much better than I can. Yeah, the idea of reusing the __module__ attribute name at the top level is an excellent one. Cheers, Nick. 
-- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From rrr at ronadam.com Sun Jan 9 18:56:24 2011 From: rrr at ronadam.com (Ron Adam) Date: Sun, 09 Jan 2011 11:56:24 -0600 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> <4D267C8F.3050601@ronadam.com> <4D282917.3020606@ronadam.com> Message-ID: <4D29F6C8.2010505@ronadam.com> On 01/09/2011 12:39 AM, Nick Coghlan wrote: >> Also consider having virtual modules, where objects in it may have come from >> different *other* locations. A virtual module would need a way to keep track >> of that. (I'm not sure this is a good idea.) > It's too late, code already does that. This is precisely the use case > I am trying to fix (objects like functools.partial that deliberately > lie in their __module__ attribute), so that this can be done *right* > (i.e. without having to choose which use cases to support and which > ones to break). Yes, __builtins__ is a virtual module. Creating a module in memory...

>>> import imp
>>> new = imp.new_module("new")
>>> new
<module 'new' (built-in)>

The term "(built-in)" doesn't quite fit in this case. But I can get used to it.

>>> sys.modules[new.__name__]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'new'

And it's not in sys.modules yet. That's ok, other things can be loaded into it before it's added to sys.modules. It's this loading part that can be improved. > The basic problem is that __module__ currently tries to serve two masters: > 1. use cases like inspect.getsource, where we want to know where the > object came from in the current interpreter > 2. use cases like pickle, where we want the "official" portable > location, with any implementation details (like the _functools module) > hidden. Most C extensions are written as modules, to be imported and imported from. A tool to load objects rather than import them may be better in these situations. 
partial = imp.load_extern_object("_functools.partial") A loaded object would have its __module__ attribute set to the module it's loaded into instead of where it came from. By doing it this way, it doesn't complicate the import semantics. It may also be useful to make it a special type, so that other tools can decide how to handle them. > Currently, the default behaviour of the interpreter is to support use > case 1 and break use case 2 if any objects are defined in a different > module from where they claim to live (e.g. see the pickle > compatibility breakage with the 3.2 unittest implementation layout > changes). The only tool currently available to module authors is to > override __module__ (as functools.partial and the datetime > acceleration module do), which is correct for use case 2, but breaks > use case 1 (leading to misleading error messages in the C acceleration > module case, and breaking otherwise valid introspection in the > unittest case). > > My proposed changes will: > a) make overriding __module__ significantly easier to do > b) allow the introspection use cases access to the information they > need so they can do the right thing when confronted with an overridden > __module__ attribute It would be better to find solutions that don't override __module__ after it has been imported or loaded. >> Does this fit some of the problems you are thinking of where the granularity may >> matter? >> >> It would take two functions to do this. One to create the virtual module, >> and another to pre-load its initial objects. For those objects, the loader >> would set obj.__module__ to the virtual module name, and also set >> obj.__original_module__ to the original module name. These would only be >> seen on objects in virtual modules. A lookup on obj.__module__ will tell >> you it's in a virtual module. Then a lookup with obj.__original_module__ >> would give you the actual location info it came from. 
> > That adds a lot of complexity though - far simpler to define a new > __impl_module__ attribute on every object, retroactively fixing > introspection of existing code that adjusts __module__ to make > pickling work properly across different versions and implementations. > >> By doing it that way, most people will never need to know how these things >> work or even see them. i.e... It's advanced/expert Python foo. ;-) > > Most people will never need to care or worry about the difference > between __module__ and __impl_module__ either - it will be hidden > inside libraries like inspect, pydoc and pickle. I think __impl_module__ should only be on objects where it would be different from __module__. >> Anyway, I hope this gives you some ideas, I know you can figure out the >> details much better than I can. > Yeah, the idea of reusing the __module__ attribute name at the top > level is an excellent one. The hard part of all of this is separating the good, doable ideas from the good but undoable ideas - the ones that would break something. Cheers, Ron From ncoghlan at gmail.com Sun Jan 9 19:18:42 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 10 Jan 2011 04:18:42 +1000 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: <4D29F6C8.2010505@ronadam.com> References: <4D1BE506.2070806@ronadam.com> <4D267C8F.3050601@ronadam.com> <4D282917.3020606@ronadam.com> <4D29F6C8.2010505@ronadam.com> Message-ID: On Mon, Jan 10, 2011 at 3:56 AM, Ron Adam wrote: > On 01/09/2011 12:39 AM, Nick Coghlan wrote: >>> Also consider having virtual modules, where objects in it may have come >>> from >>> different *other* locations. A virtual module would need a way to keep >>> track >>> of that. (I'm not sure this is a good idea.) > >> It's too late, code already does that. 
This is precisely the use case >> I am trying to fix (objects like functools.partial that deliberately >> lie in their __module__ attribute), so that this can be done *right* >> (i.e. without having to choose which use cases to support and which >> ones to break). > > Yes, __builtins__ is a virtual module. No, it's a real module, just like all the others. >>>> sys.modules[new.__name__] > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > KeyError: 'new' > > And it's not in sys.modules yet. That's ok, other things can be loaded into > it before it's added to sys.modules. > > It's this loading part that can be improved. I don't understand the point of this tangent. The practice of how objects are merged into modules is already established: you use "import *" or some other form of import statement. I want to *make that work properly*, not invent a new way to do it. >> The basic problem is that __module__ currently tries to serve two >> masters: >> 1. use cases like inspect.getsource, where we want to know where the >> object came from in the current interpreter >> 2. use cases like pickle, where we want the "official" portable >> location, with any implementation details (like the _functools module) >> hidden. > > Most C extensions are written as modules, to be imported and imported from. > A tool to load objects rather than import them may be better in these > situations. > > partial = imp.load_extern_object("_functools.partial") > > A loaded object would have its __module__ attribute set to the module it's > loaded into instead of where it came from. > > By doing it this way, it doesn't complicate the import semantics. What complication to the import semantics? I'm not touching the import semantics, just the semantics for defining functions and classes. > It may also be useful to make it a special type, so that other tools can > decide how to handle them. No. 
The idea is to make existing code work properly, not force people to jump through new hoops. >> Currently, the default behaviour of the interpreter is to support use >> case 1 and break use case 2 if any objects are defined in a different >> module from where they claim to live (e.g. see the pickle >> compatibility breakage with the 3.2 unittest implementation layout >> changes). The only tool currently available to module authors is to >> override __module__ (as functools.partial and the datetime >> acceleration module do), which is correct for use case 2, but breaks >> use case 1 (leading to misleading error messages in the C acceleration >> module case, and breaking otherwise valid introspection in the >> unittest case). >> >> My proposed changes will: >> a) make overriding __module__ significantly easier to do >> b) allow the introspection use cases access to the information they >> need so they can do the right thing when confronted with an overridden >> __module__ attribute > > It would be better to find solutions that don't override __module__ after it > has been imported or loaded. Again, no. My aim is to make existing practices not break things, rather than trying to get people to change their practices. >> Most people will never need to care or worry about the difference >> between __module__ and __impl_module__ either - it will be hidden >> inside libraries like inspect, pydoc and pickle. > > I think __impl_module__ should only be on objects where it would be > different from __module__. How does introducing an inconsistency like that make anything simpler? Optional attributes are painful to deal with, so we only use them for things where we don't fully control their creation (e.g. when we add new attributes to modules, PEP 302 means we can't assume they will exist when the module code is running, as third party loaders may not include them when initialising the module namespace). That is unlikely to be the case here. Cheers, Nick. -- Nick Coghlan | 
ncoghlan at gmail.com | Brisbane, Australia From g.brandl at gmx.net Sun Jan 9 20:30:24 2011 From: g.brandl at gmx.net (Georg Brandl) Date: Sun, 09 Jan 2011 20:30:24 +0100 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> <4D267C8F.3050601@ronadam.com> <4D282917.3020606@ronadam.com> <4D29F6C8.2010505@ronadam.com> Message-ID: On 09.01.2011 19:18, Nick Coghlan wrote: > On Mon, Jan 10, 2011 at 3:56 AM, Ron Adam wrote: >> On 01/09/2011 12:39 AM, Nick Coghlan wrote: >>>> Also consider having virtual modules, where objects in it may have come >>>> from >>>> different *other* locations. A virtual module would need a way to keep >>>> track >>>> of that. (I'm not sure this is a good idea.) >> >>> It's too late, code already does that. This is precisely the use case >>> I am trying to fix (objects like functools.partial that deliberately >>> lie in their __module__ attribute), so that this can be done *right* >>> (i.e. without having to choose which use cases to support and which >>> ones to break). >> >> Yes, __builtins__ is a virtual module. > > No, it's a real module, just like all the others. __builtin__ (2.x) / builtins (3.x) is; __builtins__ you (Ron) should just forget about. Georg From rrr at ronadam.com Mon Jan 10 02:11:45 2011 From: rrr at ronadam.com (Ron Adam) Date: Sun, 09 Jan 2011 19:11:45 -0600 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> <4D267C8F.3050601@ronadam.com> <4D282917.3020606@ronadam.com> <4D29F6C8.2010505@ronadam.com> Message-ID: <4D2A5CD1.6020503@ronadam.com> On 01/09/2011 12:18 PM, Nick Coghlan wrote: > On Mon, Jan 10, 2011 at 3:56 AM, Ron Adam wrote: >> On 01/09/2011 12:39 AM, Nick Coghlan wrote: >>>> Also consider having virtual modules, where objects in it may have come >>>> from >>>> different *other* locations. A virtual module would need a way to keep >>>> track >>>> of that. 
(I'm not sure this is a good idea.) >> >>> It's too late, code already does that. This is precisely the use case >>> I am trying to fix (objects like functools.partial that deliberately >>> lie in their __module__ attribute), so that this can be done *right* >>> (i.e. without having to choose which use cases to support and which >>> ones to break). >> >> Yes, __builtins__ is a virtual module. > > No, it's a real module, just like all the others. As Georg pointed out it's "builtins". But you knew what I was referring to. ;-) I wasn't saying it's not a real module, but there are differences. Mainly builtins (and other C modules) don't have a file reference after they're imported, unlike modules written in Python.

>>> import dis
>>> dis
<module 'dis' from '/usr/local/lib/python3.2/dis.py'>
>>> dis.__file__
'/usr/local/lib/python3.2/dis.py'
>>> import builtins
>>> builtins
<module 'builtins' (built-in)>
>>> builtins.__file__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute '__file__'

So they appear as if they don't have a source. There is probably a better term for this than virtual. I was thinking it fits well for modules constructed in memory rather than ones built up by executing Python code directly. Hmmm... Should modules written in other languages have a __file__ attribute? Would that help introspection or in other ways? >> It's this loading part that can be improved. > > I don't understand the point of this tangent. The practice of how > objects are merged into modules is already established: you use > "import *" or some other form of import statement. I want to *make > that work properly*, not invent a new way to do it. Sorry, I was looking for ways to avoid changing __module__. All of the above ways will still have the __module__ attribute on objects set to the module they came from. Which again is fine, because that is what you want most of the time. Just not in the case of partial. Setting __module__ manually is easy enough in that case. > Cheers, > Nick. 
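The difference described above is easy to check in current CPython - modules loaded from files carry __file__, while modules compiled into the interpreter itself do not (note that extension modules built as shared libraries *do* get a __file__; only compiled-in modules like builtins lack one):

```python
import builtins
import dis

print(hasattr(dis, "__file__"))       # True - loaded from dis.py
print(hasattr(builtins, "__file__"))  # False - compiled into the interpreter
```
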
I think I'm more likely to sidetrack you at this point. I am starting to get familiar with the C code, but I still have a ways to go before I understand all the different parts. Getting there though. :-) On the Python side of things, the attributes we've been discussing almost never have anything to do with what most programs are written to do. Unless it's a program written specifically for managing Python's various parts. It's kind of like the problem of separating content, context, and presentation in web pages. Sometimes it's hard to do. Cheers, Ron From dickinsm at gmail.com Mon Jan 10 09:27:30 2011 From: dickinsm at gmail.com (Mark Dickinson) Date: Mon, 10 Jan 2011 08:27:30 +0000 Subject: [Python-ideas] Add irange with large integer step support to itertools In-Reply-To: <20110107112435.2ae46c89@Knock> References: <20110107112435.2ae46c89@Knock> Message-ID: On Fri, Jan 7, 2011 at 10:24 AM, Martin Manns wrote: > Hi > > I would like to propose an addition of an "irange" function to > itertools. This addition could reduce testing effort when developing > applications, in which large integers show up. > > Both xrange (Python 2.x) and range (Python 3.x) have limited support > for large integer step values, for example: > > Python 3.1.3 (r313:86834, Nov 28 2010, 10:01:07) > [GCC 4.4.5] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>>> range(10**10000, 10**10000+10**1000, 10**900)[5] > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > OverflowError: Python int too large to convert to C ssize_t This example strikes me as a bug in range (specifically, in range_subscript in Objects/rangeobject.c). > Does such an addition make sense in your eyes? Wouldn't it be better to fix 'range' to behave as expected? 
Mark From ncoghlan at gmail.com Mon Jan 10 12:26:01 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 10 Jan 2011 21:26:01 +1000 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: <4D2A5CD1.6020503@ronadam.com> References: <4D1BE506.2070806@ronadam.com> <4D267C8F.3050601@ronadam.com> <4D282917.3020606@ronadam.com> <4D29F6C8.2010505@ronadam.com> <4D2A5CD1.6020503@ronadam.com> Message-ID: On Mon, Jan 10, 2011 at 11:11 AM, Ron Adam wrote: > On the Python side of things, the attributes we've been discussing almost > never have anything to do with what most programs are written to do. Unless > it's a program written specifically for managing Python's various parts. It's > kind of like the problem of separating content, context, and presentation in > web pages. Sometimes it's hard to do. Yep - 99.99% of Python code will never care if this is ever fixed. However, the fact that we've started using acceleration modules and pseudo-packages in the standard library means that "things should just work" is being broken subtly in the stuff we're shipping ourselves (either by creating pickling problems, as in unittest, or misleading introspection results, as in functools and datetime). And if we're going to fix it at all, we may as well fix it right :) Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | 
Brisbane, Australia From fuzzyman at voidspace.org.uk Mon Jan 10 12:37:08 2011 From: fuzzyman at voidspace.org.uk (Michael Foord) Date: Mon, 10 Jan 2011 11:37:08 +0000 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> <4D267C8F.3050601@ronadam.com> <4D282917.3020606@ronadam.com> <4D29F6C8.2010505@ronadam.com> <4D2A5CD1.6020503@ronadam.com> Message-ID: On 10 January 2011 11:26, Nick Coghlan wrote: > On Mon, Jan 10, 2011 at 11:11 AM, Ron Adam wrote: > > On the python side of things, the attributes we've been discussing almost > > never have anything to do with what most programs are written to do. > Unless > > it's a program written specifically for managing pythons various parts. > It's > > kind of like the problem of separating content, context, and presentation > in > > web pages. Sometimes it's hard to do. > > Yep - 99.99% of python code will never care if this is ever fixed. > However, the fact that we've started using acceleration modules and > pseudo-packages in the standard library means that "things should just > work" is being broken subtly in the stuff we're shipping ourselves > (either by creating pickling problems, as in unittest, or misleading > introspection results, as in functools and datetime). > > And if we're going to fix it at all, we may as well fix it right :) > > I certainly don't object to fixing this, and neither do I object to adding a new class / module / function attribute to achieve it. However... is there anything else that this fixes? (Are there more examples "in the wild" where this would help?) The unittest problem with pickling is real but likely to only affect a very, very small number of users. The introspection problem (getsource) for functools and datetime isn't a *real* problem because the source code isn't available. 
If in fact getsource now points to the pure Python version even in the cases where the C versions are being used then "fixing" this seems like a step backwards... Python 3.2: >>> import inspect >>> from datetime import date >>> inspect.getsource(date) 'class date:\n    """Concrete date type.\n\n ...' Python 3.1: >>> import inspect >>> from datetime import date >>> inspect.getsource(date) Traceback (most recent call last): ... IOError: source code not available With your changes in place would Python 3.3 revert to the 3.1 behaviour here? How is this an advantage? What I'm really asking is, is the cure (and the accompanying implementation effort and additional complexity to the Python object model) worse than the disease... All the best, Michael Foord > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html From ncoghlan at gmail.com Mon Jan 10 12:52:36 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 10 Jan 2011 21:52:36 +1000 Subject: [Python-ideas] Add irange with large integer step support to itertools In-Reply-To: References: <20110107112435.2ae46c89@Knock> Message-ID: On Mon, Jan 10, 2011 at 6:27 PM, Mark Dickinson wrote: >> Python 3.1.3 (r313:86834, Nov 28 2010, 10:01:07) >> [GCC 4.4.5] on linux2 >> Type "help", "copyright", "credits" or "license" for more information. 
>> >>>>> range(10**10000, 10**10000+10**1000, 10**900)[5] >> Traceback (most recent call last): >> File "<stdin>", line 1, in <module> >> OverflowError: Python int too large to convert to C ssize_t Note that the problem isn't actually the step value - it's the overall length of the resulting sequence. If you make the sequence shorter, it works (at least in 3.2, I didn't check earlier versions): >>> x = range(10**10000, 10**10000+(500*10**900), 10**900) >>> len(x) 500 >>> x[5] > This example strikes me as a bug in range (specifically, in > range_subscript in Objects/rangeobject.c). The main issue is actually in range_item rather than range_subscript - we invoke range_len() there to simplify the bounds checking logic. To remove this limitation, the C arithmetic and comparison operations in that function need to be replaced with their PyLong equivalent, similar to what has been done for compute_range_length(). There's a related bug where range_subscript doesn't support *indices* greater than sys.maxsize - given an indexing helper function that can handle a range length that doesn't fit in sys.maxsize, it would be easy to call that unconditionally rather than indirectly via range_item, fixing that problem as well. >> Does such an addition make sense in your eyes? > > Wouldn't it be better to fix 'range' to behave as expected? Agreed. It isn't a deliberate design limitation - it's just a consequence of the fact that converting from C integer programming to PyLong programming is a PITA, so it has been a process of progressive upgrades in range's support for values that don't fit in sys.maxsize. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | 
Brisbane, Australia From ncoghlan at gmail.com Mon Jan 10 13:09:52 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 10 Jan 2011 22:09:52 +1000 Subject: [Python-ideas] Module aliases and/or "real names" In-Reply-To: References: <4D1BE506.2070806@ronadam.com> <4D267C8F.3050601@ronadam.com> <4D282917.3020606@ronadam.com> <4D29F6C8.2010505@ronadam.com> <4D2A5CD1.6020503@ronadam.com> Message-ID: On Mon, Jan 10, 2011 at 9:37 PM, Michael Foord wrote: > I certainly don't object to fixing this, and neither do I object to adding a > new class / module / function attribute to achieve it. > > However... is there anything else that this fixes? (Are there more examples > "in the wild" where this would help?) > > The unittest problem with pickling is real but likely to only affect a very, > very small number of users. The introspection problem (getsource) for > functools and datetime isn't a *real* problem because the source code isn't > available. If in fact getsource now points to the pure Python version even > in the cases where the C versions are being used then "fixing" this seems > like a step backwards... unittest is actually a better example, because there *is* a solution to your pickling problem: alter __module__ to say "unittest" rather than "unittest.", just as _functools.partial and the _datetime classes do. However, you've stated you don't want to do that because it would break introspection. That's a reasonable position to take, so the idea is to make it so you don't have to make that choice. Instead, you'll be able to happily adjust __module__ to make pickling work properly, while introspection will be able to fall back on __impl_module__ to get the correct information. > Python 3.2: >>>> import inspect >>>> from datetime import date >>>> inspect.getsource(date) > 'class date:\n    """Concrete date type.\n\n ...' > > Python 3.1: >>>> import inspect >>>> from datetime import date >>>> inspect.getsource(date) > Traceback (most recent call last): > ... 
> IOError: source code not available > > With your changes in place would Python 3.3 revert to the 3.1 behaviour > here? How is this an advantage? It's an improvement because the current answer is misleading: that source code is *not* what is currently running. You can change that source to your heart's content and it will do exactly *squat* when it comes to changing the interpreter's behaviour. That said, one of the benefits of this proposal is that we aren't restricted to the either/or behaviour. Since the interpreter will provide both pieces of information, we have plenty of opportunity to make inspect smarter about the situation. (e.g. only looking in __impl_module__ by default, but offering a flag to also check __module__ if no source is available from the implementation module). > What I'm really asking is, is the cure (and the accompanying implementation > effort and additional complexity to the Python object model) worse than the > disease... Potentially, but I see enough merit in the idea to follow up with a PEP for it. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From zac256 at gmail.com Mon Jan 10 15:55:06 2011 From: zac256 at gmail.com (Zac Burns) Date: Mon, 10 Jan 2011 22:55:06 +0800 Subject: [Python-ideas] Add irange with large integer step support to itertools In-Reply-To: References: <20110107112435.2ae46c89@Knock> Message-ID: -1 for any proposal that adds anything differentiating int/long. -Zac On Mon, Jan 10, 2011 at 7:52 PM, Nick Coghlan wrote: > On Mon, Jan 10, 2011 at 6:27 PM, Mark Dickinson > wrote: > >> Python 3.1.3 (r313:86834, Nov 28 2010, 10:01:07) > >> [GCC 4.4.5] on linux2 > >> Type "help", "copyright", "credits" or "license" for more information. 
> >> > >>>>> range(10**10000, 10**10000+10**1000, 10**900)[5] > >> Traceback (most recent call last): > >> File "<stdin>", line 1, in <module> > >> OverflowError: Python int too large to convert to C ssize_t > > Note that the problem isn't actually the step value - it's the overall > length of the resulting sequence. > > If you make the sequence shorter, it works (at least in 3.2, I didn't > check earlier versions): > > >>> x = range(10**10000, 10**10000+(500*10**900), 10**900) > >>> len(x) > 500 > >>> x[5] > > > > This example strikes me as a bug in range (specifically, in > > range_subscript in Objects/rangeobject.c). > > The main issue is actually in range_item rather than range_subscript - > we invoke range_len() there to simplify the bounds checking logic. To > remove this limitation, the C arithmetic and comparison operations in > that function need to be replaced with their PyLong equivalent, > similar to what has been done for compute_range_length(). > > There's a related bug where range_subscript doesn't support *indices* > greater than sys.maxsize - given an indexing helper function that can > handle a range length that doesn't fit in sys.maxsize, it would be > easy to call that unconditionally rather than indirectly via > range_item, fixing that problem as well. > > >> Does such an addition make sense in your eyes? > > > > Wouldn't it be better to fix 'range' to behave as expected? > > Agreed. It isn't a deliberate design limitation - it's just a > consequence of the fact that converting from C integer programming to > PyLong programming is a PITA, so it has been a process of progressive > upgrades in range's support for values that don't fit in sys.maxsize. > > Cheers, > Nick. 
> > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From ncoghlan at gmail.com Mon Jan 10 16:23:21 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 11 Jan 2011 01:23:21 +1000 Subject: [Python-ideas] Add irange with large integer step support to itertools In-Reply-To: References: <20110107112435.2ae46c89@Knock> Message-ID: On Tue, Jan 11, 2011 at 12:55 AM, Zac Burns wrote: > -1 for any proposal that adds anything differentiating int/long. It isn't about adding anything - the signature of the length slots at the C level already uses a Py_ssize_t, so any time you get the length of a container, you're limited to values that will fit in that size. This is fine for real containers, as you will run out of memory long before the container length overflows and throws an exception. It *is* an issue for a virtual container like range() though - because it doesn't actually *create* the whole range, it can be created with a length that exceeds what Py_ssize_t can handle. That's fine, until you run into one of the operations that directly or indirectly invokes len() on the object. Currently, indexing a range is such an operation (which is why it fails). While it's a fairly straightforward (albeit somewhat tedious) change to fix range_subscript and range_item to correctly handle cases where the index and/or the length exceed sys.maxsize, it still requires someone to actually create the issue on the tracker and then propose a patch to fix it. It's even theoretically possible to upgrade the __len__ protocol to support lengths that exceed Py_ssize_t, but that's a much more ambitious (PEP scale) project. Cheers, Nick. -- Nick Coghlan | 
Brisbane, Australia

From mmanns at gmx.net  Mon Jan 10 18:01:35 2011
From: mmanns at gmx.net (Martin Manns)
Date: Mon, 10 Jan 2011 18:01:35 +0100
Subject: [Python-ideas] Add irange with large integer step support to itertools
References: <20110107112435.2ae46c89@Knock>
Message-ID: <20110110180135.6d81c261@Knock>

On Mon, 10 Jan 2011 21:52:36 +1000
Nick Coghlan wrote:

> > Wouldn't it be better to fix 'range' to behave as expected?
>
> Agreed. It isn't a deliberate design limitation - it's just a
> consequence of the fact that converting from C integer programming to
> PyLong programming is a PITA, so it has been a process of progressive
> upgrades in range's support for values that don't fit in sys.maxsize.

Nick: So the limitation is not a deliberate design choice.

Looking at the tracker, a fix would probably be covered by issue2690.
I see that you have provided a patch there on Dec. 3. However, this
patch either does not address the problem or it has not been committed
to Py3k yet. I checked Py3k with an svn dump and it shows the same
behavior as Python 3.1.

original message by Nick Coghlan in issue2690:

> "I brought the patch up to date for the Py3k branch, but realised just
> before checking it in that it may run afoul of the language moratorium
> (since it alters the behaviour of builtin range objects)."

Does the patch address the issue or is it a more complicated problem?
If the former is the case then could issue2690 be re-opened and the
patch committed?

However, if the latter is the case then I still would like to propose
at least adding a snippet to the itertools docs because fixing the
issue properly could take its time.
Cheers Martin From stefan_ml at behnel.de Mon Jan 10 18:04:59 2011 From: stefan_ml at behnel.de (Stefan Behnel) Date: Mon, 10 Jan 2011 18:04:59 +0100 Subject: [Python-ideas] Add irange with large integer step support to itertools In-Reply-To: <20110110180135.6d81c261@Knock> References: <20110107112435.2ae46c89@Knock> <20110110180135.6d81c261@Knock> Message-ID: Martin Manns, 10.01.2011 18:01: > original message by Nick Coghlan in issue2690: > >> "I brought the patch up to date for the Py3k branch, but realised just >> before checking it in that it may run afoul of the language moratorium >> (since it alters the behaviour of builtin range objects)." The language moratorium does not apply here because the described behaviour is a limitation that is specific to the CPython implementation, not the language. Stefan From ncoghlan at gmail.com Mon Jan 10 18:42:43 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 11 Jan 2011 03:42:43 +1000 Subject: [Python-ideas] Add irange with large integer step support to itertools In-Reply-To: <20110110180135.6d81c261@Knock> References: <20110107112435.2ae46c89@Knock> <20110110180135.6d81c261@Knock> Message-ID: On Tue, Jan 11, 2011 at 3:01 AM, Martin Manns wrote: > original message by Nick Coghlan in issue2690: > >> "I brought the patch up to date for the Py3k branch, but realised just >> before checking it in that it may run afoul of the language moratorium >> (since it alters the behaviour of builtin range objects)." > > > Does the patch address the issue or is it a more complicated problem? It's a separate problem. Last paragraph of the message you quoted: >>Note that I also fixed the patch so that OverflowError occurs only when encountering an affected operation (primarily indexing and retrieval of the length). If you don't do any of those things, you can make your ranges as large as you like. 
(The indexing could fairly easily be fixed to eliminate the overflow
errors - I just didn't do it in this patch, since it is a separate
problem).

> However, if the latter is the case then I still would like to propose
> at least adding a snippet to the itertools docs because fixing the
> issue properly could take its time.

It's a separate issue. In the meantime, a recipe on the Python
Cookbook would probably be the most appropriate way to handle it.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com  Mon Jan 10 18:45:35 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 11 Jan 2011 03:45:35 +1000
Subject: [Python-ideas] Add irange with large integer step support to itertools
In-Reply-To:
References: <20110107112435.2ae46c89@Knock> <20110110180135.6d81c261@Knock>
Message-ID:

On Tue, Jan 11, 2011 at 3:04 AM, Stefan Behnel wrote:
> Martin Manns, 10.01.2011 18:01:
>>
>> original message by Nick Coghlan in issue2690:
>>
>>> "I brought the patch up to date for the Py3k branch, but realised just
>>> before checking it in that it may run afoul of the language moratorium
>>> (since it alters the behaviour of builtin range objects)."
>
> The language moratorium does not apply here because the described behaviour
> is a limitation that is specific to the CPython implementation, not the
> language.

The patch that message referred to was subsequently checked in, it
just didn't solve the bug mentioned in this thread :)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia

From rrr at ronadam.com  Mon Jan 10 18:55:45 2011
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 10 Jan 2011 11:55:45 -0600
Subject: [Python-ideas] Module aliases and/or "real names"
In-Reply-To:
References: <4D1BE506.2070806@ronadam.com> <4D267C8F.3050601@ronadam.com>
	<4D282917.3020606@ronadam.com> <4D29F6C8.2010505@ronadam.com>
	<4D2A5CD1.6020503@ronadam.com>
Message-ID: <4D2B4821.8000903@ronadam.com>

On 01/10/2011 05:26 AM, Nick Coghlan wrote:
> On Mon, Jan 10, 2011 at 11:11 AM, Ron Adam wrote:
>> On the python side of things, the attributes we've been discussing almost
>> never have anything to do with what most programs are written to do. Unless
>> it's a program written specifically for managing pythons various parts. It's
>> kind of like the problem of separating content, context, and presentation in
>> web pages. Sometimes it's hard to do.
>
> Yep - 99.99% of python code will never care if this is ever fixed.
> However, the fact that we've started using acceleration modules and
> pseudo-packages in the standard library means that "things should just
> work" is being broken subtly in the stuff we're shipping ourselves
> (either by creating pickling problems, as in unittest, or misleading
> introspection results, as in functools and datetime).
>
> And if we're going to fix it at all, we may as well fix it right :)

Fixing it right means taking a longer view. What would we like all this
stuff to look like two or more versions down the road? (Probably python
3.5 or 3.6)

Doing the minimum to fix just the immediate problems is a short term
view. That will work, but if we can align it with a longer term
solution, it would be better. If we can't decide what the long term
solution might be, then we may be better off using private attributes
and methods for now for these isolated situations.
How about making __module__ a property on accelerated objects, that looks
for a global flag, then returns either __module__ or __alt_module__
depending on the flag? (or some other way of storing those values)

Pickle could set the flag so it can get what it needs from __module__,
then unset it when it's done.

Cheers,
Ron

From eric at trueblade.com  Mon Jan 10 19:34:27 2011
From: eric at trueblade.com (Eric Smith)
Date: Mon, 10 Jan 2011 13:34:27 -0500
Subject: [Python-ideas] Module aliases and/or "real names"
In-Reply-To: <4D2B4821.8000903@ronadam.com>
References: <4D1BE506.2070806@ronadam.com> <4D267C8F.3050601@ronadam.com>
	<4D282917.3020606@ronadam.com> <4D29F6C8.2010505@ronadam.com>
	<4D2A5CD1.6020503@ronadam.com> <4D2B4821.8000903@ronadam.com>
Message-ID: <4D2B5133.1000002@trueblade.com>

On 01/10/2011 12:55 PM, Ron Adam wrote:
> How about making __module__ a property on accelerated objects, that
> looks for a global flag, then returns either __module__ or
> __alt_module__ depending on the flag? (or some other way of storing
> those values)

Global flag => threading problem.

From rrr at ronadam.com  Mon Jan 10 19:49:15 2011
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 10 Jan 2011 12:49:15 -0600
Subject: [Python-ideas] Module aliases and/or "real names"
In-Reply-To: <4D2B4821.8000903@ronadam.com>
References: <4D1BE506.2070806@ronadam.com> <4D267C8F.3050601@ronadam.com>
	<4D282917.3020606@ronadam.com> <4D29F6C8.2010505@ronadam.com>
	<4D2A5CD1.6020503@ronadam.com> <4D2B4821.8000903@ronadam.com>
Message-ID: <4D2B54AB.8000202@ronadam.com>

On 01/10/2011 11:55 AM, Ron Adam wrote:
> How about making __module__ a property on accelerated objects, that looks
> for a global flag, then returns either __module__ or __alt_module__
> depending on the flag? (or some other way of storing those values)
>
> Pickle could set the flag so it can get what it needs from __module__, then
> unset it when it's done.

Or this maybe should be the other way around.
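A minimal sketch of the flag idea above; every name here (_OFFICIAL, _ACTUAL, _use_real_name, AcceleratedThing) is invented for illustration, and, as Eric points out, a module-global flag like this is not thread-safe:

```python
# Sketch of Ron's idea: __module__ becomes a property that normally
# reports the official (documented) module name, but reports the real
# implementation module while a global flag is set. All names invented.
_use_real_name = False

class AcceleratedThing:
    _OFFICIAL = "functools"     # the documented, canonical module name
    _ACTUAL = "_functools"      # where the implementation actually lives

    @property
    def __module__(self):
        # pickle would set the flag, read the real name, then clear it
        return self._ACTUAL if _use_real_name else self._OFFICIAL

obj = AcceleratedThing()
print(obj.__module__)    # functools

_use_real_name = True    # what pickle would do temporarily
print(obj.__module__)    # _functools
_use_real_name = False
```

Note that the property shadows the `__module__` string that class creation would otherwise store, so this only works per-class, which is part of why a registry approach may be less invasive.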
When a module begins with an underscore it should be considered a private
implementation detail. So __module__, in the case of partial, is already
set to the correct value. But when the actual name is needed instead of
the official name, a global flag can be set. Then when __module__ is a
property, it will get the actual name, instead of the official name.

Cheers,
Ron

From ncoghlan at gmail.com  Tue Jan 11 19:27:12 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Jan 2011 04:27:12 +1000
Subject: [Python-ideas] Add irange with large integer step support to itertools
In-Reply-To:
References: <20110107112435.2ae46c89@Knock>
Message-ID:

On Tue, Jan 11, 2011 at 1:23 AM, Nick Coghlan wrote:
> Currently, indexing a range is such an operation (which is why it
> fails). While it's a fairly straightforward (albeit somewhat tedious)
> change to fix range_subscript and range_item to correctly handle cases
> where the index and/or the length exceed sys.maxsize, it still
> requires someone to actually create the issue on the tracker and then
> propose a patch to fix it.

As anticipated, a reasonably straightforward but somewhat tedious
change to make: http://bugs.python.org/issue10889 :)

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From ncoghlan at gmail.com  Wed Jan 12 04:25:20 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 12 Jan 2011 13:25:20 +1000
Subject: [Python-ideas] Add irange with large integer step support to itertools
In-Reply-To:
References: <20110107112435.2ae46c89@Knock>
Message-ID:

On Wed, Jan 12, 2011 at 4:27 AM, Nick Coghlan wrote:
> On Tue, Jan 11, 2011 at 1:23 AM, Nick Coghlan wrote:
>> Currently, indexing a range is such an operation (which is why it
>> fails).
>> While it's a fairly straightforward (albeit somewhat tedious)
>> change to fix range_subscript and range_item to correctly handle cases
>> where the index and/or the length exceed sys.maxsize, it still
>> requires someone to actually create the issue on the tracker and then
>> propose a patch to fix it.
>
> As anticipated, a reasonably straightforward but somewhat tedious
> change to make: http://bugs.python.org/issue10889 :)

This change has been committed to SVN, and will be available with
3.2rc1 later this week.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From rrr at ronadam.com  Wed Jan 12 11:31:12 2011
From: rrr at ronadam.com (Ron Adam)
Date: Wed, 12 Jan 2011 04:31:12 -0600
Subject: [Python-ideas] Module aliases and/or "real names"
In-Reply-To:
References:
Message-ID: <4D2D82F0.8000904@ronadam.com>

On 12/29/2010 03:52 PM, Nick Coghlan wrote:
> Disclaimer: this is a currently half-baked idea that needs some
> discussion here if it is going to turn into something a bit more
> coherent :)
>
> On and off, I've been pondering the problem of the way implementation
> details (like the real file structures of the multiprocessing and
> unittest packages, or whether or not an interpreter uses the pure
> Python or the C accelerated version of various modules) leak out into
> the world via the __module__ attribute on various components. This
> mostly comes up when discussing pickle compatibility between 2.x and
> 3.x, but it can show up in various guises whenever you start relying
> on dynamic introspection.
>
> As I see it, there are 3 basic ways of dealing with the problem:
>
> 1. Allow objects to lie about their source module
> This is likely a terrible idea, since a function's global namespace
> reference would disagree with its module reference. I suspect much
> weirdness would result.
>
> 2.
A pickle-specific module alias registry, since that is where the
> problem comes up most often
> A possible approach, but not necessarily a good one (since it isn't
> really a pickle-specific problem).
>
> 3. An inspect-based module alias registry
> That is, an additional query API (get_canonical_module_name?) in the
> inspect module that translates from the implementation detail module
> name to the "preferred" module name. The implementation could be as
> simple as a "__canonical__" attribute in the module namespace.
>
> I actually quite like option 3, with various things (such as pydoc)
> updated to show *both* names when they're different. That way people
> will know where to find official documentation for objects from
> pseudo-packages and acceleration modules (i.e. under the canonical
> name), without hiding where the actual implementation came from.
>
> Pickle *generation* could then be updated to only send canonical
> module names during normal operation, reducing the exposure of
> implementation details like pseudo-packages and acceleration modules.
>
> Whether or not runpy should set __canonical__ on the main module would
> be an open question (probably not, *unless* runpy was also updated to
> add the main module to sys.modules under its real name as well as
> __main__).

This makes more sense now that we've discussed it a bit.

Here's a rough sketch of a context manager that temporarily overrides
the __module__ attribute. This works well for simple introspection.
For example, you can use it to call inspect functions without changing
them.

But pickling is recursive, so this probably wouldn't work very well
for that.
Cheers,
Ron

#-------------------------------
from contextlib import contextmanager

class cls:
    def method(self):
        pass

c = cls()
InstanceMethod = type(c.method)

def _getter(self, value):
    if value == "__module__" and hasattr(self, "__alt_module__"):
        return object.__getattribute__(self, "__alt_module__")
    return object.__getattribute__(self, value)

@contextmanager
def alt_module_getter(obj):
    obj.__class__.__getattribute__ = InstanceMethod(_getter, obj)
    try:
        yield obj
    finally:
        del obj.__class__.__getattribute__

def get_module_name(obj):
    return obj.__module__

# gets __alt_module__ if it exists, else gets __module__
with alt_module_getter(obj) as obj:
    module_name = get_module_name(obj)

From luc.goossens at cern.ch  Thu Jan 13 15:30:26 2011
From: luc.goossens at cern.ch (Luc Goossens)
Date: Thu, 13 Jan 2011 15:30:26 +0100
Subject: [Python-ideas] values in vs. values out
Message-ID: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch>

Hi all,

There's a striking asymmetry between the wonderful flexibility in
passing values into functions (positional args, keyword args, default
values, *args, **kwargs, ...) and the limited options for processing
the return values (assignment).
Hence, whenever I upgrade a function with a new keyword arg and a
default value, I do not have to change any of the existing calls,
whereas whenever I add a new element to its output tuple, I find
myself chasing all existing code to upgrade the corresponding
assignments with an additional (unused) variable.
So I was wondering whether this was ever discussed before (and
recorded) inside the Python community.
(naively what seems to be missing is the ability to use the assignment
machinery that binds functions' formal params to the given actual
param list also in the context of a return value assignment)

cheers,
Luc

From rob.cliffe at btinternet.com  Thu Jan 13 15:56:13 2011
From: rob.cliffe at btinternet.com (Rob Cliffe)
Date: Thu, 13 Jan 2011 14:56:13 +0000
Subject: [Python-ideas] values in vs.
values out
In-Reply-To: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch>
References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch>
Message-ID: <4D2F128D.1020105@btinternet.com>

To deal with specifically adding a new value to a returned tuple, you
could write your function calls to truncate the tuple to the expected
length, e.g.

def myfunc():
    ...
    return (result1, result2, newresult)

x,y = myfunc()[:2]
x,y,z = myfunc()[:3]

So you would have to change all the relevant function calls, but only
once.

More generally, perhaps you could return a dictionary. Although this
makes the function calls a bit more awkward:

results = myfunc()
x, y = results['result1'], results['result2']

Best wishes
Rob Cliffe

On 13/01/2011 14:30, Luc Goossens wrote:
> Hi all,
>
> There's a striking asymmetry between the wonderful flexibility in
> passing values into functions (positional args, keyword args, default
> values, *args, **kwargs, ...) and the limited options for processing
> the return values (assignment).
> Hence, whenever I upgrade a function with a new keyword arg and a
> default value, I do not have to change any of the existing calls,
> whereas whenever I add a new element to its output tuple, I find
> myself chasing all existing code to upgrade the corresponding
> assignments with an additional (unused) variable.
> So I was wondering whether this was ever discussed before (and
> recorded) inside the Python community.
> (naively what seems to be missing is the ability to use the assignment
> machinery that binds functions' formal params to the given actual
> param list also in the context of a return value assignment)
>
> cheers,
> Luc

From eric at trueblade.com  Thu Jan 13 16:00:15 2011
From: eric at trueblade.com (Eric Smith)
Date: Thu, 13 Jan 2011 10:00:15 -0500
Subject: [Python-ideas] values in vs.
values out In-Reply-To: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> Message-ID: <4D2F137F.4030405@trueblade.com> On 01/13/2011 09:30 AM, Luc Goossens wrote: > Hi all, > > There's a striking asymmetry between the wonderful flexibility in > passing values into functions (positional args, keyword args, default > values, *args, **kwargs, ...) and the limited options for processing the > return values (assignment). > Hence, whenever I upgrade a function with a new keyword arg and a > default value, I do not have to change any of the existing calls, > whereas whenever I add a new element to its output tuple, I find myself > chasing all existing code to upgrade the corresponding assignments with > an additional (unused) variable. > So I was wondering whether this was ever discussed before (and recorded) > inside the Python community. > (naively what seems to be missing is the ability to use the assignment > machinery that binds functions' formal params to the given actual param > list also in the context of a return value assignment) You can achieve something similar with PEP 3132's Extended Iterable Unpacking: >>> def f(): return 0, 1, 2, 3 ... >>> a, b, c, d, *unused = f() >>> a, b, c, d, unused (0, 1, 2, 3, []) If you add more return values, they show up in unused. >>> def f(): return 0, 1, 2, 3, 4 ... >>> a, b, c, d, *unused = f() # note caller is unchanged >>> a, b, c, d, unused (0, 1, 2, 3, [4]) Or you could return dicts. Eric. From ben+python at benfinney.id.au Thu Jan 13 16:13:24 2011 From: ben+python at benfinney.id.au (Ben Finney) Date: Fri, 14 Jan 2011 02:13:24 +1100 Subject: [Python-ideas] values in vs. 
values out
References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch>
Message-ID: <87y66oalbf.fsf@benfinney.id.au>

Luc Goossens writes:

> Hence, whenever I upgrade a function with a new keyword arg and a
> default value, I do not have to change any of the existing calls,
> whereas whenever I add a new element to its output tuple, I find
> myself chasing all existing code to upgrade the corresponding
> assignments with an additional (unused) variable.

If your function is returning a bunch of related values in a tuple, and
that tuple keeps changing as you re-design the code, that's a code
smell. The tuple should instead be a user-defined type (defined with
"class"), the elements of the tuple should instead be attributes of the
type, and the return value should be a single object of that type.

The type can grow new attributes as you change the design, without the
calling code needing to know every attribute. This refactoring is
called "Replace Array With Object" in the Java world, but it's just as
applicable in Python.

--
 \       "How wonderful that we have met with a paradox. Now we have |
  `\     some hope of making progress." --Niels Bohr |
_o__) |
Ben Finney

From luc.goossens at cern.ch  Thu Jan 13 16:21:14 2011
From: luc.goossens at cern.ch (Luc Goossens)
Date: Thu, 13 Jan 2011 16:21:14 +0100
Subject: [Python-ideas] values in vs. values out
In-Reply-To: <4D2F137F.4030405@trueblade.com>
References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch>
	<4D2F137F.4030405@trueblade.com>
Message-ID: <6985D3AD-9B9A-4457-96DE-5F1BB52F530B@cern.ch>

Hi Eric (and Rob, and Ben, ...),

Sorry maybe this was not clear from my mail but I am not so much
interested in possible work-arounds but in why this asymmetry exists
in the first place.
I mean is there a reason as to why it is the way it is, or is it just
that nobody ever asked for anything else.
cheers, Luc On Jan 13, 2011, at 4:00 PM, Eric Smith wrote: > On 01/13/2011 09:30 AM, Luc Goossens wrote: >> Hi all, >> >> There's a striking asymmetry between the wonderful flexibility in >> passing values into functions (positional args, keyword args, default >> values, *args, **kwargs, ...) and the limited options for >> processing the >> return values (assignment). >> Hence, whenever I upgrade a function with a new keyword arg and a >> default value, I do not have to change any of the existing calls, >> whereas whenever I add a new element to its output tuple, I find >> myself >> chasing all existing code to upgrade the corresponding assignments >> with >> an additional (unused) variable. >> So I was wondering whether this was ever discussed before (and >> recorded) >> inside the Python community. >> (naively what seems to be missing is the ability to use the >> assignment >> machinery that binds functions' formal params to the given actual >> param >> list also in the context of a return value assignment) > > You can achieve something similar with PEP 3132's Extended Iterable > Unpacking: > > >>> def f(): return 0, 1, 2, 3 > ... > >>> a, b, c, d, *unused = f() > >>> a, b, c, d, unused > (0, 1, 2, 3, []) > > If you add more return values, they show up in unused. > > >>> def f(): return 0, 1, 2, 3, 4 > ... > >>> a, b, c, d, *unused = f() # note caller is unchanged > >>> a, b, c, d, unused > (0, 1, 2, 3, [4]) > > Or you could return dicts. > > Eric. > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From steve at pearwood.info Thu Jan 13 16:27:19 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 14 Jan 2011 02:27:19 +1100 Subject: [Python-ideas] values in vs. 
values out
In-Reply-To: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch>
References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch>
Message-ID: <4D2F19D7.2040602@pearwood.info>

Luc Goossens wrote:
> Hi all,
>
> There's a striking asymmetry between the wonderful flexibility in
> passing values into functions (positional args, keyword args, default
> values, *args, **kwargs, ...) and the limited options for processing the
> return values (assignment).

You're not limited to returning tuples. You could return an object with
named attributes, or a namedtuple, or even just a dict. There's
precedent in the standard library, for example, os.stat.

Except in the case of tuple unpacking, this does mean that assigning the
result of the function call is a two stage procedure:

t = func(x, y, z)
a, b, c = t.spam, t.ham, t.cheese

but it does give you flexibility in adding new return fields without
having to update function calls that don't use the new fields.

--
Steven

From eric at trueblade.com  Thu Jan 13 16:33:02 2011
From: eric at trueblade.com (Eric Smith)
Date: Thu, 13 Jan 2011 10:33:02 -0500
Subject: [Python-ideas] values in vs. values out
In-Reply-To: <6985D3AD-9B9A-4457-96DE-5F1BB52F530B@cern.ch>
References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch>
	<4D2F137F.4030405@trueblade.com>
	<6985D3AD-9B9A-4457-96DE-5F1BB52F530B@cern.ch>
Message-ID: <4D2F1B2E.9020108@trueblade.com>

On 01/13/2011 10:21 AM, Luc Goossens wrote:
> Hi Eric (and Rob, and Ben, ...),
>
> Sorry maybe this was not clear from my mail but I am not so much
> interested in possible work-arounds but in why this asymmetry exists in
> the first place.
> I mean is there a reason as to why it is the way it is, or is it just
> that nobody ever asked for anything else.

If the system automatically ignored "new" return values (for whatever
"new" might mean), I think it would be too easy to miss return values
that you don't mean to be ignoring.

Eric.
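Steven's suggestion can be sketched concretely with a namedtuple; the field names below are just the ones from his example, with `cheese` playing the role of a field added in a later version:

```python
from collections import namedtuple

# Hypothetical result type: suppose "cheese" is a newly added field.
# Old call sites that unpack by attribute keep working unchanged.
Result = namedtuple("Result", ["spam", "ham", "cheese"])

def func(x, y, z):
    return Result(spam=x, ham=y, cheese=z)

t = func(1, 2, 3)
a, b = t.spam, t.ham    # an old call site that predates "cheese"
print(a, b)             # 1 2
print(t.cheese)         # the new field, read only where it's wanted
```

Because callers name the fields they use instead of unpacking the whole tuple positionally, growing the result does not break them, which is exactly the asymmetry-with-keyword-arguments point being made here.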
From ddasilva at umd.edu  Thu Jan 13 16:50:22 2011
From: ddasilva at umd.edu (Daniel da Silva)
Date: Thu, 13 Jan 2011 10:50:22 -0500
Subject: [Python-ideas] values in vs. values out
In-Reply-To: <4D2F1B2E.9020108@trueblade.com>
References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch>
	<4D2F137F.4030405@trueblade.com>
	<6985D3AD-9B9A-4457-96DE-5F1BB52F530B@cern.ch>
	<4D2F1B2E.9020108@trueblade.com>
Message-ID:

If the return value is an instance of a class, then to extend the return
value you just add a new instance attribute to the class. If a class
feels too heavy duty, use a single named tuple and access its elements
with dot notation. Either method is guaranteed not to break until you
remove an instance attribute or element, at which point it doesn't make
sense to do anything else.

On Thu, Jan 13, 2011 at 10:33 AM, Eric Smith wrote:

> On 01/13/2011 10:21 AM, Luc Goossens wrote:
>
>> Hi Eric (and Rob, and Ben, ...),
>>
>> Sorry maybe this was not clear from my mail but I am not so much
>> interested in possible work-arounds but in why this asymmetry exists in
>> the first place.
>> I mean is there a reason as to why it is the way it is, or is it just
>> that nobody ever asked for anything else.
>>
>
> If the system automatically ignored "new" return values (for whatever "new"
> might mean), I think it would be too easy to miss return values that you
> don't mean to be ignoring.
>
> Eric.

From rich at noir.com  Thu Jan 13 19:16:58 2011
From: rich at noir.com (K. Richard Pixley)
Date: Thu, 13 Jan 2011 10:16:58 -0800
Subject: [Python-ideas] values in vs.
values out In-Reply-To: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> Message-ID: <4D2F419A.30703@noir.com> Sounds like you'd be happier returning named tuples. I'm sure I saw something about a named tuple package change recently but I can't find it now. Perhaps someone else will have it. --rich On 1/13/11 06:30 , Luc Goossens wrote: > Hi all, > > There's a striking asymmetry between the wonderful flexibility in > passing values into functions (positional args, keyword args, default > values, *args, **kwargs, ...) and the limited options for processing > the return values (assignment). > Hence, whenever I upgrade a function with a new keyword arg and a > default value, I do not have to change any of the existing calls, > whereas whenever I add a new element to its output tuple, I find > myself chasing all existing code to upgrade the corresponding > assignments with an additional (unused) variable. > So I was wondering whether this was ever discussed before (and > recorded) inside the Python community. > (naively what seems to be missing is the ability to use the assignment > machinery that binds functions' formal params to the given actual > param list also in the context of a return value assignment) > > cheers, > Luc > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From alexander.belopolsky at gmail.com Thu Jan 13 20:12:04 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 13 Jan 2011 14:12:04 -0500 Subject: [Python-ideas] values in vs. values out In-Reply-To: <6985D3AD-9B9A-4457-96DE-5F1BB52F530B@cern.ch> References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> <4D2F137F.4030405@trueblade.com> <6985D3AD-9B9A-4457-96DE-5F1BB52F530B@cern.ch> Message-ID: On Thu, Jan 13, 2011 at 10:21 AM, Luc Goossens wrote: .. 
> Sorry maybe this was not clear from my mail but I am not so much interested > in possible work-arounds but in why this asymmetry exists in the first > place. It looks like you are asking why tuple unpacking syntax does not support all options available to argument passing. Part of it (the variable length unpacking) is the subject of PEP 3132 , which was approved, but the implementation was postponed due to the moratorium on language changes in effect for 3.2 release. Note that PEP 3132 does not really achieve symmetry with argument passing because it makes (a, *x, b) = .. valid while f(a, *x, b) is not. > I mean is there a reason as to why it is the way it is, or is it just that > nobody ever asked for anything else. No one has ever proposed a design in which tuple unpacking and argument passing is "symmetric". This may very well be impossible. From alexander.belopolsky at gmail.com Thu Jan 13 20:23:42 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 13 Jan 2011 14:23:42 -0500 Subject: [Python-ideas] values in vs. values out In-Reply-To: References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> <4D2F137F.4030405@trueblade.com> <6985D3AD-9B9A-4457-96DE-5F1BB52F530B@cern.ch> Message-ID: On Thu, Jan 13, 2011 at 2:12 PM, Alexander Belopolsky wrote: > .. ?Part of it (the > variable length unpacking) is the subject of PEP 3132 > , which was approved, but > the implementation was postponed due to the moratorium on language > changes in effect for 3.2 release. I should have checked before posting. In py3k: >>> a, *b = range(10) >>> b [1, 2, 3, 4, 5, 6, 7, 8, 9] See also http://bugs.python.org/issue2292 . From masklinn at masklinn.net Thu Jan 13 20:27:39 2011 From: masklinn at masklinn.net (Masklinn) Date: Thu, 13 Jan 2011 20:27:39 +0100 Subject: [Python-ideas] values in vs. 
values out In-Reply-To: References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> <4D2F137F.4030405@trueblade.com> <6985D3AD-9B9A-4457-96DE-5F1BB52F530B@cern.ch> Message-ID: On 2011-01-13, at 20:12 , Alexander Belopolsky wrote: > >> I mean is there a reason as to why it is the way it is, or is it just that >> nobody ever asked for anything else. > > No one has ever proposed a design in which tuple unpacking and > argument passing is "symmetric". This may very well be impossible. Indeed, barring partial dictionary matching (take PEP 3132 and now do the same with dicts) *and* a new object which combines attributes of tuples and dictionaries (and lets the user match them both at once, some kind of ordered dictionary with offsets so the initial values can be purely positional) I don't see how it would be possible to replicate Python's breadth of arguments unpacking in return values. Considering Python's tendency to hedge towards functional programming and pattern matching (these days it runs the opposite way as fast as possible and doesn't stop until it's gone through a few borders), I don't see it happening any time soon, even ignoring the debatable value of the scheme. From ianb at colorstudy.com Thu Jan 13 20:41:30 2011 From: ianb at colorstudy.com (Ian Bicking) Date: Thu, 13 Jan 2011 13:41:30 -0600 Subject: [Python-ideas] values in vs. values out In-Reply-To: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> Message-ID: On Thu, Jan 13, 2011 at 8:30 AM, Luc Goossens wrote: > There's a striking asymmetry between the wonderful flexibility in passing > values into functions (positional args, keyword args, default values, *args, > **kwargs, ...) and the limited options for processing the return values > (assignment). 
> Hence, whenever I upgrade a function with a new keyword arg and a default > value, I do not have to change any of the existing calls, whereas whenever I > add a new element to its output tuple, I find myself chasing all existing > code to upgrade the corresponding assignments with an additional (unused) > variable. > So I was wondering whether this was ever discussed before (and recorded) > inside the Python community. > (naively what seems to be missing is the ability to use the assignment > machinery that binds functions' formal params to the given actual param list > also in the context of a return value assignment) > I have often thought that I'd like a way to represent the arguments to a function. (args, kwargs) is what I usually use, but func(*thing[0], **thing[1]) is very unsatisfying. I'd like, um, func(***thing) ;) Interestingly you have traditionally been able to do things like "def func(a, (b, c))" (removed in py3, right?) -- but it created a sense of symmetry between assignment and function signatures. But of course keyword arguments aren't quite the same (nor are named parameters, but I'll ignore that). So it would be neat if you could do: (a, b, c=3) = func(...) where this was essentially like: result = func(...) (a, b) = result.args c = result.kwargs.get('c', 3) Where result was some new tuple-dict hybrid object. -- Ian Bicking | http://blog.ianbicking.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Thu Jan 13 20:44:02 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 13 Jan 2011 14:44:02 -0500 Subject: [Python-ideas] values in vs. values out In-Reply-To: References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> <4D2F137F.4030405@trueblade.com> <6985D3AD-9B9A-4457-96DE-5F1BB52F530B@cern.ch> Message-ID: On Thu, Jan 13, 2011 at 2:27 PM, Masklinn wrote: > ..
I don't see how it would be possible to replicate Python's breadth of arguments unpacking in return values. What I do miss sometimes is the ability to inject the contents of a dictionary into locals. For example, when I get the results of a database query in a list of dictionaries or named tuples, I would like to do something like for in sql('select name, age from students'): print(name, age) I can achieve that with hacks like for x in sql('select name, age from students'): locals().update(*x) print(name, age) but I don't think this is guaranteed to work and it is ugly and inefficient. From jason.orendorff at gmail.com Thu Jan 13 21:11:00 2011 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Thu, 13 Jan 2011 14:11:00 -0600 Subject: [Python-ideas] values in vs. values out In-Reply-To: <6985D3AD-9B9A-4457-96DE-5F1BB52F530B@cern.ch> References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> <4D2F137F.4030405@trueblade.com> <6985D3AD-9B9A-4457-96DE-5F1BB52F530B@cern.ch> Message-ID: On Thu, Jan 13, 2011 at 9:21 AM, Luc Goossens wrote: > Sorry maybe this was not clear from my mail but I am not so much interested > in possible work-arounds but in why this asymmetry exists in the first > place. > I mean is there a reason as to why it is the way it is, or is it just that > nobody ever asked for anything else. There are two kinds of asymmetry here. One is semantic and one is syntactic. 1. Semantically, function calls are fundamentally asymmetric in Python. A call takes as its input a tuple of arguments and a dictionary of keyword arguments, but its output is either a single return value or a single raised exception. 2. Syntactically, the syntax for composing a value (tuple expressions, list/set/dict displays, constructor calls) differs from the syntax for decomposing a value into its parts (unpacking assignment). The ML family of programming languages eliminate both asymmetries about as completely as I can imagine. 
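Incidentally, the locals().update() hack quoted earlier in the thread really is not guaranteed to work. In CPython, calling locals() inside a function returns a snapshot of the local namespace, and updating that snapshot does not bind any names in the function. A small demonstration (the dictionary literal here simply stands in for one row of the hypothetical sql() results):

```python
# CPython-specific behaviour: inside a function, locals() is a copy,
# so update() has no effect on the actual local variables.
def inject():
    locals().update({'name': 'Ada', 'age': 36})
    try:
        return name, age  # neither name was actually bound
    except NameError:
        return None

print(inject())  # None under CPython
```

At module level the hack can appear to work, because there locals() and globals() are the same dictionary, which is exactly why it is so easy to be misled by it.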
ML functions take one argument and return one value; either can be a tuple. The same pattern-matching syntax is used to cope with parameters and return values. To a very great degree the syntax for composing a tuple, record, or list is the same as the syntax for decomposing it. So what you're asking is demonstrably possible, at least for other languages. So why does Python have these asymmetries? 1. The semantic asymmetry (functions taking multiple parameters but returning a single value) is a subtle thing. Even in Scheme, where conceptual purity and treating continuations as procedures are core design principles of the entire language, this asymmetry is baked into (lambda) and the behavior of function calls. And even in ML there is *some* asymmetry; a function can die with an error rather than return anything. (You can "error out" but not "error in".) In Python's design, I imagine Guido found this particular asymmetry made the language fit the brain better. It's more like C. The greater symmetry in languages like ML may have felt like too much--and one more unfamiliar thing for new users to trip over. In any case it would be impractical to change this in Python. It's baked into the language, the implementation, and the C API. 2. The syntactic asymmetry is made up of lots of little asymmetries, and I think it's enlightening to take a few of them case by case. (a) You can write [a, b] = [1, 2] but not {a, b} = {1, 2} or {"A": a, "B": b} = {"A": 1, "B": 2} Sets have no deterministic order, so the second possibility is misleading. The third is not implemented, I imagine, purely for usability reasons: it would do more harm than good. (b) You can write x = complex(1, 2) but not complex(a, b) = x In ML-like languages, you can identify constructors at compile time, so it's clear on the left-hand side of something like this what variables are being defined. In Python it's not so obvious what this is supposed to do. 
(c) Unlike ML, you can write (a, b) = [1, 2] or generally a, b = any_iterable It is useful for unpacking to depend on the iterable protocol rather than the exact type of the right-hand side. This is a nicety that ML-like languages don't bother with, afaik. (d) You can write def f(x, y, a) : ... f(**dct) but not (x, y, a) = **dct and conversely you can write lst[1] = x but not def f(lst[1]): ... f(x) In both cases, I find the accepted syntax sane, and the symmetric-but-currently-not-accepted syntax baffling. Note that in the case of lst[1] = x, we are mutating an existing object, something ML does not bother to make easy. All four of these cases seem to boil down to what's useful vs. what's confusing. You could go on for some time in that vein. Hope this helps. -j From raymond.hettinger at gmail.com Thu Jan 13 21:10:56 2011 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Thu, 13 Jan 2011 12:10:56 -0800 Subject: [Python-ideas] values in vs. values out In-Reply-To: References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> Message-ID: <4C06659A-9226-4E97-87ED-F88CF1958C2A@gmail.com> > On Thu, Jan 13, 2011 at 8:30 AM, Luc Goossens wrote: > There's a striking asymmetry between the wonderful flexibility in passing values into functions (positional args, keyword args, default values, *args, **kwargs, ...) and the limited options for processing the return values (assignment). As others have mentioned, if you return a dictionary or a named tuple from the function, you will have a little more flexibility with respect to argument order. In the end no matter what is done, there is still going to be a pretty tight semantic coupling between what a function returns and how the caller accesses it, so there are limits to what you can achieve with syntax. I would like to note that the complexity of passing arguments into functions is not a pure win. The flexibility has a cost in terms of complexity, learnability, speed, and implementation challenges. 
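The named-tuple approach mentioned above can be sketched like this (DivResult and divmod_named are illustrative names, not anything proposed in the thread). Because callers bind results by field name, adding a new field to the result later does not break existing call sites:

```python
from collections import namedtuple

# Callers access fields by name, so the producer can grow the result
# type without disturbing code that only reads the existing fields.
DivResult = namedtuple('DivResult', ['quotient', 'remainder'])

def divmod_named(x, y):
    return DivResult(quotient=x // y, remainder=x % y)

r = divmod_named(17, 5)
print(r.quotient, r.remainder)

# A named tuple is still a tuple, so positional unpacking keeps working:
q, rem = divmod_named(17, 5)
```

The coupling Raymond describes is still there, of course: the caller has to know the field names, and renaming a field is just as disruptive as reordering a plain tuple.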
ISTM that very few people fully grok all of the existing capabilities. I don't think that adding yet more complexity to the language would be a net win. As C++ has shown, when you start getting too feature-rich, the features will interact in unexpected ways. For example, how would all those options for processing return values interact with augmented assignment? Raymond FWIW, here's an excerpt from Grammar/Grammar: funcdef: 'def' NAME parameters ['->' test] ':' suite parameters: '(' [typedargslist] ')' typedargslist: (tfpdef ['=' test] (',' tfpdef ['=' test])* [',' ['*' [tfpdef] (',' tfpdef ['=' test])* [',' '**' tfpdef] | '**' tfpdef]] | '*' [tfpdef] (',' tfpdef ['=' test])* [',' '**' tfpdef] | '**' tfpdef) tfpdef: NAME [':' test] varargslist: (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' ['*' [vfpdef] (',' vfpdef ['=' test])* [',' '**' vfpdef] | '**' vfpdef]] | '*' [vfpdef] (',' vfpdef ['=' test])* [',' '**' vfpdef] | '**' vfpdef) vfpdef: NAME -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Thu Jan 13 22:10:30 2011 From: masklinn at masklinn.net (Masklinn) Date: Thu, 13 Jan 2011 22:10:30 +0100 Subject: [Python-ideas] values in vs. values out In-Reply-To: References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> <4D2F137F.4030405@trueblade.com> <6985D3AD-9B9A-4457-96DE-5F1BB52F530B@cern.ch> Message-ID: On 2011-01-13, at 21:11 , Jason Orendorff wrote: > > (c) Unlike ML, you can write > (a, b) = [1, 2] > or generally > a, b = any_iterable > It is useful for unpacking to depend on the iterable protocol > rather than the exact type of the right-hand side. This is a > nicety that ML-like languages don't bother with, afaik. In no small part because, in ML-type languages (or more generally in functional languages, Erlang could hardly be called an ML-like language) lists (or more generally sequences) and tuples are very different beasts and entirely incompatible. 
So the language has no way to make sense of a pattern match between a tuple and a list, except maybe by considering that the list is a cons and that a tuple is a dotted pair. From jason.orendorff at gmail.com Thu Jan 13 22:31:39 2011 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Thu, 13 Jan 2011 15:31:39 -0600 Subject: [Python-ideas] values in vs. values out In-Reply-To: References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> <4D2F137F.4030405@trueblade.com> <6985D3AD-9B9A-4457-96DE-5F1BB52F530B@cern.ch> Message-ID: On Thu, Jan 13, 2011 at 3:10 PM, Masklinn wrote: > On 2011-01-13, at 21:11 , Jason Orendorff wrote: >> (c) Unlike ML, you can write >> (a, b) = [1, 2] >> or generally >> a, b = any_iterable >> It is useful for unpacking to depend on the iterable protocol >> rather than the exact type of the right-hand side. This is a >> nicety that ML-like languages don't bother with, afaik. > In no small part because, in ML-type languages (or more generally in > functional languages, Erlang could hardly be called an ML-like language) > lists (or more generally sequences) and tuples are very different beasts and > entirely incompatible. Well, sure, as far as tuples go. But the point I was making was more general. Python has a notion of "iterable" which covers many types, not just "list". The iterable protocol is used by Python's for-loops, sorted(), str.join() and so on; it's only natural for unpacking assignment to use it as well. As far as I know, most ML languages don't have that notion.* So Python has a reason for this asymmetry that those languages don't have. -j *Haskell, to be sure, has several typeclasses that generalize List, but for whatever reason it is List, and not any of the generalizations, that is baked into the language. From rrr at ronadam.com Thu Jan 13 23:56:20 2011 From: rrr at ronadam.com (Ron Adam) Date: Thu, 13 Jan 2011 16:56:20 -0600 Subject: [Python-ideas] values in vs. 
values out In-Reply-To: References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> Message-ID: On 01/13/2011 01:41 PM, Ian Bicking wrote: > On Thu, Jan 13, 2011 at 8:30 AM, Luc Goossens > > wrote: > > There's a striking asymmetry between the wonderful flexibility in > passing values into functions (positional args, keyword args, default > values, *args, **kwargs, ...) and the limited options for processing > the return values (assignment). > Hence, whenever I upgrade a function with a new keyword arg and a > default value, I do not have to change any of the existing calls, > whereas whenever I add a new element to its output tuple, I find myself > chasing all existing code to upgrade the corresponding assignments with > an additional (unused) variable. > So I was wondering whether this was ever discussed before (and > recorded) inside the Python community. > (naively what seems to be missing is the ability to use the assignment > machinery that binds functions' formal params to the given actual param > list also in the context of a return value assignment) > > > I have often thought that I'd like a way to represent the arguments to a > function. (args, kwargs) is what I usually use, but func(*thing[0], > **thing[1]) is very unsatisfying. I'd like, um, func(***thing) ;) > > Interestingly you have traditionally been able to do things like "def > func(a, (b, c))" (removed in py3, right?) -- but it created a sense of > symmetric between assignment and function signatures. But of course > keyword arguments aren't quite the same (nor are named parameters, but I'll > ignore that). So it would be neat if you could do: > > (a, b, c=3) = func(...) > > where this was essentially like: > > result = func(...) > (a, b) = result.args > c = result.kwargs.get('c', 3) > > Where result was some new tuple-dict hybrid object. I think what you're thinking of is a single function signature object that can be passed around as is. 
In essence, it separates the signature handling parts of a function out into a separate object. The tricky part is making it easy to get to from inside the function. def foo(a, b, c=3) >> foosig: return bar(foosig) # No packing or unpacking here! result = foo(*args, **kwds) (a, b) = result.args c = result.kwds['c'] or... locals().update(result) Ron A. From masklinn at masklinn.net Thu Jan 13 23:59:55 2011 From: masklinn at masklinn.net (Masklinn) Date: Thu, 13 Jan 2011 23:59:55 +0100 Subject: [Python-ideas] values in vs. values out In-Reply-To: References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> <4D2F137F.4030405@trueblade.com> <6985D3AD-9B9A-4457-96DE-5F1BB52F530B@cern.ch> Message-ID: <477FA6F4-5C9A-4F0F-B516-57A936698C88@masklinn.net> On 2011-01-13, at 22:31 , Jason Orendorff wrote: > On Thu, Jan 13, 2011 at 3:10 PM, Masklinn wrote: >> On 2011-01-13, at 21:11 , Jason Orendorff wrote: >>> (c) Unlike ML, you can write >>> (a, b) = [1, 2] >>> or generally >>> a, b = any_iterable >>> It is useful for unpacking to depend on the iterable protocol >>> rather than the exact type of the right-hand side. This is a >>> nicety that ML-like languages don't bother with, afaik. >> In no small part because, in ML-type languages (or more generally in >> functional languages, Erlang could hardly be called an ML-like language) >> lists (or more generally sequences) and tuples are very different beasts and >> entirely incompatible. > > Well, sure, as far as tuples go. But the point I was making was more > general. Python has a notion of "iterable" which covers many types, > not just "list". The iterable protocol is used by Python's for-loops, > sorted(), str.join() and so on; it's only natural for unpacking > assignment to use it as well. As far as I know, most ML languages > don't have that notion.* So Python has a reason for this asymmetry > that those languages don't have. 
Well yeah, but even if those languages had a higher-level "iterable" abstraction, tuples wouldn't be part of it. From jafo at tummy.com Mon Jan 17 08:11:20 2011 From: jafo at tummy.com (Sean Reifschneider) Date: Mon, 17 Jan 2011 00:11:20 -0700 Subject: [Python-ideas] Adding salt and Modular Crypt Format to crypt library. Message-ID: <20110117071120.GA9810@tummy.com> Over the years I've written the same code over and over to create a random salt string of 2 characters. Worse, the Modular Crypt Format is difficult to find documentation on, so creating stronger hashed passwords is difficult to get right. By this, I mean things like: crypt.crypt('password', 'xJ') crypt.crypt('password', '$1$/gL8bA.z') crypt.crypt('password', '$6$/uPNNoSGrlc0Kf0go') To that end, I'm proposing the addition of a "mksalt()" method which will generate a salt, and several METHOD_* values to select which hashing method to use. I also figure there will need to be a "methods()" call that figures out what methods are available in the library crypt() and return a list of the available ones. If we have a way to generate a salt, then I figure we could drop the salt argument of crypt.crypt(), and if not specified to generate one. So to hash a password you could do: "crypt.crypt('password')". I figure that the best way to accomplish this is to implement this all in Python and move the existing C crypt module to _crypt. I've created an issue: http://bugs.python.org/issue10924 with this description and a patch to accomplish the above. Thoughts and review? Thanks, Sean -- I have a large collection of sea shells, which I keep scattered on beaches around the world. Maybe you've seen it... -- Steven Wright Sean Reifschneider, Member of Technical Staff tummy.com, ltd. 
- Linux Consulting since 1995: Ask me about High Availability From lvh at laurensvh.be Mon Jan 17 12:54:37 2011 From: lvh at laurensvh.be (Laurens Van Houtven) Date: Mon, 17 Jan 2011 12:54:37 +0100 Subject: [Python-ideas] Adding salt and Modular Crypt Format to crypt library. In-Reply-To: <20110117071120.GA9810@tummy.com> References: <20110117071120.GA9810@tummy.com> Message-ID: Hi Sean 1) Minor API note: I'd expect a dict of hashes to their respective crypt functions 2) Is there any leverage for possibly including stronger KDFs, such as scrypt or possibly bcrypt into Python? People have created nice C bindings for both, and licenses permit it. That would make the crypt module good for storing passwords too, as well as being good for comparing them to some particular format. cheers lvh From jafo at tummy.com Mon Jan 17 13:20:18 2011 From: jafo at tummy.com (Sean Reifschneider) Date: Mon, 17 Jan 2011 05:20:18 -0700 Subject: [Python-ideas] Adding salt and Modular Crypt Format to crypt library. In-Reply-To: References: <20110117071120.GA9810@tummy.com> Message-ID: <4D343402.50209@tummy.com> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 01/17/2011 04:54 AM, Laurens Van Houtven wrote: > 1) Minor API note: I'd expect a dict of hashes to their respective > crypt functions I don't follow what you mean, sorry. Can you provide an example? > 2) Is there any leverage for possibly including stronger KDFs, such as > scrypt or possibly bcrypt into Python? People have created nice C Possibly, but I'd say that's beyond the scope of this patch and would need to be a separate patch. This patch is about adding salt functions to the existing module which wraps the C library function crypt(), which requires a salt argument but provides no helpers to generate them. 
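For illustration, a rough sketch of what a mksalt() helper along these lines could look like. This is a guess at the shape of the API, not the actual patch from issue 10924; the method table, names, and salt lengths here are assumptions:

```python
import random
import string

# The characters traditionally accepted in crypt() salts.
SALT_CHARS = string.ascii_letters + string.digits + './'

# Hypothetical method table: Modular Crypt Format prefix and salt length.
METHODS = {
    'crypt':  ('', 2),       # traditional DES crypt
    'md5':    ('$1$', 8),
    'sha256': ('$5$', 16),
    'sha512': ('$6$', 16),
}

def mksalt(method='sha512'):
    """Return a random salt string in Modular Crypt Format for *method*."""
    prefix, length = METHODS[method]
    rng = random.SystemRandom()  # draw from OS entropy, not the default PRNG
    return prefix + ''.join(rng.choice(SALT_CHARS) for _ in range(length))

print(mksalt('crypt'))    # e.g. 'xJ'
print(mksalt('sha512'))   # e.g. '$6$' followed by 16 salt characters
```

With something like this in place, crypt.crypt('password') could default to generating its own salt with the strongest available method, which is the convenience the proposal is after.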
Sean -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iD8DBQFNNDQCxUhyMYEjVX0RAg/2AKC6Q3WYL5YV/LME02H9HvPYSxrISwCcCAuD +9cPhYOTX3pYYK31hLN1RBk= =lhYv -----END PGP SIGNATURE----- From luc.goossens at cern.ch Mon Jan 17 16:04:43 2011 From: luc.goossens at cern.ch (Luc Goossens) Date: Mon, 17 Jan 2011 16:04:43 +0100 Subject: [Python-ideas] values in vs. values out In-Reply-To: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> Message-ID: <6F208934-E8D5-43D3-B077-B818F4CBC117@cern.ch> Hi all, Thanks to everybody for your feedback! So I guess the answer to my question (which - I noticed just now - did not end with a question mark), is ... no. > If your function is returning a bunch of related values in a tuple, > and > that tuple keeps changing as you re-design the code, that's a code > smell. the use cases I have in mind are the functions that return a set of weakly related values, or more importantly report on different aspects of the calculation; an example of the first is a divmod function that returns the div and the mod while callers might only be interested in the div; examples of the latter are the time it took to calculate the value, possible warnings that were encountered, ... 
like the good old errorcode/stdout/stderr trio > [various workarounds suggested] the problem with (all) the workarounds that were suggested is that they help with migrating from 2 to more return values; for the 1 to 2 case (the most common case) they don't help a lot, as the amount of work to put the workaround in place exceeds the amount of work to cope with the migration directly; I would say it is a requirement that the simple case of single variable gets single (or first) return value, retains its current simple notation > If the system automatically ignored "new" return values (for > whatever "new" might mean), I think it would be too easy to miss > return values that you don't mean to be ignoring. this I guess is only valid in the case where multiple return values are so strongly related they probably should be an object instead of a bunch of values > So it would be neat if you could do: > > (a, b, c=3) = func(...) or adding keywords to the mix a, b, c = kw1, d = kw2 (defval2) = function(...) now for the can of worms ... - one would need some syntactic means to distinguish the returning of two values from the returning of a single pair with two values - there's a complication with nested function calls (i.e. fun1 ( fun2(...), fun3(...)); the only simple semantic I could associate with this, is to simply drop all return values except for the first, but that is incompatible with returning the full return value of a function without needing to manipulate it ... Hmm, maybe the second worm above hints at the root problem with multiple return values: there is just no simple way of accommodating them. Too bad :-( Luc On Jan 13, 2011, at 3:30 PM, Luc Goossens wrote: > Hi all, > > There's a striking asymmetry between the wonderful flexibility in > passing values into functions (positional args, keyword args, > default values, *args, **kwargs, ...) and the limited options for > processing the return values (assignment). 
> Hence, whenever I upgrade a function with a new keyword arg and a > default value, I do not have to change any of the existing calls, > whereas whenever I add a new element to its output tuple, I find > myself chasing all existing code to upgrade the corresponding > assignments with an additional (unused) variable. > So I was wondering whether this was ever discussed before (and > recorded) inside the Python community. > (naively what seems to be missing is the ability to use the > assignment machinery that binds functions' formal params to the > given actual param list also in the context of a return value > assignment) > > cheers, > Luc > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From ncoghlan at gmail.com Mon Jan 17 16:39:37 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 18 Jan 2011 01:39:37 +1000 Subject: [Python-ideas] values in vs. values out In-Reply-To: <6F208934-E8D5-43D3-B077-B818F4CBC117@cern.ch> References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> <6F208934-E8D5-43D3-B077-B818F4CBC117@cern.ch> Message-ID: On Tue, Jan 18, 2011 at 1:04 AM, Luc Goossens wrote: >> If the system automatically ignored "new" return values (for whatever >> "new" might mean), I think it would be too easy to miss return values that >> you don't mean to be ignoring. > > this I guess is only valid in the case where multiple return values are so > strongly related they probably should be an object instead of a bunch of > values If the relationship between the return values is so weak, I would seriously question the viability of returning them at all. >> So it would be neat if you could do: >> >> (a, b, c=3) = func(...) > > or adding keywords to the mix > > a, b, c = kw1, d = kw2 (defval2) = function(...) > > > now for the can of worms ... 
> > - one would need some syntactic means to distinguish the returning of two > values from the returning of a single pair with two values > - there's a complication with nested function calls (i.e. fun1 ( fun2(...), > fun3(...)); the only simple semantic I could associate with > this, is to simply drop all return values except for the first, but that is > incompatible with returning the full return value of a function without > needing to manipulate it > ... > > Hmm, maybe the second worm above hints at the root problem with multiple > return values: there is just no simple way of accommodating them. If you want additional independent return values, use a container (such as a list or dictionary) as an output variable. Even better, if you want to change the return value without having to change every location that calls the function, *create a new function* instead of modifying the existing one. Yes, this means you can sometimes end up with lousy names for functions because the original function used up the best one. Such is life in a world where you need to cope with backwards compatibility issues. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From arnodel at gmail.com Mon Jan 17 17:39:32 2011 From: arnodel at gmail.com (Arnaud Delobelle) Date: Mon, 17 Jan 2011 16:39:32 +0000 Subject: [Python-ideas] values in vs. values out In-Reply-To: <6F208934-E8D5-43D3-B077-B818F4CBC117@cern.ch> References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> <6F208934-E8D5-43D3-B077-B818F4CBC117@cern.ch> Message-ID: On 17 January 2011 15:04, Luc Goossens wrote: > Hi all, > > Thanks to everybody for your feedback! > So I guess the answer to my question (which - I noticed just now - did not > end with a question mark), is ... no. > >> If your function is returning a bunch of related values in a tuple, and >> that tuple keeps changing as you re-design the code, that's a code smell. 
> > the use cases I have in mind are the functions that return a set of weakly > related values, or more importantly report on different aspects > of the calculation; > an example of the first is a divmod function that returns the div and the > mod while callers might only be interested in the div; > examples of the latter are the time it took to calculate the value, possible > warnings that were encountered, ... > > like the good old errorcode/stdout/stderr trio > >> [various workarounds suggested] > > > the problem with (all) the workarounds that were suggested is that they help > with migrating from 2 to more return values; > for the 1 to 2 case (the most common case) they don't help a lot, as the > amount of work to put the workaround in place exceeds the amount of work to > cope > with the migration directly; > I would say it is a requirement that the simple case of single variable gets > single (or first) return value, retains its current simple notation > > >> If the system automatically ignored "new" return values (for whatever >> "new" might mean), I think it would be too easy to miss return values that >> you don't mean to be ignoring. > > > this I guess is only valid in the case where multiple return values are so > strongly related they probably should be an object instead of a bunch of > values > > >> So it would be neat if you could do: >> >> (a, b, c=3) = func(...) > > or adding keywords to the mix > > a, b, c = kw1, d = kw2 (defval2) = function(...) > > > now for the can of worms ... > > - one would need some syntactic means to distinguish the returning of two > values from the returning of a single pair with two values > - there's a complication with nested function calls (i.e. 
fun1 ( fun2(...), > fun3(...)); the only simple semantic I could associate with > this, is to simply drop all return values except for the first, but that is > incompatible with returning the full return value of a function without > needing to manipulate it LISP has a notion of multiple return values. I can't easily find an authoritative reference, but here is a short explanation: http://abhishek.geek.nz/docs/features-of-common-lisp/#Multiple_values Based on this, you could define a decorator class: class multiple_values: def __init__(self, f): self.f = f def __call__(self, *args, **kwargs): return self.f(*args, **kwargs)[0] def all_values(self, *args, **kwargs): return self.f(*args, **kwargs) @multiple_values def div(x, y): return x//y, x%y Then: >>> q = div(10, 3) >>> q 3 >>> q, r = div.all_values(17, 5) -- Arnaud From rich at noir.com Mon Jan 17 18:54:20 2011 From: rich at noir.com (K. Richard Pixley) Date: Mon, 17 Jan 2011 09:54:20 -0800 Subject: [Python-ideas] values in vs. values out In-Reply-To: <4D2F419A.30703@noir.com> References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> <4D2F419A.30703@noir.com> Message-ID: <4D34824C.4000701@noir.com> If the values involved are sufficiently weakly related, then I question whether it's appropriate to calculate them at all. If the most frequent use is to select out a subset of the values, then even calculating the other values seems like a wasted effort. To take "average" and "stdev" as an example... If you use an object to represent not the range of return values, but the domain of input values, then you can use @property accessors for the results. class Statistics(object): def __init__(self, list): self.list = list @property def avg(self): return ... @property def stdev(self): return ... @property def inputs(self): return self.list @property def outputs(self): return self.avg, self.stdev Now you have the syntactic appearance of selecting from multiple values in either one step or two, your choice. 
x = Statistics([1, 2, 3]).stdev y, z = Statistics([1, 2, 3]).outputs p = Statistics([4, 5, 6]) q = p.avg --rich From rich at noir.com Mon Jan 17 19:09:34 2011 From: rich at noir.com (K. Richard Pixley) Date: Mon, 17 Jan 2011 10:09:34 -0800 Subject: [Python-ideas] values in vs. values out In-Reply-To: <4D34824C.4000701@noir.com> References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> <4D2F419A.30703@noir.com> <4D34824C.4000701@noir.com> Message-ID: <4D3485DE.4020504@noir.com> You could shorten this... def __call__(self): return self.avg, self.stdev Now it's even more dense and allows for indexing the results: p = Statistics([4, 5, 6])()[0] --rich On 1/17/11 09:54 , K. Richard Pixley wrote: > If the values involved are sufficiently weakly related, then I > question whether it's appropriate to calculate them at all. If the > most frequent use is to select out a subset of the values, then even > calculating the other values seems like a wasted effort. > > To take "average" and "stdev" as an example... > > If you use an object to represent not the range of return values, but > the domain of input values, then you can use @property accessors for > the results. > > class Statistics(object): > def __init__(self, list): > self.list = list > > @property > def avg(self): > return ... > > @property > def stdev(self): > return ... > > @property > def inputs(self): > return self.list > > @property > def outputs(self): > return self.avg, self.stdev > > Now you have the syntactic appearance of selecting from multiple > values in either one step or two, your choice. 
> > x = Statistics([1, 2, 3]).stdev > y, z = Statistics([1, 2, 3]).outputs > > p = Statistics([4, 5, 6]) > q = p.avg > > --rich > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From donspauldingii at gmail.com Thu Jan 20 05:29:26 2011 From: donspauldingii at gmail.com (Don Spaulding) Date: Wed, 19 Jan 2011 22:29:26 -0600 Subject: [Python-ideas] A sorted version of **kwargs Message-ID: Hi there python-ideas! Does it bother anyone else that it's so cumbersome to instantiate an OrderedDict with ordered data? >>> from collections import OrderedDict >>> OrderedDict(b=1,a=2) OrderedDict([('a', 2), ('b', 1)]) # Order lost. Boooo. >>> OrderedDict([('b',1),('a',2)]) OrderedDict([('b', 1), ('a', 2)]) Obviously, OrderedDict's __init__ method (like all other functions) never gets a chance to see the kwargs dict in the order it was specified. It's usually faked by accepting the sequence of (key, val) tuples, as above. I personally think it would be nice to be able to ask the interpreter to keep track of the order of the arguments to my function, something like: def sweet_function_name(*args, **kwargs, ***an_odict_of_kwargs): pass I'm not married to the syntax. What do you think about the idea? -------------- next part -------------- An HTML attachment was scrubbed... URL: From arnodel at gmail.com Thu Jan 20 08:21:27 2011 From: arnodel at gmail.com (Arnaud Delobelle) Date: Thu, 20 Jan 2011 07:21:27 +0000 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: Message-ID: On 20 January 2011 04:29, Don Spaulding wrote: > Hi there python-ideas! > Does it bother anyone else that it's so cumbersome to instantiate an > OrderedDict with ordered data? > >>> from collections import OrderedDict > >>> OrderedDict(b=1,a=2) > OrderedDict([('a', 2), ('b', 1)]) # Order lost. Boooo. > >>> OrderedDict([('b',1),('a',2)]) > OrderedDict([('b', 1), ('a', 2)]) > Obviously, OrderedDict's __init__ method (like all other functions) never > gets a chance to see the kwargs dict in the order it was specified. It's > usually faked by accepting the sequence of (key, val) tuples, as above. I > personally think it would be nice to be able to ask the interpreter to keep > track of the order of the arguments to my function, something like: > def sweet_function_name(*args, **kwargs, ***an_odict_of_kwargs): > pass > I'm not married to the syntax. What do you think about the idea? FYI this was discussed before on this list at least once: http://mail.python.org/pipermail/python-ideas/2009-April/004163.html -- Arnaud From donspauldingii at gmail.com Thu Jan 20 08:33:43 2011 From: donspauldingii at gmail.com (Don Spaulding) Date: Thu, 20 Jan 2011 01:33:43 -0600 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: Message-ID: On Thu, Jan 20, 2011 at 1:21 AM, Arnaud Delobelle wrote: > On 20 January 2011 04:29, Don Spaulding wrote: > > Hi there python-ideas! > > Does it bother anyone else that it's so cumbersome to instantiate an > > OrderedDict with ordered data? > > >>> from collections import OrderedDict > > >>> OrderedDict(b=1,a=2) > > OrderedDict([('a', 2), ('b', 1)]) # Order lost. Boooo. > > >>> OrderedDict([('b',1),('a',2)]) > > OrderedDict([('b', 1), ('a', 2)]) > > Obviously, OrderedDict's __init__ method (like all other functions) never > > gets a chance to see the kwargs dict in the order it was specified. It's > > usually faked by accepting the sequence of (key, val) tuples, as above. I > > personally think it would be nice to be able to ask the interpreter to keep > > track of the order of the arguments to my function, something like: > > def sweet_function_name(*args, **kwargs, ***an_odict_of_kwargs): > > pass > > I'm not married to the syntax. What do you think about the idea? 
> > FYI this was discussed before on this list at least once: > > http://mail.python.org/pipermail/python-ideas/2009-April/004163.html > > -- > Arnaud > So it was. Thanks for that link. Am I to assume nothing ever came of that discussion? -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Thu Jan 20 11:28:05 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 20 Jan 2011 21:28:05 +1100 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: Message-ID: <4D380E35.8040502@pearwood.info> Don Spaulding wrote: > Obviously, OrderedDict's __init__ method (like all other functions) never > gets a chance to see the kwargs dict in the order it was specified. It's > usually faked by accepting the sequence of (key, val) tuples, as above. I > personally think it would be nice to be able to ask the interpreter to keep > track of the order of the arguments to my function, something like: > > def sweet_function_name(*args, **kwargs, ***an_odict_of_kwargs): > pass > > I'm not married to the syntax. What do you think about the idea? I would be +0 on making **kwargs an ordered dict automatically, and -1 on adding ***ordered_kwargs. Because kwargs is mostly used only for argument passing, and generally with only a small number of items, it probably doesn't matter too much if it's slightly slower than an unordered dict. But adding a third asterisk just makes it ugly, and it leaves open the question what happens when you try to mix **kw and ***okw in the same function call, as you do above. 
-- Steven From ncoghlan at gmail.com Thu Jan 20 14:11:57 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Thu, 20 Jan 2011 23:11:57 +1000 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: <4D380E35.8040502@pearwood.info> References: <4D380E35.8040502@pearwood.info> Message-ID: On Thu, Jan 20, 2011 at 8:28 PM, Steven D'Aprano wrote: > I would be +0 on making **kwargs an ordered dict automatically, and -1 on > adding ***ordered_kwargs. Because kwargs is mostly used only for argument > passing, and generally with only a small number of items, it probably > doesn't matter too much if it's slightly slower than an unordered dict. Yeah, simply making the kwargs dict always ordered is likely the way we would do it. That's also the only solution with any chance of working by default with the way most decorators are structured (accepting *args and **kwargs and passing them to the wrapped function). To expand on Raymond's response in the previous thread on this topic, there are likely a number of steps to this process: 1. Provide a _collections.OrderedDict C implementation 2. Create a PEP to gain agreement from other implementations (especially IronPython, PyPy and Jython) to proceed with the remaining steps 3. Make it a builtin class (odict?) in its own right (with collections.OrderedDict becoming an alias for the builtin type) 4. Update the interpreter to use the new builtin type for kwargs containers Use various microbenchmarks to check that the use of the new odict builtin type instead of a plain dict doesn't slow things down too much. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From mal at egenix.com Thu Jan 20 15:05:23 2011 From: mal at egenix.com (M.-A. 
Lemburg) Date: Thu, 20 Jan 2011 15:05:23 +0100 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: <4D380E35.8040502@pearwood.info> Message-ID: <4D384123.6040105@egenix.com> Nick Coghlan wrote: > On Thu, Jan 20, 2011 at 8:28 PM, Steven D'Aprano wrote: >> I would be +0 on making **kwargs an ordered dict automatically, and -1 on >> adding ***ordered_kwargs. Because kwargs is mostly used only for argument >> passing, and generally with only a small number of items, it probably >> doesn't matter too much if it's slightly slower than an unordered dict. > > Yeah, simply making the kwargs dict always ordered is likely the way > we would do it. That's also the only solution with any chance of > working by default with the way most decorators are structured > (accepting *args and **kwargs and passing them to the wrapped > function). -1. How often do you really need this ? In which of those cases wouldn't a static code analysis give you the call order of the parameters already ? "Nice to have" is not good enough to warrant a slow down of all function calls involving keyword arguments, adding overhead for other Python implementations and possibly causing problems with 3rd party extensions relying on getting a PyDict for the keyword arguments object. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 20 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. 
Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From guido at python.org Thu Jan 20 17:42:20 2011 From: guido at python.org (Guido van Rossum) Date: Thu, 20 Jan 2011 08:42:20 -0800 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: <4D384123.6040105@egenix.com> References: <4D380E35.8040502@pearwood.info> <4D384123.6040105@egenix.com> Message-ID: On Thu, Jan 20, 2011 at 6:05 AM, M.-A. Lemburg wrote: > Nick Coghlan wrote: >> On Thu, Jan 20, 2011 at 8:28 PM, Steven D'Aprano wrote: >>> I would be +0 on making **kwargs an ordered dict automatically, and -1 on >>> adding ***ordered_kwargs. Because kwargs is mostly used only for argument >>> passing, and generally with only a small number of items, it probably >>> doesn't matter too much if it's slightly slower than an unordered dict. >> >> Yeah, simply making the kwargs dict always ordered is likely the way >> we would do it. That's also the only solution with any chance of >> working by default with the way most decorators are structured >> (accepting *args and **kwargs and passing them to the wrapped >> function). > > -1. > > How often do you really need this ? > > In which of those cases wouldn't a static code analysis give you > the call order of the parameters already ?? > > "Nice to have" is not good enough to warrant a slow down of > all function calls involving keyword arguments, adding overhead > for other Python implementations and possibly causing problems > with 3rd party extensions relying on getting a PyDict for the > keyword arguments object. What he says. In addition, I wonder what the semantics would be if the caller passed **d where d was an *unordered* dict... 
-- --Guido van Rossum (python.org/~guido) From lorgandon at gmail.com Thu Jan 20 17:53:54 2011 From: lorgandon at gmail.com (Imri Goldberg) Date: Thu, 20 Jan 2011 18:53:54 +0200 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: <4D380E35.8040502@pearwood.info> <4D384123.6040105@egenix.com> Message-ID: On Thu, Jan 20, 2011 at 6:42 PM, Guido van Rossum wrote: > > > -1. > > > > How often do you really need this ? > > > > In which of those cases wouldn't a static code analysis give you > > the call order of the parameters already ? > > > > "Nice to have" is not good enough to warrant a slow down of > > all function calls involving keyword arguments, adding overhead > > for other Python implementations and possibly causing problems > > with 3rd party extensions relying on getting a PyDict for the > > keyword arguments object. > > What he says. > > In addition, I wonder what the semantics would be if the caller passed > **d where d was an *unordered* dict... What if the default behavior stays as it is today, but a magic decorator is added, (maybe @ordered_kwargs or some such), and only for these kind of functions the new behavior applies. Also, given such a decorator, when given **d where d is a regular dict, the implementation could possibly throw an error. (Or maybe it is up to the implementor of the specific function). Cheers, Imri -- Imri Goldberg -------------------------------------- http://plnnr.com/ - automatic trip planning http://www.algorithm.co.il/blogs/ -------------------------------------- -- insert signature here ---- -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From debatem1 at gmail.com Thu Jan 20 18:52:21 2011 From: debatem1 at gmail.com (geremy condra) Date: Thu, 20 Jan 2011 09:52:21 -0800 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: <4D380E35.8040502@pearwood.info> <4D384123.6040105@egenix.com> Message-ID: On Thu, Jan 20, 2011 at 8:42 AM, Guido van Rossum wrote: > On Thu, Jan 20, 2011 at 6:05 AM, M.-A. Lemburg wrote: >> Nick Coghlan wrote: >>> On Thu, Jan 20, 2011 at 8:28 PM, Steven D'Aprano wrote: >>>> I would be +0 on making **kwargs an ordered dict automatically, and -1 on >>>> adding ***ordered_kwargs. Because kwargs is mostly used only for argument >>>> passing, and generally with only a small number of items, it probably >>>> doesn't matter too much if it's slightly slower than an unordered dict. >>> >>> Yeah, simply making the kwargs dict always ordered is likely the way >>> we would do it. That's also the only solution with any chance of >>> working by default with the way most decorators are structured >>> (accepting *args and **kwargs and passing them to the wrapped >>> function). >> >> -1. >> >> How often do you really need this ? >> >> In which of those cases wouldn't a static code analysis give you >> the call order of the parameters already ?? >> >> "Nice to have" is not good enough to warrant a slow down of >> all function calls involving keyword arguments, adding overhead >> for other Python implementations and possibly causing problems >> with 3rd party extensions relying on getting a PyDict for the >> keyword arguments object. > > What he says. > > In addition, I wonder what the semantics would be if the caller passed > **d where d was an *unordered* dict... Wouldn't this be a good argument for the original proposal? That there wouldn't be confusion about whether you were getting an odict or a dict with ***? Also, would functions that didn't specify this behavior see an actual performance hit? 
I assumed that given the existence of METH_NOARGS and friends that there was some kind of optimization going on here, but I can't get it to turn up on timeit. Geremy Condra From alexander.belopolsky at gmail.com Thu Jan 20 19:57:50 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 20 Jan 2011 13:57:50 -0500 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: <4D380E35.8040502@pearwood.info> <4D384123.6040105@egenix.com> Message-ID: On Thu, Jan 20, 2011 at 11:42 AM, Guido van Rossum wrote: .. > In addition, I wonder what the semantics would be if the caller passed > **d where d was an *unordered* dict... > Presumably, the caller would know whether the called function is sensitive to the order of keyword arguments and will use odict when it matters. The function called as say f(x=1, y=2, **d) will initialize its internal keyword odict with an equivalent of kwds = odict([('x', 1), ('y', 2)]); kwd.update(d.items()) and exhibit undefined behavior if d is unordered and f depends on the order of keywords. From bruce at leapyear.org Thu Jan 20 20:13:48 2011 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 20 Jan 2011 11:13:48 -0800 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: <4D380E35.8040502@pearwood.info> <4D384123.6040105@egenix.com> Message-ID: On Thu, Jan 20, 2011 at 8:42 AM, Guido van Rossum wrote: > On Thu, Jan 20, 2011 at 6:05 AM, M.-A. Lemburg wrote: > > > "Nice to have" is not good enough to warrant a slow down of > > all function calls involving keyword arguments, adding overhead > > for other Python implementations and possibly causing problems > > with 3rd party extensions relying on getting a PyDict for the > > keyword arguments object. > > What he says. > > In addition, I wonder what the semantics would be if the caller passed > **d where d was an *unordered* dict... > > -- > --Guido van Rossum (python.org/~guido) > > Agree with both. 
And if we were to make this change, my next thought is that what I really want is an ordered multi-set, since in some scenarios where I want ordered parameters I also want repeated parameters. I don't think we should go there. Back to the original problem though: if the issue is that creating an ordered dict is clumsy and perhaps interfering with adoption and usage then perhaps the notation for ordered dict could be improved. Just as we can now use {...} for both dicts and sets, perhaps we could add [ 'b' : 1, 'a' : 2 ] as a more convenient way of writing OrderedDict([('b', 1), ('a', 2)]) This is parallel to the way that [1,2] is an ordered container while {1,2} is unordered. --- Bruce Latest blog post: http://www.vroospeak.com/2010/12/fix-filibuster.html Learn about security: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... URL: From masklinn at masklinn.net Thu Jan 20 20:40:38 2011 From: masklinn at masklinn.net (Masklinn) Date: Thu, 20 Jan 2011 20:40:38 +0100 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: <4D380E35.8040502@pearwood.info> <4D384123.6040105@egenix.com> Message-ID: On 2011-01-20, at 17:42 , Guido van Rossum wrote: > On Thu, Jan 20, 2011 at 6:05 AM, M.-A. Lemburg wrote: >> Nick Coghlan wrote: >>> On Thu, Jan 20, 2011 at 8:28 PM, Steven D'Aprano wrote: >>>> I would be +0 on making **kwargs an ordered dict automatically, and -1 on >>>> adding ***ordered_kwargs. Because kwargs is mostly used only for argument >>>> passing, and generally with only a small number of items, it probably >>>> doesn't matter too much if it's slightly slower than an unordered dict. >>> >>> Yeah, simply making the kwargs dict always ordered is likely the way >>> we would do it. That's also the only solution with any chance of >>> working by default with the way most decorators are structured >>> (accepting *args and **kwargs and passing them to the wrapped >>> function). >> >> -1. 
>> >> How often do you really need this ? >> >> In which of those cases wouldn't a static code analysis give you >> the call order of the parameters already ? >> >> "Nice to have" is not good enough to warrant a slow down of >> all function calls involving keyword arguments, adding overhead >> for other Python implementations and possibly causing problems >> with 3rd party extensions relying on getting a PyDict for the >> keyword arguments object. > > What he says. > > In addition, I wonder what the semantics would be if the caller passed > **d where d was an *unordered* dict? Create an ordereddict based on d's iteration order would seem logical (but order would keep conserved for kwargs passed in explicitly, so in `foo(a=some, b=other, c=stuff, **d)` where `d` is a `dict` a, b, c would be the first three keys and then the keys of d would be stuffed after that in whatever order happens). Do Python 3's semantics allow for further kwargs after **kwargs? From timothy.c.delaney at gmail.com Thu Jan 20 20:47:20 2011 From: timothy.c.delaney at gmail.com (Tim Delaney) Date: Fri, 21 Jan 2011 06:47:20 +1100 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: <4D380E35.8040502@pearwood.info> <4D384123.6040105@egenix.com> Message-ID: On 21 January 2011 06:13, Bruce Leban wrote: > Back to the original problem though: if the issue is that creating an > ordered dict is clumsy and perhaps interfering with adoption and usage then > perhaps the notation for ordered dict could be improved. Just as we can now > use {...} for both dicts and sets, perhaps we could add > > [ 'b' : 1, 'a' : 2 ] > > as a more convenient way of writing > > OrderedDict([('b', 1), ('a', 2)]) > > > This is parallel to the way that [1,2] is an ordered container while {1,2} > is unordered. > ['b':1] would then be ambiguous (appears to be a slice of a list). More obvious in the case of [1:2] ... 
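Whether `['b':1]` really collides with existing syntax can be checked against today's grammar: a slice is only legal inside a subscript, so a bare bracketed slice is currently a `SyntaxError`, and the proposed literal would be claiming unused syntax rather than re-reading an existing form. A quick check (illustration only, not part of any proposal):

```python
import ast

# x[1:2] parses today -- as a subscript with a slice inside it.
tree = ast.parse("x[1:2]", mode="eval")
assert isinstance(tree.body, ast.Subscript)

# A bare [1:2] does not parse at all, so a literal like ['b': 1, 'a': 2]
# would be new syntax, not an ambiguous re-reading of an existing one.
try:
    ast.parse("[1:2]", mode="eval")
except SyntaxError:
    bare_slice_is_error = True
assert bare_slice_is_error
```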
Personally, I'm of the opinion that if an actual dictionary or subclass is passed via **kw, the same type of dictionary should be used for the keyword arguments i.e.: test(**dict(a=1, b=2)) => unordered dict passed. test(**odict(a=1, b=2)) => ordered dict passed. The only difficulty then is passing parameters into the ordered dict constructor in the desired order - I can't think of a reasonable way of doing that. Good old chicken and egg problem. Tim Delaney -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruce at leapyear.org Thu Jan 20 21:05:41 2011 From: bruce at leapyear.org (Bruce Leban) Date: Thu, 20 Jan 2011 12:05:41 -0800 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: <4D380E35.8040502@pearwood.info> <4D384123.6040105@egenix.com> Message-ID: On Thu, Jan 20, 2011 at 11:47 AM, Tim Delaney wrote: > > ['b':1] would then be ambiguous (appears to be a slice of a list). More > obvious in the case of [1:2] ... > We use parenthesis for tuples and avoid the ambiguity by writing (1,). In the same way, we could require your examples to be written ['b':1,] and [1:2,] --- Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From donspauldingii at gmail.com Thu Jan 20 21:26:23 2011 From: donspauldingii at gmail.com (Don Spaulding) Date: Thu, 20 Jan 2011 14:26:23 -0600 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: <4D380E35.8040502@pearwood.info> <4D384123.6040105@egenix.com> Message-ID: On Thu, Jan 20, 2011 at 2:05 PM, Bruce Leban wrote: > > On Thu, Jan 20, 2011 at 11:47 AM, Tim Delaney > wrote: > >> >> ['b':1] would then be ambiguous (appears to be a slice of a list). More >> obvious in the case of [1:2] ... >> > > We use parenthesis for tuples and avoid the ambiguity by writing (1,). > In the same way, we could require your examples to be written ['b':1,] and > [1:2,] > Please, not this. 
I like the idea of syntactic support for the odict, but no need to spread the (foo,) syntax around. It's too easy to misinterpret when you do a quick scan of a body of code. -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Thu Jan 20 21:28:41 2011 From: python at mrabarnett.plus.com (MRAB) Date: Thu, 20 Jan 2011 20:28:41 +0000 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: <4D380E35.8040502@pearwood.info> <4D384123.6040105@egenix.com> Message-ID: <4D389AF9.5080608@mrabarnett.plus.com> On 20/01/2011 19:47, Tim Delaney wrote: > On 21 January 2011 06:13, Bruce Leban > wrote: > > Back to the original problem though: if the issue is that creating > an ordered dict is clumsy and perhaps interfering with adoption and > usage then perhaps the notation for ordered dict could be improved. > Just as we can now use {...} for both dicts and sets, perhaps we > could add > > [ 'b' : 1, 'a' : 2 ] > > as a more convenient way of writing > > OrderedDict([('b', 1), ('a', 2)]) > > > This is parallel to the way that [1,2] is an ordered container while > {1,2} is unordered. > > > ['b':1] would then be ambiguous (appears to be a slice of a list). More > obvious in the case of [1:2] ... > [snip] In what way is it ambiguous? [1] isn't ambiguous, is it? spam[1] is subscripting and [1] is a list; spam[1 : 2] is slicing and [1 : 2] would be an ordered dict. From alexander.belopolsky at gmail.com Thu Jan 20 21:57:34 2011 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 20 Jan 2011 15:57:34 -0500 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: <4D380E35.8040502@pearwood.info> <4D384123.6040105@egenix.com> Message-ID: On Thu, Jan 20, 2011 at 2:47 PM, Tim Delaney wrote: .. > ['b':1] would then be ambiguous (appears to be a slice of a list). More > obvious in the case of?[1:2] ... x[1:2], x[1:2,], and x[1:2, 3:4] are all valid syntaxes. 
(NumPy uses the latter for multi-dimensional slicing.) However, I don't see an ambiguity here. We don't have an ambiguity between tuple syntax and function calls: (a, b, c) vs. f(a, b, c). From ncoghlan at gmail.com Fri Jan 21 01:54:23 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 21 Jan 2011 10:54:23 +1000 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: <4D380E35.8040502@pearwood.info> <4D384123.6040105@egenix.com> Message-ID: On Fri, Jan 21, 2011 at 2:42 AM, Guido van Rossum wrote: > On Thu, Jan 20, 2011 at 6:05 AM, M.-A. Lemburg wrote: >> Nick Coghlan wrote: >>> Yeah, simply making the kwargs dict always ordered is likely the way >>> we would do it. That's also the only solution with any chance of >>> working by default with the way most decorators are structured >>> (accepting *args and **kwargs and passing them to the wrapped >>> function). >> >> -1. >> >> How often do you really need this ? >> >> In which of those cases wouldn't a static code analysis give you >> the call order of the parameters already ?? >> >> "Nice to have" is not good enough to warrant a slow down of >> all function calls involving keyword arguments, adding overhead >> for other Python implementations and possibly causing problems >> with 3rd party extensions relying on getting a PyDict for the >> keyword arguments object. > > What he says. I actually agree as well, but I was misremembering how the construction of the kwargs dict worked and hence was thinking that was the only possible way this could work (since the interpreter didn't know anything about the target function while building the dict). Actually checking with dis.dis and looking at the associated code in ceval.c corrected my misapprehension (the function is actually retrieved first, while the kwargs are still on the stack in the appropriate order, so it is theoretically possible for the function to influence how the args are stored). 
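The stack behaviour Nick describes can be inspected directly with `dis`. The exact opcode names vary across CPython versions (a keyword count folded into CALL_FUNCTION in older releases, CALL_FUNCTION_KW or CALL_KW later), so no particular listing is assumed here; in every case, though, the callable is loaded before the keyword names and values are pushed in the order they were written:

```python
import dis
import io

# Disassemble a keyword-argument call without executing it; f is never
# looked up, only compiled against.
code = compile("f(a=1, b=2)", "<example>", "eval")

buf = io.StringIO()
dis.dis(code, file=buf)
print(buf.getvalue())
```

The listing shows the load of `f` preceding the keyword data, which is what makes a callee-influenced kwargs container at least theoretically possible.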
So, as an alternative proposal, perhaps it would be possible to add a new protocol that allowed a callable to flag that an ordered dictionary should be used for kwargs (e.g. an "__ordered_call__" boolean attribute, with C level flags to speed up the common builtin callable cases). A @functools.ordered_call decorator could then just do "__ordered_call__ = True" to set the flag appropriately. (You could also be even more flexible and devise a protocol that supported any type for kwargs, but I think that would just be far more complicated without a corresponding increase in expressive power) Ordinary calls that used "x=y" or "**d" would be slowed down marginally due to the check for the new attribute on the supplied callable, but that impact should be minimal (especially for the builtin cases, which would just be checking a C struct field) and other calls would be entirely unaffected. The CPython specific impact would largely be limited to update_keyword_args() and implementing C level fields with associated __ordered_call__ properties on the various builtin callable objects. There would also need to be a new variant of PyEval_EvalCodeEx that either used an ordered dictionary for kwdict, or else allowed kwdict to be created and passed in by the calling code rather than implicitly created via PyDict_New(). A builtin version of collections.OrderedDict would still be a precursor to this idea though, so creating a _collections.OrderedDict C implementation still sounds like the right starting point for anyone that is particularly keen to see this idea progress. Regards, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From microcore at yahoo.com.cn Fri Jan 21 02:58:41 2011 From: microcore at yahoo.com.cn (Hongbao Chen) Date: Fri, 21 Jan 2011 09:58:41 +0800 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: Message-ID: Hey, guys I must argue that make **kwargs sorted is really a terrible idea. 
Please think that when user wants a unsorted **kwargs, how can he or she bring the original unsorted dict back? When a dict is unsorted, we can sort it with implementation in Python/C interface or just python code for portability. That is what should be like! Give users more control over behavior of dict. That is what I propose. Best regards Hongbao Chen XJTU -----????----- ???: python-ideas-bounces+microcore=yahoo.com.cn at python.org [mailto:python-ideas-bounces+microcore=yahoo.com.cn at python.org] ?? python-ideas-request at python.org ????: 2011?1?21? 1:52 ???: python-ideas at python.org ??: Python-ideas Digest, Vol 50, Issue 34 Send Python-ideas mailing list submissions to python-ideas at python.org To subscribe or unsubscribe via the World Wide Web, visit http://mail.python.org/mailman/listinfo/python-ideas or, via email, send a message with subject or body 'help' to python-ideas-request at python.org You can reach the person managing the list at python-ideas-owner at python.org When replying, please edit your Subject line so it is more specific than "Re: Contents of Python-ideas digest..." Today's Topics: 1. Re: A sorted version of **kwargs (Nick Coghlan) 2. Re: A sorted version of **kwargs (M.-A. Lemburg) 3. Re: A sorted version of **kwargs (Guido van Rossum) 4. Re: A sorted version of **kwargs (Imri Goldberg) 5. Re: A sorted version of **kwargs (geremy condra) ---------------------------------------------------------------------- Message: 1 Date: Thu, 20 Jan 2011 23:11:57 +1000 From: Nick Coghlan To: "Steven D'Aprano" Cc: python-ideas at python.org Subject: Re: [Python-ideas] A sorted version of **kwargs Message-ID: Content-Type: text/plain; charset=ISO-8859-1 On Thu, Jan 20, 2011 at 8:28 PM, Steven D'Aprano wrote: > I would be +0 on making **kwargs an ordered dict automatically, and -1 on > adding ***ordered_kwargs. 
Because kwargs is mostly used only for argument > passing, and generally with only a small number of items, it probably > doesn't matter too much if it's slightly slower than an unordered dict. Yeah, simply making the kwargs dict always ordered is likely the way we would do it. That's also the only solution with any chance of working by default with the way most decorators are structured (accepting *args and **kwargs and passing them to the wrapped function). To expand on Raymond's response in the previous thread on this topic, there are likely a number of steps to this process: 1. Provide a _collections.OrderedDict C implementation 2. Create a PEP to gain agreement from other implementations (especially IronPython, PyPy and Jython) to proceed with the remaining steps 3. Make it a builtin class (odict?) in its own right (with collections.OrderedDict becoming an alias for the builtin type) 4. Update the interpreter to use the new builtin type for kwargs containers Use various microbenchmarks to check that the use of the new odict builtin type instead of a plain dict doesn't slow things down too much. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia ------------------------------ Message: 2 Date: Thu, 20 Jan 2011 15:05:23 +0100 From: "M.-A. Lemburg" To: Nick Coghlan Cc: python-ideas at python.org Subject: Re: [Python-ideas] A sorted version of **kwargs Message-ID: <4D384123.6040105 at egenix.com> Content-Type: text/plain; charset=ISO-8859-1 Nick Coghlan wrote: > On Thu, Jan 20, 2011 at 8:28 PM, Steven D'Aprano wrote: >> I would be +0 on making **kwargs an ordered dict automatically, and -1 on >> adding ***ordered_kwargs. Because kwargs is mostly used only for argument >> passing, and generally with only a small number of items, it probably >> doesn't matter too much if it's slightly slower than an unordered dict. > > Yeah, simply making the kwargs dict always ordered is likely the way > we would do it. 
That's also the only solution with any chance of > working by default with the way most decorators are structured > (accepting *args and **kwargs and passing them to the wrapped > function). -1. How often do you really need this ? In which of those cases wouldn't a static code analysis give you the call order of the parameters already ? "Nice to have" is not good enough to warrant a slow down of all function calls involving keyword arguments, adding overhead for other Python implementations and possibly causing problems with 3rd party extensions relying on getting a PyDict for the keyword arguments object. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Jan 20 2011) >>> Python/Zope Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ ------------------------------ Message: 3 Date: Thu, 20 Jan 2011 08:42:20 -0800 From: Guido van Rossum To: "M.-A. Lemburg" Cc: python-ideas at python.org Subject: Re: [Python-ideas] A sorted version of **kwargs Message-ID: Content-Type: text/plain; charset=ISO-8859-1 On Thu, Jan 20, 2011 at 6:05 AM, M.-A. Lemburg wrote: > Nick Coghlan wrote: >> On Thu, Jan 20, 2011 at 8:28 PM, Steven D'Aprano wrote: >>> I would be +0 on making **kwargs an ordered dict automatically, and -1 on >>> adding ***ordered_kwargs. Because kwargs is mostly used only for argument >>> passing, and generally with only a small number of items, it probably >>> doesn't matter too much if it's slightly slower than an unordered dict. 
>> >> Yeah, simply making the kwargs dict always ordered is likely the way >> we would do it. That's also the only solution with any chance of >> working by default with the way most decorators are structured >> (accepting *args and **kwargs and passing them to the wrapped >> function). > > -1. > > How often do you really need this ? > > In which of those cases wouldn't a static code analysis give you > the call order of the parameters already ?? > > "Nice to have" is not good enough to warrant a slow down of > all function calls involving keyword arguments, adding overhead > for other Python implementations and possibly causing problems > with 3rd party extensions relying on getting a PyDict for the > keyword arguments object. What he says. In addition, I wonder what the semantics would be if the caller passed **d where d was an *unordered* dict... -- --Guido van Rossum (python.org/~guido) ------------------------------ Message: 4 Date: Thu, 20 Jan 2011 18:53:54 +0200 From: Imri Goldberg To: Guido van Rossum Cc: python-ideas at python.org, "M.-A. Lemburg" Subject: Re: [Python-ideas] A sorted version of **kwargs Message-ID: Content-Type: text/plain; charset="iso-8859-1" On Thu, Jan 20, 2011 at 6:42 PM, Guido van Rossum wrote: > > > -1. > > > > How often do you really need this ? > > > > In which of those cases wouldn't a static code analysis give you > > the call order of the parameters already ? > > > > "Nice to have" is not good enough to warrant a slow down of > > all function calls involving keyword arguments, adding overhead > > for other Python implementations and possibly causing problems > > with 3rd party extensions relying on getting a PyDict for the > > keyword arguments object. > > What he says. > > In addition, I wonder what the semantics would be if the caller passed > **d where d was an *unordered* dict... 
What if the default behavior stays as it is today, but a magic decorator is added, (maybe @ordered_kwargs or some such), and only for these kind of functions the new behavior applies. Also, given such a decorator, when given **d where d is a regular dict, the implementation could possibly throw an error. (Or maybe it is up to the implementor of the specific function). Cheers, Imri -- Imri Goldberg -------------------------------------- http://plnnr.com/ - automatic trip planning http://www.algorithm.co.il/blogs/ -------------------------------------- -- insert signature here ---- -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ Message: 5 Date: Thu, 20 Jan 2011 09:52:21 -0800 From: geremy condra To: Guido van Rossum Cc: python-ideas at python.org, "M.-A. Lemburg" Subject: Re: [Python-ideas] A sorted version of **kwargs Message-ID: Content-Type: text/plain; charset=ISO-8859-1 On Thu, Jan 20, 2011 at 8:42 AM, Guido van Rossum wrote: > On Thu, Jan 20, 2011 at 6:05 AM, M.-A. Lemburg wrote: >> Nick Coghlan wrote: >>> On Thu, Jan 20, 2011 at 8:28 PM, Steven D'Aprano wrote: >>>> I would be +0 on making **kwargs an ordered dict automatically, and -1 on >>>> adding ***ordered_kwargs. Because kwargs is mostly used only for argument >>>> passing, and generally with only a small number of items, it probably >>>> doesn't matter too much if it's slightly slower than an unordered dict. >>> >>> Yeah, simply making the kwargs dict always ordered is likely the way >>> we would do it. That's also the only solution with any chance of >>> working by default with the way most decorators are structured >>> (accepting *args and **kwargs and passing them to the wrapped >>> function). >> >> -1. >> >> How often do you really need this ? >> >> In which of those cases wouldn't a static code analysis give you >> the call order of the parameters already ?? 
>> >> "Nice to have" is not good enough to warrant a slow down of >> all function calls involving keyword arguments, adding overhead >> for other Python implementations and possibly causing problems >> with 3rd party extensions relying on getting a PyDict for the >> keyword arguments object. > > What he says. > > In addition, I wonder what the semantics would be if the caller passed > **d where d was an *unordered* dict... Wouldn't this be a good argument for the original proposal? That there wouldn't be confusion about whether you were getting an odict or a dict with ***? Also, would functions that didn't specify this behavior see an actual performance hit? I assumed that given the existence of METH_NOARGS and friends that there was some kind of optimization going on here, but I can't get it to turn up on timeit. Geremy Condra ------------------------------ _______________________________________________ Python-ideas mailing list Python-ideas at python.org http://mail.python.org/mailman/listinfo/python-ideas End of Python-ideas Digest, Vol 50, Issue 34 ******************************************** From ncoghlan at gmail.com Fri Jan 21 03:38:59 2011 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 21 Jan 2011 12:38:59 +1000 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: Message-ID: 2011/1/21 Hongbao Chen : > Hey, guys > > I must argue that make **kwargs sorted is really a terrible idea. > Please think that when user wants a unsorted **kwargs, how can he or she > bring the original unsorted dict back? You can't get the original dict back now. > When a dict is unsorted, we can sort it with implementation in Python/C > interface or just python code for portability. That is what should be like! You can sort an ordered dict just as easily as you can sort a normal dict. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com |
Brisbane, Australia From steve at pearwood.info Fri Jan 21 04:42:13 2011 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 21 Jan 2011 14:42:13 +1100 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: Message-ID: <4D390095.4060303@pearwood.info> Hongbao Chen wrote: > Hey, guys > > I must argue that make **kwargs sorted is really a terrible idea. > Please think that when user wants a unsorted **kwargs, how can he or she > bring the original unsorted dict back? Despite the subject line, this is not about a *sorted* dict, but about an *ordered* dict. The difference is that an ordered dict remembers the order that elements are given: >>> pairs = list(zip('cat', (1,2,3))) >>> pairs [('c', 1), ('a', 2), ('t', 3)] >>> >>> dict(pairs) # order is lost {'a': 2, 'c': 1, 't': 3} >>> >>> from collections import OrderedDict as odict >>> odict(pairs) # but ordered dicts keep it OrderedDict([('c', 1), ('a', 2), ('t', 3)]) The problem is that keyword args are collected in an ordinary dictionary, which is unordered, so you lose the original order (except by accident): >>> odict(c=1, a=2, t=3) OrderedDict([('a', 2), ('c', 1), ('t', 3)]) So there is currently no way for functions to have arbitrary keyword arguments supplied in the order in which they were passed, because the order is lost when they are collected in a dict. -- Steven From rrr at ronadam.com Fri Jan 21 04:47:15 2011 From: rrr at ronadam.com (Ron Adam) Date: Thu, 20 Jan 2011 21:47:15 -0600 Subject: [Python-ideas] values in vs. values out In-Reply-To: <6F208934-E8D5-43D3-B077-B818F4CBC117@cern.ch> References: <27160EC2-67AD-4E4D-9937-80F9C88D0A51@cern.ch> <6F208934-E8D5-43D3-B077-B818F4CBC117@cern.ch> Message-ID: On 01/17/2011 09:04 AM, Luc Goossens wrote: > Hi all, > > Thanks to everybody for your feedback! > So I guess the answer to my question (which - I noticed just now - did not > end with a question mark), is ... no. 
> >> If your function is returning a bunch of related values in a tuple, and >> that tuple keeps changing as you re-design the code, that's a code smell. > > the use cases I have in mind are the functions that return a set of weakly > related values, or more importantly report on different aspects > of the calculation; > an example of the first is a divmod function that returns the div and the > mod while callers might only be interested in the div; > examples of the latter are the time it took to calculate the value, > possible warnings that were encountered, ... You could use a class instead of a function to get different variations on a function. >>> class DivMod: ... def div(self, x, y): ... return x//y ... def mod(self, x, y): ... return x%y ... def __call__(self, x, y): ... return x//y, x%y ... >>> dmod = DivMod() >>> dmod(100, 7) (14, 2) >>> dmod.div(100, 7) 14 >>> dmod.mod(100, 7) 2 Adding methods, to time and/or get warnings, should be fairly easy. If you do a bunch of these, you can make a base class and reuse the common parts. For timing, logging, and checking returned values of functions, decorators can be very useful. Cheers, Ron From rrr at ronadam.com Fri Jan 21 05:59:07 2011 From: rrr at ronadam.com (Ron Adam) Date: Thu, 20 Jan 2011 22:59:07 -0600 Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: <4D390095.4060303@pearwood.info> References: <4D390095.4060303@pearwood.info> Message-ID: On 01/20/2011 09:42 PM, Steven D'Aprano wrote: > Hongbao Chen wrote: >> Hey, guys >> >> I must argue that make **kwargs sorted is really a terrible idea. >> Please think that when user wants a unsorted **kwargs, how can he or she >> bring the original unsorted dict back? > > Despite the subject line, this is not about a *sorted* dict, but about > an *ordered* dict. The difference is that an ordered dict remembers the > order that elements are given: Yes, I was going to say that. 
:-) Presumably there are either performance or space reasons why an unordered dictionary is better (or both). These concepts do overlap; presumably an ordered dict can be sorted in place, rather than keeping a separate sorted list of keys. However, the ordered dictionary in the collections module doesn't do that. You sort it by sorting the items and then use those to create a new ordered dictionary. Lately I've been thinking it may be useful to have access to a separate self-contained function-signature object. As a direct plug-in to functions and methods, it may present some nice advantages when it comes to how things fit together in python. Cheers, Ron From microcore at yahoo.com.cn Fri Jan 21 03:28:23 2011 From: microcore at yahoo.com.cn (Hongbao Chen) Date: Fri, 21 Jan 2011 10:28:23 +0800 (CST) Subject: [Python-ideas] A sorted version of **kwargs In-Reply-To: References: Message-ID: <501780.71258.qm@web15107.mail.cnb.yahoo.com> Hi, By the way, a sorted **kwargs may slow down the speed of function invocation. But we do not know how often the function gets called, so we mustn't enforce the dict to be sorted, given its unpredictable effects on performance. I need to emphasize the UNPREDICTABLE. Hongbao Chen Software Engineering School Xi'an Jiaotong University Cell: +8613891979195 ________________________________ From: "python-ideas-request at python.org" To: python-ideas at python.org Sent: 2011/1/21 (Fri) 9:59:06 Subject: Python-ideas Digest, Vol 50, Issue 36 Send Python-ideas mailing list submissions to python-ideas at python.org To subscribe or unsubscribe via the World Wide Web, visit http://mail.python.org/mailman/listinfo/python-ideas or, via email, send a message with subject or body 'help' to python-ideas-request at python.org You can reach the person managing the list at python-ideas-owner at python.org When replying, please edit your Subject line so it is more specific than "Re: Contents of Python-ideas digest..." Today's Topics: 1.
Re: A sorted version of **kwargs (Don Spaulding) 2. Re: A sorted version of **kwargs (MRAB) 3. Re: A sorted version of **kwargs (Alexander Belopolsky) 4. Re: A sorted version of **kwargs (Nick Coghlan) 5. Re: A sorted version of **kwargs (Hongbao Chen) ---------------------------------------------------------------------- Message: 1 Date: Thu, 20 Jan 2011 14:26:23 -0600 From: Don Spaulding To: Bruce Leban Cc: python-ideas at python.org Subject: Re: [Python-ideas] A sorted version of **kwargs Message-ID: Content-Type: text/plain; charset="iso-8859-1" On Thu, Jan 20, 2011 at 2:05 PM, Bruce Leban wrote: > > On Thu, Jan 20, 2011 at 11:47 AM, Tim Delaney > wrote: > >> >> ['b':1] would then be ambiguous (appears to be a slice of a list). More >> obvious in the case of [1:2] ... >> > > We use parentheses for tuples and avoid the ambiguity by writing (1,). > In the same way, we could require your examples to be written ['b':1,] and > [1:2,] > Please, not this. I like the idea of syntactic support for the odict, but no need to spread the (foo,) syntax around. It's too easy to misinterpret when you do a quick scan of a body of code. -------------- next part -------------- An HTML attachment was scrubbed... URL: ------------------------------ Message: 2 Date: Thu, 20 Jan 2011 20:28:41 +0000 From: MRAB To: python-ideas Subject: Re: [Python-ideas] A sorted version of **kwargs Message-ID: <4D389AF9.5080608 at mrabarnett.plus.com> Content-Type: text/plain; charset=UTF-8; format=flowed On 20/01/2011 19:47, Tim Delaney wrote: > On 21 January 2011 06:13, Bruce Leban > wrote: > > Back to the original problem though: if the issue is that creating > an ordered dict is clumsy and perhaps interfering with adoption and > usage then perhaps the notation for ordered dict could be improved. > Just as we can now use {...} for both dicts and sets, perhaps we > could add > > [ 'b' : 1, 'a' : 2 ] > > as a more convenient way of writing > >
OrderedDict([('b', 1), ('a', 2)]) > > > This is parallel to the way that [1,2] is an ordered container while > {1,2} is unordered. > > > ['b':1] would then be ambiguous (appears to be a slice of a list). More > obvious in the case of [1:2] ... > [snip] In what way is it ambiguous? [1] isn't ambiguous, is it? spam[1] is subscripting and [1] is a list; spam[1 : 2] is slicing and [1 : 2] would be an ordered dict. ------------------------------ Message: 3 Date: Thu, 20 Jan 2011 15:57:34 -0500 From: Alexander Belopolsky To: Tim Delaney Cc: python-ideas at python.org Subject: Re: [Python-ideas] A sorted version of **kwargs Message-ID: Content-Type: text/plain; charset=ISO-8859-1 On Thu, Jan 20, 2011 at 2:47 PM, Tim Delaney wrote: .. > ['b':1] would then be ambiguous (appears to be a slice of a list). More > obvious in the case of [1:2] ... x[1:2], x[1:2,], and x[1:2, 3:4] are all valid syntaxes. (NumPy uses the latter for multi-dimensional slicing.) However, I don't see an ambiguity here. We don't have an ambiguity between tuple syntax and function calls: (a, b, c) vs. f(a, b, c). ------------------------------ Message: 4 Date: Fri, 21 Jan 2011 10:54:23 +1000 From: Nick Coghlan To: Guido van Rossum Cc: python-ideas at python.org, "M.-A. Lemburg" Subject: Re: [Python-ideas] A sorted version of **kwargs Message-ID: Content-Type: text/plain; charset=ISO-8859-1 On Fri, Jan 21, 2011 at 2:42 AM, Guido van Rossum wrote: > On Thu, Jan 20, 2011 at 6:05 AM, M.-A. Lemburg wrote: >> Nick Coghlan wrote: >>> Yeah, simply making the kwargs dict always ordered is likely the way >>> we would do it. That's also the only solution with any chance of >>> working by default with the way most decorators are structured >>> (accepting *args and **kwargs and passing them to the wrapped >>> function). >> >> -1. >> >> How often do you really need this ? >> >> In which of those cases wouldn't a static code analysis give you >> the call order of the parameters already ?
>> >> "Nice to have" is not good enough to warrant a slow down of >> all function calls involving keyword arguments, adding overhead >> for other Python implementations and possibly causing problems >> with 3rd party extensions relying on getting a PyDict for the >> keyword arguments object. > > What he says. I actually agree as well, but I was misremembering how the construction of the kwargs dict worked and hence was thinking that was the only possible way this could work (since the interpreter didn't know anything about the target function while building the dict). Actually checking with dis.dis and looking at the associated code in ceval.c corrected my misapprehension (the function is actually retrieved first, while the kwargs are still on the stack in the appropriate order, so it is theoretically possible for the function to influence how the args are stored). So, as an alternative proposal, perhaps it would be possible to add a new protocol that allowed a callable to flag that an ordered dictionary should be used for kwargs (e.g. an "__ordered_call__" boolean attribute, with C level flags to speed up the common builtin callable cases). A @functools.ordered_call decorator could then just do "__ordered_call__ = True" to set the flag appropriately. (You could also be even more flexible and devise a protocol that supported any type for kwargs, but I think that would just be far more complicated without a corresponding increase in expressive power) Ordinary calls that used "x=y" or "**d" would be slowed down marginally due to the check for the new attribute on the supplied callable, but that impact should be minimal (especially for the builtin cases, which would just be checking a C struct field) and other calls would be entirely unaffected. The CPython specific impact would largely be limited to update_keyword_args() and implementing C level fields with associated __ordered_call__ properties on the various builtin callable objects. 
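Nick's sketch above can be mimicked in pure Python along these lines. This is purely illustrative: `ordered_call` and `dispatch` are hypothetical names, and the real dispatch would happen inside ceval.c's keyword-argument handling rather than in Python code.

```python
from collections import OrderedDict

def ordered_call(func):
    # Hypothetical @functools.ordered_call decorator from the sketch
    # above: it just sets the flag the interpreter would check.
    func.__ordered_call__ = True
    return func

def dispatch(func, kw_pairs):
    # Illustrative stand-in for the interpreter's keyword-argument
    # handling.  kw_pairs holds the keyword arguments in call order,
    # as they would still sit on the stack; the callee's flag selects
    # the container type used for the kwargs mapping.
    if getattr(func, "__ordered_call__", False):
        kwargs = OrderedDict(kw_pairs)
    else:
        kwargs = dict(kw_pairs)
    # The real interpreter would do func(**kwargs); the simulation
    # passes the container directly so the callee sees its type.
    return func(kwargs)

@ordered_call
def record(kwargs):
    return list(kwargs)

print(record.__ordered_call__)                           # True
print(dispatch(record, [("b", 1), ("a", 2), ("c", 3)]))  # ['b', 'a', 'c']
```

Callables without the flag would be dispatched exactly as today, which is what keeps the common case unaffected.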
There would also need to be a new variant of PyEval_EvalCodeEx that either used an ordered dictionary for kwdict, or else allowed kwdict to be created and passed in by the calling code rather than implicitly created via PyDict_New(). A builtin version of collections.OrderedDict would still be a precursor to this idea though, so creating a _collections.OrderedDict C implementation still sounds like the right starting point for anyone that is particularly keen to see this idea progress. Regards, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia ------------------------------ Message: 5 Date: Fri, 21 Jan 2011 09:58:41 +0800 From: "Hongbao Chen" To: Subject: Re: [Python-ideas] A sorted version of **kwargs Message-ID: Content-Type: text/plain; charset="gb2312" Hey, guys I must argue that make **kwargs sorted is really a terrible idea. Please think that when user wants a unsorted **kwargs, how can he or she bring the original unsorted dict back? When a dict is unsorted, we can sort it with implementation in Python/C interface or just python code for portability. That is what should be like! Give users more control over behavior of dict. That is what I propose. Best regards Hongbao Chen XJTU ------------------------------ _______________________________________________ Python-ideas mailing list Python-ideas at python.org http://mail.python.org/mailman/listinfo/python-ideas End of Python-ideas Digest, Vol 50, Issue 36 ******************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL: From rrr at ronadam.com Tue Jan 25 04:38:17 2011 From: rrr at ronadam.com (Ron Adam) Date: Mon, 24 Jan 2011 21:38:17 -0600 Subject: [Python-ideas] Location of tests for packages In-Reply-To: References: Message-ID: Moving these suggestions to python ideas as they are not immediately relevant to the current python dev discussion at this time. ;-) I'm really just wondering what others think, is this something worth working on? On 01/24/2011 01:46 PM, Raymond Hettinger wrote: > Right now, the tests for the unittest package are under the package > directory instead of Lib/test where we have most of the other tests. > > There are some other packages that do the same thing, each for their own > reason. > > I think we should develop a strong preference for tests going under > Lib/test unless there is a very compelling reason. > * For regrtest to work, there still needs to be some file in Lib/test > that dispatches to the alternate test directory. Currently tests are mostly separate from the modules. Mostly separate because, some modules have doctests in them, and/or a test() function to run tests. But the test function name isn't special in any way as far as I know. It's usually _test(), but it could be something else.
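The informal _test() convention Ron mentions typically looks something like this (a minimal illustrative module, not taken from the thread):

```python
"""Toy module showing the informal convention: doctests live in the
module, and a private _test() helper runs them."""

def mod_pow(base, exp, m):
    """Compute (base ** exp) % m.

    >>> mod_pow(2, 10, 1000)
    24
    """
    return pow(base, exp) % m

def _test():
    # Run the module's doctests; returns a (failed, attempted) result.
    import doctest
    return doctest.testmod()

if __name__ == "__main__":
    _test()
```

Because nothing about the name _test() is standardized, external tools have no reliable way to find and call it, which is the gap the proposal below is aimed at.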
Would it help things to have a special __test__ name? i.e. special to python, in that module.__test__() can be depended on to run the tests for that module? (or package) I've found it useful to add a -T option to my own python applications to run tests. But moving it from being a module option to a python option would make that even nicer. Where python -T modulename called the __test__() function. (or invoked the tests in another way.) With a dependable way to invoke the tests no matter where they are located, it then becomes just a matter of iterating the list of modules in the library (or a directory) to run all the tests. But then, what's the best way to actually do that? The current method uses pattern matching ... (not my first choice for sure) python -m unittest discover -s project_directory -p '*_test.py' python -m unittest discover project_directory '*_test.py' Those lines are way too long in my opinion. In the above, the test functions need to begin with an underscore. I'm not sure the discoverable test function should be private. The very act of finding them suggests they should be a public, but special, API rather than a private API to me. If they were only accessed from inside the module or package they are in, then being private makes sense. One choice is that __test__ is special to doctest and unittest only. python -m unittest module # runs module.__test__() if it exists. Or a little more far out ... The __test__() function could be called with "python -T module". But then it seems that maybe we could also have a __main__() function be called with "python -M module". (?) Alternately ... "python -T module" could alter the name of the module. if __name__ == "__main__": main() elif __name__ == "__test__": _test() # a private name is ok here.
Cheers, Ron From ddasilva at umd.edu Thu Jan 27 19:03:18 2011 From: ddasilva at umd.edu (Daniel da Silva) Date: Thu, 27 Jan 2011 13:03:18 -0500 Subject: [Python-ideas] Adding threading.RepeatTimer class Message-ID: We have a threading.Timer class which executes after a given delay and then terminates. I was thinking about a pre-made class that would execute, wait an interval, and then repeat. It would cut down on the logic people need to implement themselves, and make simple scripts faster. Here are some use cases: # Send a ping message to all connected clients every 120 seconds from server import ping_all_clients pinger = threading.RepeatTimer(120, ping_all_clients) pinger.start() # Check for updates every 3 hours from mymodule import check_for_updates update_checker = threading.RepeatTimer(60*60*3, check_for_updates) update_checker.start() I was thinking of the class having an initializer signature as follows: class threading.RepeatTimer(interval, function, args=[], kwargs={}, limit=None) Create a timer that will run function with args and kwargs and repeat every interval seconds. If limit is an integer, limits repetitions to that many calls. cancel() Stop the repeat timer. If in the middle of a call, the call is allowed to finish. Daniel -------------- next part -------------- An HTML attachment was scrubbed... URL: From brian.curtin at gmail.com Thu Jan 27 23:12:44 2011 From: brian.curtin at gmail.com (Brian Curtin) Date: Thu, 27 Jan 2011 16:12:44 -0600 Subject: [Python-ideas] Adding threading.RepeatTimer class In-Reply-To: References: Message-ID: On Thu, Jan 27, 2011 at 12:03, Daniel da Silva wrote: > We have a threading.Timer class which executes after a given delay and then > terminates. I was thinking about a pre-made class that would execute, wait > an interval, and then repeat. It would cut down on the logic people need to > implement themselves, and make simple scripts faster.
> Here are some use cases:
>
> # Send a ping message to all connected clients every 120 seconds
> from server import ping_all_clients
> pinger = threading.RepeatTimer(120, ping_all_clients)
> pinger.start()
>
> # Check for updates every 3 hours
> from mymodule import check_for_updates
> update_checker = threading.RepeatTimer(60*60*3, check_for_updates)
> update_checker.start()
>
> I was thinking of the class having an initializer signature as follows:
>
> class threading.RepeatTimer(interval, function, args=[], kwargs={}, limit=None)
>     Create a timer that will run function with args and kwargs and
>     repeat every interval seconds. If limit is an integer, limits
>     repetitions to that many calls.
>
> cancel()
>     Stop the repeat timer. If in the middle of a call, the call is
>     allowed to finish.
>
> Daniel

I'm not sure this is a good fit for the standard library. There's too
much that the implementer might want to customize for their repeating
needs.

For your pinger, what if a ping fails? Keep pinging while it fails?
Time out for a bit and try again? Increasing timeouts? Just exit? I
don't think a repeating timer has a general enough use case to be
included.

The following will likely do what you want.

import threading

class RepeatTimer(threading.Thread):
    def __init__(self, interval, callable, *args, **kwargs):
        threading.Thread.__init__(self)
        self.interval = interval
        self.callable = callable
        self.args = args
        self.kwargs = kwargs
        self.event = threading.Event()
        self.event.set()

    def run(self):
        while self.event.is_set():
            t = threading.Timer(self.interval, self.callable,
                                self.args, self.kwargs)
            t.start()
            t.join()

    def cancel(self):
        self.event.clear()
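[Editor's note: for comparison, a single-thread variant along the lines of Daniel's proposed signature (including the limit argument, which the sketch above omits) can use Event.wait as an interruptible sleep instead of spawning a new Timer thread per call. This is only an illustrative sketch, not a stdlib API:]

```python
import threading

class RepeatTimer(threading.Thread):
    """Call function(*args, **kwargs) every interval seconds.

    Sketch of the proposed API: an optional limit caps the number of
    calls, and cancel() stops the timer within one interval.
    """

    def __init__(self, interval, function, args=(), kwargs=None, limit=None):
        threading.Thread.__init__(self)
        self.interval = interval
        self.function = function
        self.args = args
        self.kwargs = kwargs if kwargs is not None else {}
        self.limit = limit
        self._finished = threading.Event()

    def run(self):
        calls = 0
        # Event.wait doubles as an interruptible sleep: it returns True
        # as soon as cancel() sets the event, ending the loop early.
        while not self._finished.wait(self.interval):
            self.function(*self.args, **self.kwargs)
            calls += 1
            if self.limit is not None and calls >= self.limit:
                break

    def cancel(self):
        self._finished.set()
```

One design difference from the Timer-per-call version: a single thread services every repetition, so no extra call can fire after cancel() returns.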
From alex.gronholm at nextday.fi  Thu Jan 27 23:37:44 2011
From: alex.gronholm at nextday.fi (Alex Grönholm)
Date: Fri, 28 Jan 2011 00:37:44 +0200
Subject: [Python-ideas] Adding threading.RepeatTimer class
In-Reply-To: 
References: 
Message-ID: <4D41F3B8.4050506@nextday.fi>

On 27.01.2011 20:03, Daniel da Silva wrote:
> We have a threading.Timer class which executes after a given delay and
> then terminates. I was thinking about a pre-made class that would
> execute, wait an interval, and then repeat. It would cut down on the
> logic people need to implement themselves, and make simple scripts
> faster.

For that purpose you can use APScheduler's interval scheduling:
http://packages.python.org/APScheduler/

Granted, it's not a part of the standard library, but it would fulfill
the purpose stated above.

> Here are some use cases:
>
> # Send a ping message to all connected clients every 120 seconds
> from server import ping_all_clients
> pinger = threading.RepeatTimer(120, ping_all_clients)
> pinger.start()
>
> # Check for updates every 3 hours
> from mymodule import check_for_updates
> update_checker = threading.RepeatTimer(60*60*3, check_for_updates)
> update_checker.start()
>
> I was thinking of the class having an initializer signature as follows:
>
> class threading.RepeatTimer(interval, function, args=[], kwargs={}, limit=None)
>     Create a timer that will run function with args and kwargs and
>     repeat every interval seconds. If limit is an integer, limits
>     repetitions to that many calls.
>
> cancel()
>     Stop the repeat timer. If in the middle of a call, the call is
>     allowed to finish.
>
> Daniel
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
From lists at cheimes.de  Fri Jan 28 21:38:56 2011
From: lists at cheimes.de (Christian Heimes)
Date: Fri, 28 Jan 2011 21:38:56 +0100
Subject: [Python-ideas] Adding threading.RepeatTimer class
In-Reply-To: 
References: 
Message-ID: 

On 27.01.2011 23:12, Brian Curtin wrote:
> import threading
>
> class RepeatTimer(threading.Thread):
>     def __init__(self, interval, callable, *args, **kwargs):
>         threading.Thread.__init__(self)
>         self.interval = interval
>         self.callable = callable
>         self.args = args
>         self.kwargs = kwargs
>         self.event = threading.Event()
>         self.event.set()
>
>     def run(self):
>         while self.event.is_set():
>             t = threading.Timer(self.interval, self.callable,
>                                 self.args, self.kwargs)
>             t.start()
>             t.join()
>
>     def cancel(self):
>         self.event.clear()

CherryPy has two simpler implementations for a PerpetualTimer and a
BackgroundTask:
http://cherrypy.org/browser/trunk/cherrypy/process/plugins.py#L424

The BackgroundTask is less CPU consuming under Python 2.x. I guess the
extended threading functions in 3.2 make the implementation obsolete.

Christian

From ncoghlan at gmail.com  Mon Jan 31 13:23:36 2011
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 31 Jan 2011 22:23:36 +1000
Subject: [Python-ideas] [Python-Dev] Byte code arguments from two to one byte: did anyone try this?
In-Reply-To: 
References: 
Message-ID: 

On Mon, Jan 31, 2011 at 7:17 PM, Jurjen N.E. Bos wrote:
> Did anyone try this already? If not, I might take up the gauntlet
> and try it myself, but I never did this before...

a) This is more on topic for python-ideas rather than python-dev (cc
changed accordingly)

b) My off-the-top-of-my-head guess is that caching effects will swamp
any impact such a change might have. The only way to find out for sure
is going to be for someone to try it (along the lines of the already
mentioned WPython variant) and see what happens for benchmarks like
pybench and the execution times of various test suites.

Cheers,
Nick.

-- 
Nick Coghlan   |
ncoghlan at gmail.com   |   Brisbane, Australia
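[Editor's note: before modifying the interpreter, the question of how often bytecode arguments would overflow a single byte can be answered empirically by scanning compiled code. A rough sketch using the dis module (dis.get_instructions was added later, in Python 3.4, so this would not run on the interpreters discussed in the thread):]

```python
import dis
import types

def arg_size_stats(code):
    """Count bytecode arguments that fit in one byte vs. those that don't.

    Recurses into nested code objects (function bodies, comprehensions).
    If nearly all oparg values are below 256, shrinking the argument
    field would mostly save space rather than decoding work.
    """
    small = large = 0
    for instr in dis.get_instructions(code):
        if instr.arg is not None:
            if instr.arg < 256:
                small += 1
            else:
                large += 1
    # Nested functions and classes live in co_consts as code objects.
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            s, l = arg_size_stats(const)
            small += s
            large += l
    return small, large
```

Running this over the compiled standard library would give the distribution Nick suggests measuring, without touching the eval loop at all.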