From p.f.moore at gmail.com  Thu Mar 1 00:08:40 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 29 Feb 2012 23:08:40 +0000
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To: <3395104A-4BE9-4372-BC6F-23AB5EA89A85@me.com>
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com> <4F4E7CA8.7010109@stoneleaf.us> <9611190B-6909-4037-9A1D-2ED701D38E3C@me.com> <4F4E82B2.5050809@stoneleaf.us> <3395104A-4BE9-4372-BC6F-23AB5EA89A85@me.com>
Message-ID:

On 29 February 2012 21:47, Craig Yoshioka wrote:
> I also think that when people use non-builtin contextmanagers it's usually within a very specific... context (*dammit*), and so they are likely to look up why they are using an object as a context manager. That's where you would document the behavior:
>
> with uncached(path):
>     # code here only executes if the path does not exist

Personally, I would *always* assume that the body of the with statement executes. That's what the with statement does, and I would be very surprised to see something different happen. Even with the comment, I'd be surprised.

Paul.

From steve at pearwood.info  Thu Mar 1 01:00:03 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 01 Mar 2012 11:00:03 +1100
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com>
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com>
Message-ID: <4F4EBC03.8050609@pearwood.info>

Craig Yoshioka wrote:
> I've tried classes, decorators, and passing the conditional using 'as', as suggested by Michael, so I disagree that with is not suitable here since I have yet to find a better alternative. If you want I can give pretty concrete examples in the ways they aren't as good. Furthermore, I think it could be argued that it makes more sense to be able to safely skip the with body without the user of the with statement having to manually catch the exception themselves.... we don't make people catch the StopIteration exception manually when using iterators...

Sometimes we do. When you call next(it) manually, you are responsible for catching the exception manually. It is only flow-control tools (e.g. for loops, list comprehensions) that catch the exception for you.

> 1) I can't think of many instances in python where a block of code can not be conditionally executed safely:
> if - obvious
> functions - need to be called
> loops - can have 0 or more iterations
> try/except/finally - even here there is the same notion of the code blocks being conditionally executed, just a bit more scrambled

Classes are blocks of code. If you want to conditionally skip executing all, or part, of a class block you wrap it in an if block. (Or try...except.) This is as it should be: the class statement is not a flow control statement.

Nor is the def statement. In this case, the right concept is not in *calling* the function body, but in *compiling* the function body. Again, if you want to conditionally skip compiling all or part of the function body, you have to wrap it in a flow control structure, if or try.

We don't have any concept of:

class K: body

def f(): body

where something inside the bodies invisibly determines whether or not the K block executes or the f block compiles.

-- 
Steven

From steve at pearwood.info  Thu Mar 1 02:26:51 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Thu, 01 Mar 2012 12:26:51 +1100
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To: <9611190B-6909-4037-9A1D-2ED701D38E3C@me.com>
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com> <4F4E7CA8.7010109@stoneleaf.us> <9611190B-6909-4037-9A1D-2ED701D38E3C@me.com>
Message-ID: <4F4ED05B.7040908@pearwood.info>

Craig Yoshioka wrote:
> with uncached('file') as file:
>     if not file: return
>
> which isn't so bad, except it is overloading the meaning of file a bit, and

No, I don't agree with that. It is normal Pythonic idiom for objects to be interpreted in a boolean context. The only difference here is that sometimes file will be (presumably) a file-like object, and sometimes it needs to be a sentinel like None.

But this shouldn't disturb you: you have already suggested one possible implementation would be for __enter__ to return a special value, SkipWithBlock, to force skipping the block. That's fine: you can have this functionality *right now*. All you need do is change the spelling from SkipWithBlock to None, and use an explicit test inside the block rather than an implicit test, and you're done. Best of all, the use of an explicit test means you can do this:

with whatever('file') as file:
    if file is None:
        log('error')
    else:
        normal_processing()

which you can't do with your suggested implicit test-and-skip.

The fact that with blocks are not flow-control is a feature, not a bug.

> why shouldn't the with block be skippable?

It is skippable. Like every other code block, it is skippable by wrapping it in flow-control code to decide whether or not to skip it.

-- 
Steven

From craigyk at me.com  Thu Mar 1 01:46:58 2012
From: craigyk at me.com (Craig Yoshioka)
Date: Wed, 29 Feb 2012 16:46:58 -0800
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To: <4F4EBC03.8050609@pearwood.info>
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com> <4F4EBC03.8050609@pearwood.info>
Message-ID:

On Feb 29, 2012, at 4:00 PM, Steven D'Aprano wrote:
> Sometimes we do. When you call next(it) manually, you are responsible for catching the exception manually. It is only flow-control tools (e.g. for loops, list comprehensions) that catch the exception for you.

I know that. I'm arguing that 'with' could catch the for loop equivalent of StopIteration. Or at least that's one implementation.

> Classes are blocks of code. If you want to conditionally skip executing all, or part, of a class block you wrap it in an if block. (Or try...except.) This is as it should be: the class statement is not a flow control statement.
>
> Nor is the def statement. In this case, the right concept is not in *calling* the function body, but in *compiling* the function body. Again, if you want to conditionally skip compiling all or part of the function body, you have to wrap it in a flow control structure, if or try.
>
> We don't have any concept of:
>
> class K: body
>
> def f(): body
>
> where something inside the bodies invisibly determines whether or not the K block executes or the f block compiles.

I'm just making an observation that there aren't many other places where an 'independent' block is 'guaranteed' to run. I'm not arguing that something inside a block is magically preventing execution of the block. The with keyword is 'outside' the block. From within a block you can always return/break early anyways.
Think about it:

def test():
    # block is only executed if function is called

if True:
    # block is only executed if True

for x in items:
    # block is executed only for each item

class T(object):
    def m(self):
        # if method called

but I find it surprising that all of a sudden it's correct to assume:

with x:
    # it's obvious this block will always run

that was not my assumption when I first became aware of 'with'. For me it only meant that the indented block ran within some sort of context that was being managed for me. Not saying the current use is wrong (though I do find it inconvenient and inconsistent), just that, other than 'with' I'm trying to think of a situation in which I'd put code in a new block where that code is guaranteed to run. I suppose I assumed that contexts were fancier if statements that were guaranteed to create and clean up various aspects of state.

>
> -- 
> Steven

From jeanpierreda at gmail.com  Thu Mar 1 02:56:45 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Wed, 29 Feb 2012 20:56:45 -0500
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To:
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com> <4F4E7CA8.7010109@stoneleaf.us> <9611190B-6909-4037-9A1D-2ED701D38E3C@me.com> <4F4E82B2.5050809@stoneleaf.us> <3395104A-4BE9-4372-BC6F-23AB5EA89A85@me.com>
Message-ID:

On Wed, Feb 29, 2012 at 6:08 PM, Paul Moore wrote:
> Personally, I would *always* assume that the body of the with statement executes. That's what the with statement does, and I would be very surprised to see something different happen. Even with the comment, I'd be surprised.

with open('/path/that/doesnt/exist.txt', 'r') as f:
    # code that doesn't get executed
    pass

-- Devin

From ncoghlan at gmail.com  Thu Mar 1 03:15:49 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 1 Mar 2012 12:15:49 +1000
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To:
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com> <4F4EBC03.8050609@pearwood.info>
Message-ID:

On Thu, Mar 1, 2012 at 10:46 AM, Craig Yoshioka wrote:
> I'm just making an observation that there aren't many other places where an 'independent' block is 'guaranteed' to run. I'm not arguing that something inside a block is magically preventing execution of the block. The with keyword is 'outside' the block. From within a block you can always return/break early anyways.

There are two other cases where a suite is guaranteed to at least start executing, and one of them is the only case that matters here: "try" blocks. The flow control behaviour of "with:" is consistent with the behaviour of a "try:" block, and that is not an accident.

Conditional execution of a try block requires a separate if statement or throwing an exception that is caught and suppressed by one of the exception handlers. Similarly, conditional execution of a with block requires a separate if statement or throwing an exception that is caught and suppressed by one of the context managers. Hence, arguments of language consistency aren't going to get you anywhere here.

Guido's verdict on PEP 377 was that the payoff (avoiding a separate if statement or a method call that deliberately throws an appropriate exception in certain niche use cases) wasn't worth the additional complexity in the with statement definition. I now agree with him.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
From craigyk at me.com  Thu Mar 1 03:21:23 2012
From: craigyk at me.com (Craig Yoshioka)
Date: Wed, 29 Feb 2012 18:21:23 -0800
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To: <4F4ED05B.7040908@pearwood.info>
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com> <4F4E7CA8.7010109@stoneleaf.us> <9611190B-6909-4037-9A1D-2ED701D38E3C@me.com> <4F4ED05B.7040908@pearwood.info>
Message-ID: <7C4A0760-61AA-4F7E-8C63-F2FFAD33128D@me.com>

> The fact that with blocks are not flow-control is a feature, not a bug.
>
>> why shouldn't the with block be skippable?
>
> It is skippable. Like every other code block, it is skippable by wrapping it in flow-control code to decide whether or not to skip it.

I'm not claiming the current functionality is a bug, just unfortunate.

From luoyonggang at gmail.com  Thu Mar 1 04:10:32 2012
From: luoyonggang at gmail.com (=?UTF-8?B?572X5YuH5YiaKFlvbmdnYW5nIEx1bykg?=)
Date: Thu, 1 Mar 2012 11:10:32 +0800
Subject: [Python-ideas] Is that possible to extract the distutils subcommand's argv and calling other commands?
Message-ID:

For example, I am running a command like

    python setup.py test -f install

I want to pass the -f argv to unittest; how can I do that? And after unittest, I want the install command to still run.

-- 
Yours sincerely,
Yonggang Luo

From craigyk at me.com  Thu Mar 1 04:28:45 2012
From: craigyk at me.com (Craig Yoshioka)
Date: Wed, 29 Feb 2012 19:28:45 -0800
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To:
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com> <4F4EBC03.8050609@pearwood.info>
Message-ID:

I agree about 'try' being another example of a, at least not immediately, flow-controlled block; I think I mentioned as much in a different response. But you're not going to win me over by saying that the #1 goal of 'with' is to best match the flow-control semantics of try|except|finally. The #1 goal, I would have hoped, was to abstract away the pattern of using try:except:finally to manage a block of code that does IO stuff. And the current 'with' doesn't even perfectly match the semantics of try|except|finally, because if it did there wouldn't be uses of try|except|finally that I can think of that don't have a with equivalent.

try:                            # boilerplate
    while True:                 # boilerplate
        if alreadyCached():     # boilerplate
            break               # boilerplate
        try:                    # boilerplate
            acquireLock()       # boilerplate
            doSomeStuff()       # <- code that does
            doMoreStuff()       # <- stuff goes here
            alreadyCached(True) # boilerplate
        except AlreadyLocked:   # boilerplate
            sleep()             # boilerplate
finally:                        # boilerplate
    cleanUpMyLockIfItExists()   # boilerplate

There isn't a 'with' equivalent to the above that hides all the unnecessary context state and boilerplate from the enclosed block, therefore the 'client' code must know and remember to check the appropriate flags every time, and if they don't the context's functionality may be broken.

if not alreadyCached():          # <- check before we execute context block
    with acquireCache() as cache:
        ...                      # <- to run the
        ...                      # code they put in here

with acquireCache() as cache:    # <- slightly better than above
    if not cache:                # <- but may still 'break' the context if they forget this
        ...                      # <- to run the
        ...                      # code they put in here

On Feb 29, 2012, at 6:15 PM, Nick Coghlan wrote:
> There are two other cases where a suite is guaranteed to at least start executing, and one of them is the only case that matters here: "try" blocks.
> The flow control behaviour of "with:" is consistent with the behaviour of a "try:" block, and that is not an accident.
>
> Conditional execution of a try block requires a separate if statement or throwing an exception that is caught and suppressed by one of the exception handlers. Similarly, conditional execution of a with block requires a separate if statement or throwing an exception that is caught and suppressed by one of the context managers. Hence, arguments of language consistency aren't going to get you anywhere here.
>
> Guido's verdict on PEP 377 was that the payoff (avoiding a separate if statement or a method call that deliberately throws an appropriate exception in certain niche use cases) wasn't worth the additional complexity in the with statement definition. I now agree with him.
>
> Regards,
> Nick.
>
> -- 
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From craigyk at me.com  Thu Mar 1 06:32:19 2012
From: craigyk at me.com (Craig Yoshioka)
Date: Wed, 29 Feb 2012 21:32:19 -0800
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To:
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com> <4F4EBC03.8050609@pearwood.info> <08D38B07-A393-4895-A0A6-F47EDB82B812@me.com>
Message-ID: <212AF954-846A-47A0-BE20-6432625C8817@me.com>

On Feb 29, 2012, at 8:01 PM, Nick Coghlan wrote:
> On Thu, Mar 1, 2012 at 1:27 PM, Craig Yoshioka wrote:
>> But you're not going to win me over by saying that the #1 goal of 'with' is to best match the flow-control semantics of try|except|finally.
>
> Um, you have that exactly backwards. *You* are the one agitating for change, therefore the onus is on *you* to demonstrate that the benefits exceed the costs.

I am aware that I need to motivate against a lot of inertia... 'win me over' was just a statement that that particular line of argument was unlikely to make me concede, since I had already thought it through and still came to a different conclusion.

> Read PEP 343 - the with statement is a tool created specifically to factor out try/except/finally statements where the body of the try block is the only part that changes. The fact that it has other use cases as well was an added bonus.

OK, then abstracting the concept of code running in a managed context wasn't the primary goal. That's too bad.

> *You* are the one suggesting that it be enhanced to do more, *despite* there being an existing PEP explicitly rejecting your proposed enhancements as introducing unnecessary complexity. I am merely telling you that the arguments you have presented so far are not persuasive, since they add nothing beyond those that I included when I wrote PEP 377.

except PEP 377 failed partially because there wasn't a concrete real-world example of where it would be useful. Thus, I thought presenting such an example might be enough to bring it up again.

> Unless you have some compelling new information to present that can clearly be handled *only* by changing the way the with statement works, you're not going to win this one. Additional complexity is inherently bad, so, without presentation of clear gains in expressiveness and readability, the status quo will always win by default (See http://www.boredomandlaziness.org/2011/02/status-quo-wins-stalemate.html & http://www.boredomandlaziness.org/2011/02/justifying-python-language-changes.html)

I gathered that that was your real argument, I just wanted it out. So it's just a matter of whether my example is compelling *enough*.
I think my example is *compelling*, but can concede just one example still might not be good enough. As to the added 'complexity': There's the code complexity. I suspect that two or three lines of code would do it, but I'm still trying to interpret the Python compiler code, as this is the first time I've looked at it. Then there's the conceptual complexity of the new behavior, but if all current code works the same, and there isn't a noticeable performance penalty, and people are extremely unlikely to run into context managers that use the new functionality anyways, I don't see that 'complexity' being an issue either. I'm going to try and make the modifications and do some testing so I can make a stronger argument on the latter point. I use a lot of third-party packages to test against, and have close to 1M lines of internal python code as well.

From guido at python.org  Thu Mar 1 06:49:52 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 29 Feb 2012 21:49:52 -0800
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To: <212AF954-846A-47A0-BE20-6432625C8817@me.com>
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com> <4F4EBC03.8050609@pearwood.info> <08D38B07-A393-4895-A0A6-F47EDB82B812@me.com> <212AF954-846A-47A0-BE20-6432625C8817@me.com>
Message-ID:

Hey, can we just let this PEP rest in peace? It was already rejected once. I don't see even a shimmer of support for reverting that decision. Please stop wasting everyone's time, Craig. Focus your energy somewhere productive.

-- 
--Guido van Rossum (python.org/~guido)

From phd at phdru.name  Thu Mar 1 09:42:00 2012
From: phd at phdru.name (Oleg Broytman)
Date: Thu, 1 Mar 2012 12:42:00 +0400
Subject: [Python-ideas] Is that possible to extract the distutils subcommand's argv and calling other commands?
In-Reply-To:
References:
Message-ID: <20120301084200.GA24146@iskra.aviel.ru>

Hi! This mailing list is for discussing new ideas in Python. Please ask general python lists/newsgroups/forums.

On Thu, Mar 01, 2012 at 11:10:32AM +0800, Yonggang Luo wrote:
> For example, I am running a command like
>     python setup.py test -f install
> I want to pass the -f argv to unittest; how can I do that? And after unittest, I want the install command to still run.

Oleg.
-- 
Oleg Broytman http://phdru.name/ phd at phdru.name
Programmers don't die, they just GOSUB without RETURN.

From jimjjewett at gmail.com  Thu Mar 1 17:39:53 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 1 Mar 2012 11:39:53 -0500
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To: <212AF954-846A-47A0-BE20-6432625C8817@me.com>
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com> <4F4EBC03.8050609@pearwood.info> <08D38B07-A393-4895-A0A6-F47EDB82B812@me.com> <212AF954-846A-47A0-BE20-6432625C8817@me.com>
Message-ID:

On Thu, Mar 1, 2012 at 12:32 AM, Craig Yoshioka wrote:
> except PEP 377 failed partially because there wasn't a concrete real-world example of where it would be useful.

But do you have one? I understand that you want the context to make the decision, instead of counting on a caller to do it properly. But why can't you do that with a decorator instead of a context? I think your examples are covered by http://wiki.python.org/moin/PythonDecoratorLibrary#Memoize

Is it just that you want the stuff inside both your function and your context manager/decorator to have access to the same locals, and don't want to use a closure and/or pass around a dict?
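By a decorator, I mean something along these lines -- a rough, untested sketch, where is_cached() and load_cached() are made-up stand-ins for whatever check your context manager performs:

import functools

def uncached(func):
    """Call func(path, ...) only if its result isn't already cached."""
    @functools.wraps(func)
    def wrapper(path, *args, **kwargs):
        if is_cached(path):           # hypothetical cache check
            return load_cached(path)  # hypothetical
        return func(path, *args, **kwargs)
    return wrapper

@uncached
def rebuild(path):
    ...  # the body the with statement would have guarded

The decorator, not the body, decides whether the body runs -- which sounds like the control you wanted from __enter__.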
-jJ

From dreamingforward at gmail.com  Fri Mar 2 03:31:57 2012
From: dreamingforward at gmail.com (Mark Janssen)
Date: Thu, 1 Mar 2012 19:31:57 -0700
Subject: [Python-ideas] doctest
In-Reply-To:
References:
Message-ID:

On Mon, Feb 27, 2012 at 3:59 PM, Michael Foord wrote:
> On 27 February 2012 20:35, Michael Foord wrote:
>> Personally I think there are several fundamental problems with doctest *as a unit testing tool*. doctest is *awesome* for testing documentation examples but in particular this one:
>>
>> * Every line becomes an assertion - in a unit test you typically follow the arrange -> act -> assert pattern. Only the results of the *assertion* are relevant to the test. (Obviously unexpected exceptions at any stage are relevant....). With doctest you have to take care to ensure that the exact output of *every line* of your arrange and act steps also match, even if they are irrelevant to your assertion. (The arrange and act steps will often include lines where you are creating state, and their output is irrelevant so long as they put the right things in place.)
>>
>> The particular implementation of doctest means that there are additional, potentially resolvable problems that are also a damn nuisance in a unit testing fail:
>
> Jeepers, I changed direction mid-sentence there. It should have read something along the lines of:
>
> As well as fundamental problems, the particular implementation of doctest suffers from these potentially resolvable problems:
>
>> The problem of being dependent on order of unorderable types (actually very difficult to solve).

I just thought of something: isn't the obvious solution for doctest to test the type of an expression's output and if it's in the set of unordered types {set, dict} to run sort() on it? Then the docstring author can just put the (sorted) output of what's expected....

Perhaps I'm hashing a dead horse, but I really would like to see this added to the issue tracker as a requested feature. I may code up the patch myself, but it helps my brain to have it "on the dev stack".

mark

From jeanpierreda at gmail.com  Fri Mar 2 03:49:59 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Thu, 1 Mar 2012 21:49:59 -0500
Subject: [Python-ideas] doctest
In-Reply-To:
References:
Message-ID:

On Thu, Mar 1, 2012 at 9:31 PM, Mark Janssen wrote:
> I just thought of something: isn't the obvious solution for doctest to test the type of an expression's output and if it's in the set of unordered types {set, dict} to run sort() on it? Then the docstring author can just put the (sorted) output of what's expected....

It's the solution people seem to think of first, so it's definitely fairly obvious. It's not a big improvement, though. The original problem is that dict order shouldn't matter, but in doctest it does, making dicts unusable in the normal doctest style. Making a specific dict order the prominent one lets you use dicts in doctest, but you have to sort the dicts and rewrite the doctest to take that into account. And even so, some dicts cannot be represented this way. For example, the dict {1: 2, "hello": "world"} cannot be sorted in Python 3, so it won't work in this scheme.

The solution I used was to ast.literal_eval on both sides every time you compare output. This way you don't have to care about dict order, or whitespace, or any of these things for Python objects.
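In code, the heart of that check is something like this -- a simplified, untested sketch (the real doctest2 logic handles more than this):

import ast

def outputs_match(want, got):
    """Compare two output strings as Python literals when possible."""
    try:
        return ast.literal_eval(want) == ast.literal_eval(got)
    except (SyntaxError, ValueError):
        return want == got  # not literals: plain string comparison

so {'a': 1, 'b': 2} and {'b': 2, 'a': 1} compare equal, whatever order either happens to be printed in.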
(There is a flag for not caring about whitespace, but this would inappropriately collapse e.g. the whitespace inside string literals inside a dict. So imo this kills two birds with one stone).

-- Devin

From guido at python.org  Fri Mar 2 04:31:24 2012
From: guido at python.org (Guido van Rossum)
Date: Thu, 1 Mar 2012 19:31:24 -0800
Subject: [Python-ideas] doctest
In-Reply-To:
References:
Message-ID:

On Thu, Mar 1, 2012 at 6:31 PM, Mark Janssen wrote:
> I just thought of something: isn't the obvious solution for doctest to test the type of an expression's output and if it's in the set of unordered types {set, dict} to run sort() on it? Then the docstring author can just put the (sorted) output of what's expected....

What if the output is a list of dicts?

-- 
--Guido van Rossum (python.org/~guido)

From dreamingforward at gmail.com  Fri Mar 2 04:39:31 2012
From: dreamingforward at gmail.com (Mark Janssen)
Date: Thu, 1 Mar 2012 20:39:31 -0700
Subject: [Python-ideas] doctest
In-Reply-To:
References:
Message-ID:

On Thu, Mar 1, 2012 at 8:31 PM, Guido van Rossum wrote:
> On Thu, Mar 1, 2012 at 6:31 PM, Mark Janssen wrote:
>> I just thought of something: isn't the obvious solution for doctest to test the type of an expression's output and if it's in the set of unordered types {set, dict} to run sort() on it? Then the docstring author can just put the (sorted) output of what's expected....
>
> What if the output is a list of dicts?

Right, thanks. Although I suppose in theory one could go deep -- take the deepcopy code and instead of an exact copy replace any unordered types with their sorted copies.

mark.

From steve at pearwood.info  Fri Mar 2 04:48:03 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 02 Mar 2012 14:48:03 +1100
Subject: [Python-ideas] doctest
In-Reply-To:
References:
Message-ID: <4F5042F3.8000507@pearwood.info>

Mark Janssen wrote:
> I just thought of something: isn't the obvious solution for doctest to test the type of an expression's output and if it's in the set of unordered types {set, dict} to run sort() on it? Then the docstring author can just put the (sorted) output of what's expected....

{set, dict} is not the set of unordered types. The set of unordered types is without bound: anyone can create their own unordered types. Even if you limit yourself to the builtins, you forgot frozenset. And then there are non-builtins in the standard library, like OrderedDict, other types like dict_proxy. Sorting the output of an OrderedDict is the wrong thing to do, because the order is significant. So doctest would need to not just recognise mappings and sets, and sort them, but *also* recognise mappings and sets which should *not* be sorted.

Remember too, that by the time doctest.OutputChecker sees the output, it only sees it as a string. I don't know how much work it would take to introduce actual type-checks into doctest, but I expect it would be a lot.

And one last problem for you to consider: what happens if the output is unsortable? Try this dict for size:

{ 2+1j: None, 2-1j: None, float('NAN'): None}

> Perhaps I'm hashing a dead horse, but I really would like to see this added to the issue tracker as a requested feature. I may code up the patch myself, but it helps my brain to have it "on the dev stack".

Feel free to add it.
For what it's worth, I am a very strong -1 on any suggestion to give doctest "Do What I Mean" powers when it comes to unordered objects. But I would support a "Do What I Say" doctest directive, like NORMALIZE_WHITESPACE, ELLIPSIS, IGNORE_EXCEPTION_DETAIL, e.g. a directive that tells doctest to split both the expected and actual output strings on whitespace, then lexicographically sort them before comparing.

This approach doesn't try to be too clever: it's a dumb, understandable test which should fit in nicely with the other tests in doctest.OutputChecker.check_output, perhaps something like this:

if optionflags & IGNORE_WORD_ORDER:
    if sorted(got.split()) == sorted(want.split()):
        return True

It won't solve every doctest ordering problem, but doctest has other heuristics which can be fooled too. It is nice and simple, it solves the first 90% of the problem, and it is under the control of the coder. If you feel like submitting a patch, feel free to use my idea.

-- 
Steven

From ianb at colorstudy.com  Fri Mar 2 04:51:06 2012
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 1 Mar 2012 21:51:06 -0600
Subject: [Python-ideas] doctest
In-Reply-To:
References:
Message-ID:

On Thu, Mar 1, 2012 at 8:49 PM, Devin Jeanpierre wrote:
> On Thu, Mar 1, 2012 at 9:31 PM, Mark Janssen wrote:
> > I just thought of something: isn't the obvious solution for doctest to test the type of an expression's output and if it's in the set of unordered types {set, dict} to run sort() on it? Then the docstring author can just put the (sorted) output of what's expected....
>
> It's the solution people seem to think of first, so it's definitely fairly obvious. It's not a big improvement, though. The original problem is that dict order shouldn't matter, but in doctest it does, making dicts unusable in the normal doctest style. Making a specific dict order the prominent one lets you use dicts in doctest, but you have to sort the dicts and rewrite the doctest to take that into account.

Personally I never copy from the interactive prompt, but instead write my doctest and, if I'm in the mood to copy and paste, copy from the failed doctest error message (which is nicely indented just like my tests). So it would work fine if things were sorted.

> And even so, some dicts cannot be represented this way. For example, the dict {1: 2, "hello": "world"} cannot be sorted in Python 3, so it won't work in this scheme.

You could always use a heuristic sorting, e.g., sorted((str(key), key) for key in dict)

To make this work you have to write a repr() replacement that is somewhat sophisticated. Though it still wouldn't save you from:

class Something:
    def __repr__(self):
        return '<Something %r>' % (self.attr)

where attr is a dict or some other object with an awkward repr. That's the part I'm unsure of. Of course no eval will help you there either. I don't know if there's any way to really replace repr's implementation everywhere; I'm guessing there isn't.

Ian

From jeanpierreda at gmail.com  Fri Mar 2 05:02:02 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Thu, 1 Mar 2012 23:02:02 -0500
Subject: [Python-ideas] doctest
In-Reply-To:
References:
Message-ID:

On Thu, Mar 1, 2012 at 10:51 PM, Ian Bicking wrote:
> Personally I never copy from the interactive prompt, but instead write my doctest and, if I'm in the mood to copy and paste, copy from the failed doctest error message (which is nicely indented just like my tests).
> So it would work fine if things were sorted.

I both copy from the interactive prompt and write myself by hand. Either way, I don't want to care about order where order doesn't matter. The less thought required to write a test or example, the easier it is to write a test or example, and therefore the more tests/examples people will write. Why should we ever care about the order of a dict when writing an example or test case?

> where attr is a dict or some other object with an awkward repr. That's the part I'm unsure of. Of course no eval will help you there either. I don't know if there's any way to really replace repr's implementation everywhere; I'm guessing there isn't.

There isn't any (non-insane?) way to replace the repr that %r uses (which is not builtins.repr).

-- Devin

From steve at pearwood.info  Fri Mar 2 06:36:57 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 02 Mar 2012 16:36:57 +1100
Subject: [Python-ideas] doctest
In-Reply-To: <4F5042F3.8000507@pearwood.info>
References: <4F5042F3.8000507@pearwood.info>
Message-ID: <4F505C79.4060501@pearwood.info>

Steven D'Aprano wrote:
> This approach doesn't try to be too clever: it's a dumb, understandable test which should fit in nicely with the other tests in doctest.OutputChecker.check_output, perhaps something like this:
>
> if optionflags & IGNORE_WORD_ORDER:
>     if sorted(got.split()) == sorted(want.split()):
>         return True

Ah, buggarit, too simple. I neglected to take into account the delimiters.

Getting this right is harder than I thought, particularly with nested sets/dicts.

Still, I reckon a directive is the right approach.

-- 
Steven

From aquavitae69 at gmail.com  Fri Mar 2 06:56:55 2012
From: aquavitae69 at gmail.com (David Townshend)
Date: Fri, 2 Mar 2012 07:56:55 +0200
Subject: [Python-ideas] doctest
In-Reply-To: <4F505C79.4060501@pearwood.info>
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info>
Message-ID:

It seems that the problem with any solution based on interpreting repr (especially when nothing is known about the object) is that there are just too many exceptions. Another approach might be to allow for a custom compare function to be defined on doctest. E.g., in the module to be tested:

import doctest

def _compare(got, expected):
    return (sorted(eval(got)) == sorted(eval(expected)) or
        doctest.compare(got, expected))

doctest.usercompare = _compare

The compare function would only need to deal with the idiosyncrasies of types actually used in doctests in that module. I don't know how practical this idea is in terms of implementation - from my brief look through the code I think it should be fairly easy to slot this into OutputChecker, but it was a very brief look!

David

On Fri, Mar 2, 2012 at 7:36 AM, Steven D'Aprano wrote:
> Steven D'Aprano wrote:
>> This approach doesn't try to be too clever: it's a dumb, understandable test which should fit in nicely with the other tests in doctest.OutputChecker.check_output, perhaps something like this:
>>
>> if optionflags & IGNORE_WORD_ORDER:
>>     if sorted(got.split()) == sorted(want.split()):
>>         return True
>
> Ah, buggarit, too simple. I neglected to take into account the delimiters.
>
> Getting this right is harder than I thought, particularly with nested sets/dicts.
>
> Still, I reckon a directive is the right approach.
From jeanpierreda at gmail.com  Fri Mar 2 07:55:05 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Fri, 2 Mar 2012 01:55:05 -0500
Subject: [Python-ideas] doctest
In-Reply-To:
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info>
Message-ID:

> On Fri, Mar 2, 2012 at 7:36 AM, Steven D'Aprano wrote:
>> Still, I reckon a directive is the right approach.

Why? That's how I do it because I am/was paranoid about compatibility, but surely fixing dicts is important enough that, done right, nobody would object if the semantics of comparison change subtly to allow for unordered container comparison in the "natural" way?

(Is replying to a quoted quote acceptable on mailing lists?)

On Fri, Mar 2, 2012 at 12:56 AM, David Townshend wrote:
> It seems that the problem with any solution based on interpreting repr (especially when nothing is known about the object) is that there are just too many exceptions. Another approach might be to allow for a custom compare function to be defined on doctest. E.g., in the module to be tested:

The definition/use of an alternate comparison function needs to be inside the doctests. Two issues:

Suppose we're running the doctests on module A, which defines a different compare function. Module B also defines a different comparison function, and is imported but not run as a doctest. Since both of them did a global set-attribute to set the comparison function, but B did it later, B wins and A's doctests are run under the rules for module B.

Also, don't forget that doctests are quite often run in things that are _not_ python source files. In particular, tutorial-like documentation these days is frequently in the form of reStructuredText.

> import doctest
>
> def _compare(got, expected):
>     return (sorted(eval(got)) == sorted(eval(expected)) or
>         doctest.compare(got, expected))
>
> doctest.usercompare = _compare

This function is wrong in the context of the above discussion. Why sort a dict or set? Worse, why sort a list or tuple?

> The compare function would only need to deal with the idiosyncrasies of types actually used in doctests in that module.

Punting it to a user-defined function is nice for _really_ crazy situations, but dicts and sets are not idiosyncratic or in any way exceptional. doctest itself should handle them the way a naive user would expect.

-- Devin

From merwok at netwok.org  Fri Mar 2 08:01:48 2012
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Fri, 02 Mar 2012 08:01:48 +0100
Subject: [Python-ideas] doctest
In-Reply-To:
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info>
Message-ID: <4F50705C.9090801@netwok.org>

On 02/03/2012 07:55, Devin Jeanpierre wrote:
> [...] dicts and sets are not idiosyncratic or in any way exceptional. doctest itself should handle them the way a naive user would expect.

This discussion seems to forget a core issue with doctest: The output lines can be *anything* that gets printed. eval-able reprs of Python objects are only a part of the possibilities. That's why doctest cannot "just call sorted" on the output lines.

Regards

From greg at krypto.org  Fri Mar 2 08:52:05 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Thu, 1 Mar 2012 23:52:05 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
Message-ID:

Something I'd like to bring up at the language summit next week if we have time (suggestion: limit it to 20 minutes); let's start discussion now:

1) Should *all* Python base types support keyword arguments on all of their methods?
2) Should *all* stdlib extension modules support keyword arguments on all of their functions and methods?

Assumptions: any newly supported keyword arguments would have meaningful names (and the documentation updated to reflect their name).

How should "*all*" be defined in each of the above? One example against *all* is trivial single-argument functions such as chr(). One example for is the str.find method's start parameter being more obvious if named.

An example of a rule of thumb that could be proposed:
* All optional arguments should also be accepted as keywords.
* Any public function or method that requires multiple arguments should accept them as keywords.

Why propose this? For consistency and to promote readable code. A complaint reasonably often heard from people new to Python is that some things work as keyword arguments and some don't. Pure Python code can't (easily, without gross hacks) be written to refuse to accept keyword arguments; the standard types and library should not be different.

Documentation and docstrings have at times been inconsistent in the past with respect to what they call arguments and would be cleaned up as part of this to match the actual keyword argument accepted (that is worth doing on its own even without changing what the code accepts).

A decision with discussion notes on this may be PEP worthy.

for reference - this is http://bugs.python.org/issue8706

-gps

From steve at pearwood.info  Fri Mar 2 09:43:05 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 02 Mar 2012 19:43:05 +1100
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References:
Message-ID: <4F508819.7000809@pearwood.info>

Gregory P. Smith wrote:
> Something I'd like to bring up at the language summit next week if we have time (suggestion: limit it to 20 minutes); let's start discussion now:
>
> 1) Should *all* Python base types support keyword arguments on all of their methods?
> 2) Should *all* stdlib extension modules support keyword arguments on all of their functions and methods?

+1 on adding keyword arguments to built-in methods and functions where they would help readability, e.g. str.find(c, start=23), even if this happens in an ad-hoc fashion.

+0 on forcing *all* built-in methods and functions to be updated to take keyword arguments out of a sense of purity, e.g. ord(char='c').

I think that "all built-ins should take keywords, so as to minimise the difference between them and pure-Python functions" is an admirable ideal. But it is an ideal, a nice-to-have rather than a must-have.

-- 
Steven

From aquavitae69 at gmail.com  Fri Mar 2 12:07:16 2012
From: aquavitae69 at gmail.com (David Townshend)
Date: Fri, 2 Mar 2012 13:07:16 +0200
Subject: [Python-ideas] doctest
In-Reply-To:
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info>
Message-ID:

On Mar 2, 2012 8:55 AM, "Devin Jeanpierre" wrote:
>
> > On Fri, Mar 2, 2012 at 7:36 AM, Steven D'Aprano wrote:
> >> Still, I reckon a directive is the right approach.
>
> Why?
> That's how I do it because I am/was paranoid about compatibility, but surely fixing dicts is important enough that, done right, nobody would object if the semantics of comparison change subtly to allow for unordered container comparison in the "natural" way?
>
> (Is replying to a quoted quote acceptable on mailing lists?)
>
> On Fri, Mar 2, 2012 at 12:56 AM, David Townshend wrote:
> > It seems that the problem with any solution based on interpreting repr (especially when nothing is known about the object) is that there are just too many exceptions. Another approach might be to allow for a custom compare function to be defined on doctest. E.g., in the module to be tested:
>
> The definition/use of an alternate comparison function needs to be inside the doctests. Two issues:
>
> Suppose we're running the doctests on module A, which defines a different compare function. Module B also defines a different comparison function, and is imported but not run as a doctest. Since both of them did a global set-attribute to set the comparison function, but B did it later, B wins and A's doctests are run under the rules for module B.

That was just a quick example of another approach to the problem. Sure, there are some issues to work out, but I don't believe this is an insurmountable problem.

> Also, don't forget that doctests are quite often run in things that are _not_ python source files. In particular, tutorial-like documentation these days is frequently in the form of reStructuredText.

Once again, I'm sure we could find a way around this. Perhaps it would also be acceptable to define the compare function inside the docstring, or in this case inside a rst comment.

> > import doctest
> >
> > def _compare(got, expected):
> >     return (sorted(eval(got)) == sorted(eval(expected)) or
> >         doctest.compare(got, expected))
> >
> > doctest.usercompare = _compare
>
> This function is wrong in the context of the above discussion. Why sort a dict or set? Worse, why sort a list or tuple?

Like I said, this is just a quick example. Obviously the function body could look very different.

> > The compare function would only need to deal with the idiosyncrasies of types actually used in doctests in that module.
>
> Punting it to a user-defined function is nice for _really_ crazy situations, but dicts and sets are not idiosyncratic or in any way exceptional. doctest itself should handle them the way a naive user would expect.

So what about, say, a defaultdict or a WeakSet? What exactly would a naive user expect?

> -- Devin

From ncoghlan at gmail.com  Fri Mar 2 13:13:13 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 2 Mar 2012 22:13:13 +1000
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To: <4F508819.7000809@pearwood.info>
References: <4F508819.7000809@pearwood.info>
Message-ID:

On Fri, Mar 2, 2012 at 6:43 PM, Steven D'Aprano wrote:
> +1 on adding keyword arguments to built-in methods and functions where they would help readability, e.g. str.find(c, start=23), even if this happens in an ad-hoc fashion.

Indeed, this is the approach we have taken to date. For example, str.split() recently gained keyword support for 3.3 because "text.split(maxsplit=1)" is less cryptic than "text.split(None, 1)".
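For instance (an interactive session using the new 3.3 spelling):

>>> "a b c".split(None, 1)      # what does the 1 mean here?
['a', 'b c']
>>> "a b c".split(maxsplit=1)   # same call, self-describing
['a', 'b c']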
It makes the most sense when at least one of the following holds:
- the second argument accepts a number that is unclear if you're not familiar with the full function signature
- the earlier arguments have sensible default values that you'd prefer not to override

So +1 on declaring "make X support keyword arguments" non-controversial for multi-argument functions, +0 on also doing so for single argument functions, but -0 on attempting to boil the ocean and fix them wholesale.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From jeanpierreda at gmail.com  Fri Mar 2 13:56:33 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Fri, 2 Mar 2012 07:56:33 -0500
Subject: [Python-ideas] doctest
In-Reply-To:
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info>
Message-ID:

On Fri, Mar 2, 2012 at 6:07 AM, David Townshend wrote:
> That was just a quick example of another approach to the problem. Sure, there are some issues to work out, but I don't believe this is an insurmountable problem.

Nor do I. I was attempting to offer constructive criticism on the basis that this is a serious suggestion, and deserves attention. Sorry that I gave the wrong impression.

> Once again, I'm sure we could find a way around this. Perhaps it would also be acceptable to define the compare function inside the docstring, or in this case inside a rst comment.

I mentioned it earlier, but I think you missed it: I was actually thinking inside the doctest itself. Your system of global assignment works as-is if you do it inside the doctests themselves (except for threads, but who runs doctests in multiple threads? gah!)

> So what about, say, a defaultdict or a WeakSet? What exactly would a naive user expect?

WeakSets shouldn't be tested like this, their contents are nondeterministic. Any expectations can be broken by unfortunate race conditions, and there is no way around this. Sometimes users might expect the impossible, but what they expect is irrelevant in such a case. So forget WeakSets.

I think that a naive user would expect, w.r.t. all these things, that if he copy-pasted a shell session, it would "just work". Where we can fix doctest to align with such lofty expectations, at least in the common cases -- such as with dicts and defaultdicts and so on -- is it really so bad?

Or, for a different argument -- surely, if the natural way to write an example in a tutorial is to show a dict as a return value, then we should be able to test that with minimal fuss? Doctest is supposed to serve the documentation, not the other way around; and the harder things are, the higher the barrier to entry is, and the fewer people actually do it. Because testing is important, it's important that testing be as easy as reasonable and possible.

-- Devin

From jeanpierreda at gmail.com  Fri Mar 2 14:05:05 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Fri, 2 Mar 2012 08:05:05 -0500
Subject: [Python-ideas] doctest
In-Reply-To: <4F50705C.9090801@netwok.org>
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info> <4F50705C.9090801@netwok.org>
Message-ID:

On Fri, Mar 2, 2012 at 2:01 AM, Éric Araujo wrote:
> On 02/03/2012 07:55, Devin Jeanpierre wrote:
>> [...] dicts and sets are not idiosyncratic or in any way exceptional. doctest itself should handle them the way a naive user would expect.
>
> This discussion seems to forget a core issue with doctest: The output lines can be *anything* that gets printed.
> eval-able reprs of Python objects are only a part of the possibilities. That's why doctest cannot "just call sorted" on the output lines.

Ah, well, it sort of can. You can treat eval-able reprs of python objects specially. But it gets messy. Simple treatments make the treatment of e.g. dicts irreparably inconsistent. In doctest 2::

    This is totally fine
        >>> {-1:1, -2:1} # doctest: +LITERAL_EVAL
        {-1:1, -2:1}

    So is this (because doctest2 doesn't care that it was _printed_, just that it was output):
        >>> print({-1:1, -2:1}) # doctest: +LITERAL_EVAL
        {-1:1, -2:1}

    This is not; don't do it bad dog bad bad bad dog (doctest2 has no idea how to separate the repr from the printed text):
        >>> print(".", {-1:1, -2:1}) # doctest: +LITERAL_EVAL
        . {-1:1, -2:1}

I think maybe this behaviour is a little surprising, or at least a little dumb. The solution I've had in mind is to only do object comparison if the thing in the interpreter window is an expression, rather than a statement. In such a scheme, only the first example would be safe. But aside from not being a very useful distinction anyway, this doesn't agree with the Python interpreter: displaying the last evaluated expression even happens inside a statement. ">>> a;" is totally fine and will display the repr of a. In fact:

>>> for d in [{}, {-1:1}, {-2:1}, {-1:1, -2:1}]: d
...
{}
{-1: 1}
{-2: 1}
{-2: 1, -1: 1}

I can't really think of a way that makes sense and works everywhere, except specifically marking up doctests with things like, "this is a repr'd string; compare by object equality", and "this is a printed string", and so on. That is no small change, but it's tempting. Of course Sphinx would need to be able to turn this into a viewable example. My only worry is that nobody would use it because it's dumb or something, and it's hard to make it not dumb.

>>> for d in [{}, {-1:1}, {-2:1}, {-1:1, -2:1}]: # doctest: +schematic
...     print("---"); d
...
---
{}
---
{-1: 1}
---
{-2: 1}
---
{-2: 1, -1: 1}

(Except seriously.)

-- Devin

From mikegraham at gmail.com  Fri Mar 2 15:49:19 2012
From: mikegraham at gmail.com (Mike Graham)
Date: Fri, 2 Mar 2012 09:49:19 -0500
Subject: [Python-ideas] doctest
In-Reply-To:
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info> <4F50705C.9090801@netwok.org>
Message-ID:

On Fri, Mar 2, 2012 at 8:05 AM, Devin Jeanpierre wrote:
>     This is totally fine
>         >>> {-1:1, -2:1} # doctest: +LITERAL_EVAL
>         {-1:1, -2:1}

Doing an ast.literal_eval seems like a great feature (and maybe even a sensible default after a string comparison fails). Having a real eval as an explicit doctest option would also make doctest more powerful.

Mike

From steve at pearwood.info  Fri Mar 2 15:57:52 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 03 Mar 2012 01:57:52 +1100
Subject: [Python-ideas] doctest
In-Reply-To:
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info>
Message-ID: <4F50DFF0.8040400@pearwood.info>

Devin Jeanpierre wrote:
>> On Fri, Mar 2, 2012 at 7:36 AM, Steven D'Aprano wrote:
>>> Still, I reckon a directive is the right approach.
>
> Why? That's how I do it because I am/was paranoid about compatibility, but surely fixing dicts is important enough that, done right, nobody would object if the semantics of comparison change subtly to allow for unordered container comparison in the "natural" way?

I would.

doctest doesn't compare dicts. If you want to compare dicts, write your doctest like this:

>>> d = make_dict(...)
>>> d == {'a': 42, 'b': 23, 'c': None}
True

The dicts will be compared directly, and doctest need only care that the result looks like "True".

doctest compares *strings*. Under no circumstances do I want the default doctest comparison to try to be "clever" by guessing when I want strings to match using string equality and when I want strings to match using something else. doctest should Do What I Say, and not try to Do What It Guesses I Mean.

[...]
> Punting it to a user-defined function is nice for _really_ crazy situations, but dicts and sets are not idiosyncratic or in any way exceptional. doctest itself should handle them the way a naive user would expect.

No it shouldn't. doctest should handle them the way they actually are. A naive user might expect that {'None': 1} == {None: 1}. A naive user might expect that {2: 'spam'} == {2: 'SPAM'}. The problem with trying to satisfy naive users is that their expectations are often wrong or poorly thought-out. You, the author of the package being tested, are the best person to judge what your tests should be, not some arbitrary naive user who may know nothing about Python and even less about your package.

-- 
Steven

From jeanpierreda at gmail.com  Fri Mar 2 17:18:20 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Fri, 2 Mar 2012 11:18:20 -0500
Subject: [Python-ideas] doctest
In-Reply-To: <4F50DFF0.8040400@pearwood.info>
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info> <4F50DFF0.8040400@pearwood.info>
Message-ID:

On Fri, Mar 2, 2012 at 9:57 AM, Steven D'Aprano wrote:
> doctest doesn't compare dicts. If you want to compare dicts, write your doctest like this:
>
>>>> d = make_dict(...)
>>>> d == {'a': 42, 'b': 23, 'c': None}
> True

Of course doctest doesn't compare dicts; if it did, nobody would be objecting to its lack of unordered dict comparison. That aside, your example is not how people do it in the shell, so why should I do it that way in my documentation? Just because doctest makes me? Pff. My preferred solution is to change doctest to compare dicts. :)

> doctest compares *strings*. Under no circumstances do I want the default doctest comparison to try to be "clever" by guessing when I want strings to match using string equality and when I want strings to match using something else.

You keep saying what doctest does now, as if that should affect what it does in the future. :/

By the way, doctest doesn't do that now. ;) With regards to exception handling, doctest doesn't compare traceback strings to traceback strings, it compares an exception object to a (badly) parsed traceback. doctest isn't string comparison everywhere, just most places. (Of course, it does the comparison by doing a string comparison on the exception message.)

As it happens, as a result, doctest exceptions are very hard to screw up (except for SyntaxErrors). The biggest benefit is that you can copy-paste a traceback, and doctest doesn't care when the stack frames differ in details (like line numbers, for example). Or you can use "..." without enabling ELLIPSIS ;) Not to mention that it lets you use two different forms of exception header that come up in different versions of Python, and it still works in other versions of Python without problems. So many benefits from doctest not trying to be a strict "do what I say" string comparison! :p

>> Punting it to a user-defined function is nice for _really_ crazy situations, but dicts and sets are not idiosyncratic or in any way exceptional.
>> doctest itself should handle them the way a naive user would expect.
>
> No it shouldn't. doctest should handle them the way they actually are.

Yeah, that's what I think. Except I think that they "actually are" dicts, and you think they're strings. Your opinion doesn't make sense to me. They are only strings because that's what doctest turned the dicts into for convenience. There is no reason in particular that it has to ever turn them into strings at all -- the only thing making alternatives inconvenient is the syntax for specifying doctests, not the internal mechanisms of doctest itself. There's nothing holy about these string comparisons. They are only a means to an end.

Also, could you give something more concrete about why you believe everything must be based on strings? I couldn't find any reasoning to that effect in your post.

Also keep in mind that I'm not fond of literal_eval-ing _both_ sides, I'd much rather only the doctest be eval'd. (In case that affects your answer any.)

-- Devin

From aquavitae69 at gmail.com  Fri Mar 2 18:17:16 2012
From: aquavitae69 at gmail.com (David Townshend)
Date: Fri, 2 Mar 2012 19:17:16 +0200
Subject: [Python-ideas] doctest
In-Reply-To:
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info>
Message-ID:

> I mentioned it earlier, but I think you missed it: I was actually thinking inside the doctest itself. Your system of global assignment works as-is if you do it inside the doctests themselves (except for threads, but who runs doctests in multiple threads? gah!)

My only concern with that is muddying the documentation with long instructions on how to run the doctests. I'm not saying it's not the best way though, because I do think the global assignments are a problem which would need to be resolved. Another way, though, would be to store compare functions by the module they were defined in and assign them to the tests accordingly. This might end up quite complicated though.

> I think that a naive user would expect, w.r.t. all these things, that if he copy-pasted a shell session, it would "just work". Where we can fix doctest to align with such lofty expectations, at least in the common cases -- such as with dicts and defaultdicts and so on -- is it really so bad?
To those who have tried and thought about it, the problems are *not* easy
to solve, except for some superficial edge cases that you and other
critics of doctest keep focusing on.

And please don't propose that we change the behavior of dict or other data
types themselves, or add new APIs to objects just for the purpose of
"fixing" doctest's issues.

--
--Guido van Rossum (python.org/~guido)

From jeanpierreda at gmail.com  Fri Mar 2 18:59:43 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Fri, 2 Mar 2012 12:59:43 -0500
Subject: [Python-ideas] doctest
In-Reply-To:
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info>
	<4F50DFF0.8040400@pearwood.info>
Message-ID:

On Fri, Mar 2, 2012 at 12:30 PM, Guido van Rossum wrote:
> Devin,
>
> You need to start writing real code rather than continue to tell us
> that the problems are minor and easily fixable, and the solutions are
> uncontroversial. To those who have tried and thought about it, the
> problems are *not* easy to solve, except for some superficial edge
> cases that you and other critics of doctest keep focusing on.

I already did write real code. In the context of this discussion, I
implemented a +LITERAL_EVAL flag.

Was there something else I was supposed to write, other than the solution
I advocated? ;)

https://bitbucket.org/devin.jeanpierre/doctest2/src/e084a682ccbc/doctest2/constants.py#cl-122

> And please don't propose that we change the behavior of dict or other
> data types themselves, or add new APIs to objects just for the purpose
> of "fixing" doctest's issues.

I would never dream of it. That's pretty obscene.

-- Devin

From guido at python.org  Fri Mar 2 19:14:41 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 2 Mar 2012 10:14:41 -0800
Subject: [Python-ideas] doctest
In-Reply-To:
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info>
	<4F50DFF0.8040400@pearwood.info>
Message-ID:

On Fri, Mar 2, 2012 at 9:59 AM, Devin Jeanpierre wrote:
> On Fri, Mar 2, 2012 at 12:30 PM, Guido van Rossum wrote:
>> You need to start writing real code rather than continue to tell us
>> that the problems are minor and easily fixable, and the solutions are
>> uncontroversial. To those who have tried and thought about it, the
>> problems are *not* easy to solve, except for some superficial edge
>> cases that you and other critics of doctest keep focusing on.
>
> I already did write real code. In the context of this discussion, I
> implemented a +LITERAL_EVAL flag.
>
> Was there something else I was supposed to write, other than the
> solution I advocated? ;)
>
> https://bitbucket.org/devin.jeanpierre/doctest2/src/e084a682ccbc/doctest2/constants.py#cl-122

It's not a solution. It's a hack that only works in the simplest cases --
it requires the output to look like a Python expression (that can be
evaluated in a limited environment). What if the output were something
like

??? There's a dict in there but the whole thing is not parseable.

>> And please don't propose that we change the behavior of dict or other
>> data types themselves, or add new APIs to objects just for the purpose
>> of "fixing" doctest's issues.
>
> I would never dream of it. That's pretty obscene.

Good. I wasn't sure what you meant when you used the phrase "fix dict" --
I presume that was shorthand for "fix the problem that doctest has with
dict".

--
--Guido van Rossum (python.org/~guido)
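To make the tradeoff concrete: below is a minimal sketch of what a
LITERAL_EVAL-style comparison has to do. This is illustrative code only,
not the doctest2 implementation, and it evaluates *both* sides, which
Devin says above he would rather avoid:

    import ast

    def outputs_match(want, got):
        # If both the expected and the actual output parse as Python
        # literals, compare the parsed values, so dict/set ordering no
        # longer matters; otherwise fall back to strict string equality.
        try:
            return ast.literal_eval(want) == ast.literal_eval(got)
        except (SyntaxError, ValueError):
            return want == got

    assert outputs_match("{'a': 42, 'b': 23}", "{'b': 23, 'a': 42}")
    assert not outputs_match("spam\n", "eggs\n")

Anything that does not parse as a literal -- including the mixed repr
Guido describes -- falls back to exact string matching, which is precisely
the limitation he points out.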
From jeanpierreda at gmail.com  Fri Mar 2 19:24:21 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Fri, 2 Mar 2012 13:24:21 -0500
Subject: [Python-ideas] doctest
In-Reply-To:
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info>
	<4F50DFF0.8040400@pearwood.info>
Message-ID:

On Fri, Mar 2, 2012 at 1:14 PM, Guido van Rossum wrote:
>> https://bitbucket.org/devin.jeanpierre/doctest2/src/e084a682ccbc/doctest2/constants.py#cl-122
>
> It's not a solution. It's a hack that only works in the simplest cases

The simplest cases are also the most common, IME. But yes, I'd like to
expand it to work in a few more cases and be less insane (I'd rather not
interpret printed output as Python code).

> -- it requires the output to look like a Python expression (that can
> be evaluated in a limited environment). What if the output were
> something like
>
> ??? There's a dict in there but the whole thing is not parseable.

Yeah. I don't know of any solution to that either. Even if pure-Python
code generates that repr, it isn't possible to even replace the repr with
a sorted dict-repr in any sane way, because %r doesn't defer to the repr()
builtin. It's just an intractable case. Nothing at all mentioned in this
thread would work there.

And I never called it easy, by the way.

-- Devin

From mwm at mired.org  Fri Mar 2 19:27:54 2012
From: mwm at mired.org (Mike Meyer)
Date: Fri, 2 Mar 2012 13:27:54 -0500
Subject: [Python-ideas] doctest
In-Reply-To:
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info>
	<4F50DFF0.8040400@pearwood.info>
Message-ID: <20120302132754.5dfcad6e@bhuda.mired.org>

Can I ask a possibly silly question?

As I understand it, doctest takes a small snippet of code, runs it, and
compares the resulting string with a string in the document.

This thread seems to be centered around making comparisons of results with
indeterminate ordering (dicts being the prime example) work properly. In
fact, one proposal was to have doctest call sorted on the output to make
sure it's right, which was shot down because that's not always the correct
thing to do.

So the question is - why isn't dealing with this the responsibility of the
test writer? Yeah, it's not quite the spirit of documentation to turn a
dictionary into a sorted list in the output, but neither is littering the
documentation with +LITERAL_EVAL and the like.

http://www.mired.org/
Independent Software developer/SCM consultant, email for more information.
O< ascii ribbon campaign - stop html mail - www.asciiribbon.org

From jeanpierreda at gmail.com  Fri Mar 2 19:35:34 2012
From: jeanpierreda at gmail.com (Devin Jeanpierre)
Date: Fri, 2 Mar 2012 13:35:34 -0500
Subject: [Python-ideas] doctest
In-Reply-To: <20120302132754.5dfcad6e@bhuda.mired.org>
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info>
	<4F50DFF0.8040400@pearwood.info>
	<20120302132754.5dfcad6e@bhuda.mired.org>
Message-ID:

On Fri, Mar 2, 2012 at 1:27 PM, Mike Meyer wrote:
> This thread seems to be centered around making comparisons of results
> with indeterminate ordering (dicts being the prime example) work
> properly. In fact, one proposal was to have doctest call sorted on the
> output to make sure it's right, which was shot down because that's not
> always the correct thing to do.

Aahhh, that was me, and I didn't mean to shoot it down. The right
modification would have been to typecheck for dict/set before you sort,
and then format it like a string, IIRC.
But at the time the function just returned a sorted object.

In principle you can absolutely sort dict literals etc., but I don't think
it's any easier to implement than just parsing them into dict objects and
doing a direct dict comparison, so that's why I object to it. In addition,
it's harder for the test writer, who now has to pay attention to ordering.

> So the question is - why isn't dealing with this the responsibility of
> the test writer? Yeah, it's not quite the spirit of documentation to
> turn a dictionary into a sorted list in the output, but neither is
> littering the documentation with +LITERAL_EVAL and the like.

The thing about +LITERAL_EVAL and the other flags is that modern
doctest-displaying tools like Sphinx hide the comments, so that you just
see what looks like a regular interpreter session, without any # doctest:
directives. Because of this, it's in principle possible to have "natural
looking" things, in some cases.

But yes, the status quo is that, somehow, you have to handle this
explicitly yourself.

-- Devin

From mikegraham at gmail.com  Fri Mar 2 19:42:14 2012
From: mikegraham at gmail.com (Mike Graham)
Date: Fri, 2 Mar 2012 13:42:14 -0500
Subject: [Python-ideas] doctest
In-Reply-To: <20120302132754.5dfcad6e@bhuda.mired.org>
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info>
	<4F50DFF0.8040400@pearwood.info>
	<20120302132754.5dfcad6e@bhuda.mired.org>
Message-ID:

On Fri, Mar 2, 2012 at 1:27 PM, Mike Meyer wrote:
> So the question is - why isn't dealing with this the responsibility of
> the test writer? Yeah, it's not quite the spirit of documentation to
> turn a dictionary into a sorted list in the output, but neither is
> littering the documentation with +LITERAL_EVAL and the like.

Currently the test-writer is not empowered to exercise this responsibility
most effectively. Earlier, an example was presented that one should write

>>> d = make_dict(...)
>>> d == {'a': 42, 'b': 23, 'c': None}
True

rather than the implied

>>> d = make_dict(...)
{'a': 42, 'b': 23, 'c': None}

Using the former is certainly necessary today, but it's far from ideal.
The documenter is stuck either writing the documentation clearly (the
latter case) or testably (the former). The hope is to find a way to let
people more easily write documentation that is both clear and testable.

One thing that covers some--but far from all--cases is to ast.literal_eval
or eval the output, since very often that output line is a repr of a
Python expression. To be reasonable, this might require a directive in
both cases and certainly requires one in the latter case. Though there are
still tons of situations this cannot cover, it would allow people writing
documentation to avoid ugly constructs like the first code snippet in a
not-tiny set of cases.

As for littering your code with "+doctest BLAH_BLAH", I don't think this
is all that harmful. It allows the documentation writer to get features
she wants, and it will not display to the user in the processed
documentation. There are already directives like this today and though
ugly, they are conventional.

Mike
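The "typecheck for dict/set before you sort" idea Devin mentions can be
sketched as a normalization pass. This is hypothetical illustration code,
not part of doctest; a checker would normalize both the expected and the
actual output before comparing:

    import ast

    def normalize(output):
        # Re-render dicts and sets with sorted contents so that two
        # equal values always print the same; leave anything that isn't
        # a Python literal untouched.
        try:
            value = ast.literal_eval(output)
        except (SyntaxError, ValueError):
            return output
        return render(value)

    def render(value):
        if isinstance(value, dict):
            items = sorted(value.items(), key=repr)
            return '{%s}' % ', '.join('%s: %s' % (render(k), render(v))
                                      for k, v in items)
        if isinstance(value, (set, frozenset)):
            if not value:
                return 'set()'
            return '{%s}' % ', '.join(sorted(render(v) for v in value))
        return repr(value)

    assert normalize("{'b': 2, 'a': 1}") == normalize("{'a': 1, 'b': 2}")

This sidesteps ordering without giving up string comparison entirely,
though it inherits the same limitation as literal_eval: it only helps when
the whole output line parses as a literal.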
From guido at python.org  Fri Mar 2 20:28:01 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 2 Mar 2012 11:28:01 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info>
Message-ID:

On Fri, Mar 2, 2012 at 4:13 AM, Nick Coghlan wrote:
> On Fri, Mar 2, 2012 at 6:43 PM, Steven D'Aprano wrote:
>> +1 on adding keyword arguments to built-in methods and functions where
>> they would help readability, e.g str.find(c, start=23), even if this
>> happens in an ad-hoc fashion.
>
> Indeed, this is the approach we have taken to date. For example,
> str.split() recently gained keyword support for 3.3 because
> "text.split(maxsplit=1)" is less cryptic than "text.split(None, 1)".
>
> It makes the most sense when at least one of the following holds:
> - the second argument accepts a number that is unclear if you're not
> familiar with the full function signature
> - the earlier arguments have sensible default values that you'd prefer
> not to override
>
> So +1 on declaring "make X support keyword arguments" non-controversial
> for multi-argument functions, +0 on also doing so for single argument
> functions, but -0 on attempting to boil the ocean and fix them
> wholesale.

Hm. I think for many (most?) 1-arg and selected 2-arg functions (and
rarely 3+-arg functions) this would reduce readability, as the example of
ord(char=x) showed.

I would actually like to see a syntactic feature to state that an argument
*cannot* be given as a keyword argument (just as we already added syntax
to state that it *must* be a keyword).

One area where I think adding keyword args is outright wrong: methods of
built-in types or ABCs that are overridable. E.g. consider the pop()
method on dict. Since the argument name is currently undocumented, if
someone subclasses dict and overrides this method, or if they create
another mutable mapping class that tries to emulate dict using duck
typing, it doesn't matter what the argument name is -- all the callers
(expecting a dict, a dict subclass, or a dict-like duck) will be using
positional arguments in the call. But if we were to document the argument
names for pop(), and users started to use these, then most dict subclasses
and ducks would suddenly be broken (except if by luck they happened to
pick the same name).

--
--Guido van Rossum (python.org/~guido)

From arnodel at gmail.com  Fri Mar 2 20:42:58 2012
From: arnodel at gmail.com (Arnaud Delobelle)
Date: Fri, 2 Mar 2012 19:42:58 +0000
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info>
Message-ID:

On 2 March 2012 19:28, Guido van Rossum wrote:
> On Fri, Mar 2, 2012 at 4:13 AM, Nick Coghlan wrote:
>> On Fri, Mar 2, 2012 at 6:43 PM, Steven D'Aprano wrote:
>>> +1 on adding keyword arguments to built-in methods and functions where
>>> they would help readability, e.g str.find(c, start=23), even if this
>>> happens in an ad-hoc fashion.
>>
>> Indeed, this is the approach we have taken to date. For example,
>> str.split() recently gained keyword support for 3.3 because
>> "text.split(maxsplit=1)" is less cryptic than "text.split(None, 1)".
>>
>> It makes the most sense when at least one of the following holds:
>> - the second argument accepts a number that is unclear if you're not
>> familiar with the full function signature
>> - the earlier arguments have sensible default values that you'd prefer
>> not to override
>>
>> So +1 on declaring "make X support keyword arguments" non-controversial
>> for multi-argument functions, +0 on also doing so for single argument
>> functions, but -0 on attempting to boil the ocean and fix them
>> wholesale.
>
> Hm. I think for many (most?) 1-arg and selected 2-arg functions (and
> rarely 3+-arg functions) this would reduce readability, as the example
> of ord(char=x) showed.
>
> I would actually like to see a syntactic feature to state that an
> argument *cannot* be given as a keyword argument (just as we already
> added syntax to state that it *must* be a keyword).

There was a discussion about this on this list in 2007. I wrote some
decorators to implement this functionality. Here's one at

http://code.activestate.com/recipes/521874-functions-with-positional-only-arguments/?in=user-4059385

(note that it didn't attract a lot of attention!). The recipe also refers
to the original discussion.

--
Arnaud

From guido at python.org  Fri Mar 2 21:00:14 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 2 Mar 2012 12:00:14 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info>
Message-ID:

I've written such decorators too, but they've got quite a bit of
overhead...

--Guido van Rossum (sent from Android phone)

On Mar 2, 2012 11:43 AM, "Arnaud Delobelle" wrote:
> On 2 March 2012 19:28, Guido van Rossum wrote:
>> On Fri, Mar 2, 2012 at 4:13 AM, Nick Coghlan wrote:
>>> On Fri, Mar 2, 2012 at 6:43 PM, Steven D'Aprano wrote:
>>>> +1 on adding keyword arguments to built-in methods and functions
>>>> where they would help readability, e.g str.find(c, start=23), even
>>>> if this happens in an ad-hoc fashion.
>>>
>>> Indeed, this is the approach we have taken to date. For example,
>>> str.split() recently gained keyword support for 3.3 because
>>> "text.split(maxsplit=1)" is less cryptic than "text.split(None, 1)".
>>>
>>> It makes the most sense when at least one of the following holds:
>>> - the second argument accepts a number that is unclear if you're not
>>> familiar with the full function signature
>>> - the earlier arguments have sensible default values that you'd
>>> prefer not to override
>>>
>>> So +1 on declaring "make X support keyword arguments"
>>> non-controversial for multi-argument functions, +0 on also doing so
>>> for single argument functions, but -0 on attempting to boil the ocean
>>> and fix them wholesale.
>>
>> Hm. I think for many (most?) 1-arg and selected 2-arg functions (and
>> rarely 3+-arg functions) this would reduce readability, as the example
>> of ord(char=x) showed.
>>
>> I would actually like to see a syntactic feature to state that an
>> argument *cannot* be given as a keyword argument (just as we already
>> added syntax to state that it *must* be a keyword).
>
> There was a discussion about this on this list in 2007. I wrote some
> decorators to implement this functionality. Here's one at
>
> http://code.activestate.com/recipes/521874-functions-with-positional-only-arguments/?in=user-4059385
>
> (note that it didn't attract a lot of attention!). The recipe also
> refers to the original discussion.
>
> --
> Arnaud
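The overhead Guido refers to comes from inspecting keywords on every call.
Here is a call-time sketch of such a decorator (hypothetical code;
Arnaud's recipe instead rewrites co_varnames, so it pays no per-call
cost):

    import functools

    def positional_only(n):
        # Reject keyword use of the function's first n parameters,
        # checked at every call -- hence the runtime overhead.
        def decorate(func):
            names = func.__code__.co_varnames[:n]
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                for name in names:
                    if name in kwargs:
                        raise TypeError('%s() takes %r positionally only'
                                        % (func.__name__, name))
                return func(*args, **kwargs)
            return wrapper
        return decorate

    @positional_only(2)
    def find(text, sub, start=0):
        return text.find(sub, start)

    find('spam', 'pa')            # OK
    find('spam', 'pa', start=1)   # OK: start may still be a keyword
    # find('spam', sub='pa')      # TypeError: 'sub' is positional-only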
From ethan at stoneleaf.us  Fri Mar 2 21:32:21 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 02 Mar 2012 12:32:21 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info>
Message-ID: <4F512E55.6050305@stoneleaf.us>

Guido van Rossum wrote:
> I would actually like to see a syntactic feature to state that an
> argument *cannot* be given as a keyword argument (just as we already
> added syntax to state that it *must* be a keyword).

So something like:

    def ord(char, ?):

    def split(self, char, ?, count)

    def canary(breed, ?, color, wingspan, *, name)

~Ethan~

From arnodel at gmail.com  Fri Mar 2 22:09:59 2012
From: arnodel at gmail.com (Arnaud Delobelle)
Date: Fri, 2 Mar 2012 21:09:59 +0000
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info>
Message-ID:

On 2 March 2012 20:00, Guido van Rossum wrote:
> On Mar 2, 2012 11:43 AM, "Arnaud Delobelle" wrote:
>> On 2 March 2012 19:28, Guido van Rossum wrote:
>>> I would actually like to see a syntactic feature to state that an
>>> argument *cannot* be given as a keyword argument (just as we already
>>> added syntax to state that it *must* be a keyword).
>>
>> There was a discussion about this on this list in 2007. I wrote some
>> decorators to implement this functionality. Here's one at
>>
>> http://code.activestate.com/recipes/521874-functions-with-positional-only-arguments/?in=user-4059385
>>
>> (note that it didn't attract a lot of attention!). The recipe also
>> refers to the original discussion.
>
> I've written such decorators too, but they've got quite a bit of
> overhead...

The one in the above recipe (which is for 2.X) doesn't incur any runtime
overhead - although it is a bit hackish as it changes the 'co_varnames'
attribute of the function's code object.

--
Arnaud

From guido at python.org  Fri Mar 2 22:48:15 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 2 Mar 2012 13:48:15 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info>
Message-ID:

On Fri, Mar 2, 2012 at 1:09 PM, Arnaud Delobelle wrote:
> On 2 March 2012 20:00, Guido van Rossum wrote:
>> On Mar 2, 2012 11:43 AM, "Arnaud Delobelle" wrote:
>>> On 2 March 2012 19:28, Guido van Rossum wrote:
>>>> I would actually like to see a syntactic feature to state that an
>>>> argument *cannot* be given as a keyword argument (just as we already
>>>> added syntax to state that it *must* be a keyword).
>>>
>>> There was a discussion about this on this list in 2007. I wrote some
>>> decorators to implement this functionality. Here's one at
>>>
>>> http://code.activestate.com/recipes/521874-functions-with-positional-only-arguments/?in=user-4059385
>>>
>>> (note that it didn't attract a lot of attention!). The recipe also
>>> refers to the original discussion.
>>
>> I've written such decorators too, but they've got quite a bit of
>> overhead...
>
> The one in the above recipe (which is for 2.X) doesn't incur any
> runtime overhead - although it is a bit hackish as it changes the
> 'co_varnames' attribute of the function's code object.

So it's not written in Python -- it uses CPython-specific hacks.

--
--Guido van Rossum (python.org/~guido)

From greg at krypto.org  Fri Mar 2 23:01:36 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Fri, 2 Mar 2012 14:01:36 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info>
Message-ID:

On Fri, Mar 2, 2012 at 12:00 PM, Guido van Rossum wrote:
> I've written such decorators too, but they've got quite a bit of
> overhead...

yeah those fall into the gross hacks I alluded to in my original post. ;)

I intentionally decided to leave out discussion of "should we allow
positional-only arguments to be declared in Python" but it is a valid
discussion and thing to consider... if we go that route, could it be
possible to implement range([start=0, ] stop[, step=1]) such that they are
positional-only but multiple arguments are treated differently than
strictly sequential assignment, without writing conditional code in Python
to figure out each one's meaning at runtime?

speaking of range... I think start and stop are plenty obvious, but I'd
like to allow step to be specified as a keyword.

-gps

> --Guido van Rossum (sent from Android phone)
> On Mar 2, 2012 11:43 AM, "Arnaud Delobelle" wrote:
>> On 2 March 2012 19:28, Guido van Rossum wrote:
>>> On Fri, Mar 2, 2012 at 4:13 AM, Nick Coghlan wrote:
>>>> On Fri, Mar 2, 2012 at 6:43 PM, Steven D'Aprano wrote:
>>>>> +1 on adding keyword arguments to built-in methods and functions
>>>>> where they would help readability, e.g str.find(c, start=23), even
>>>>> if this happens in an ad-hoc fashion.
>>>>
>>>> Indeed, this is the approach we have taken to date. For example,
>>>> str.split() recently gained keyword support for 3.3 because
>>>> "text.split(maxsplit=1)" is less cryptic than "text.split(None, 1)".
>>>>
>>>> It makes the most sense when at least one of the following holds:
>>>> - the second argument accepts a number that is unclear if you're not
>>>> familiar with the full function signature
>>>> - the earlier arguments have sensible default values that you'd
>>>> prefer not to override
>>>>
>>>> So +1 on declaring "make X support keyword arguments"
>>>> non-controversial for multi-argument functions, +0 on also doing so
>>>> for single argument functions, but -0 on attempting to boil the
>>>> ocean and fix them wholesale.
>>>
>>> Hm. I think for many (most?) 1-arg and selected 2-arg functions (and
>>> rarely 3+-arg functions) this would reduce readability, as the
>>> example of ord(char=x) showed.
>>>
>>> I would actually like to see a syntactic feature to state that an
>>> argument *cannot* be given as a keyword argument (just as we already
>>> added syntax to state that it *must* be a keyword).
>>
>> There was a discussion about this on this list in 2007. I wrote some
>> decorators to implement this functionality. Here's one at
>>
>> http://code.activestate.com/recipes/521874-functions-with-positional-only-arguments/?in=user-4059385
>>
>> (note that it didn't attract a lot of attention!). The recipe also
>> refers to the original discussion.
>>
>> --
>> Arnaud
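The "conditional code" Gregory wants to avoid looks roughly like this
today -- a sketch of a pure-Python emulation, not CPython's actual
implementation:

    def my_range(*args):
        # The meaning of each positional argument depends on how many
        # were passed, so the emulation shuffles them by hand.
        if len(args) == 1:
            start, stop, step = 0, args[0], 1
        elif len(args) == 2:
            (start, stop), step = args, 1
        elif len(args) == 3:
            start, stop, step = args
        else:
            raise TypeError('my_range expected 1-3 arguments, got %d'
                            % len(args))
        while (step > 0 and start < stop) or (step < 0 and start > stop):
            yield start
            start += step

    assert list(my_range(5)) == [0, 1, 2, 3, 4]
    assert list(my_range(5, 27, 2)) == list(range(5, 27, 2))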
From guido at python.org  Fri Mar 2 23:18:58 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 2 Mar 2012 14:18:58 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info>
Message-ID:

On Fri, Mar 2, 2012 at 2:01 PM, Gregory P. Smith wrote:
>
> On Fri, Mar 2, 2012 at 12:00 PM, Guido van Rossum wrote:
>>
>> I've written such decorators too, but they've got quite a bit of
>> overhead...
>
> yeah those fall into the gross hacks I alluded to in my original
> post. ;)
>
> I intentionally decided to leave out discussion of "should we allow
> positional-only arguments to be declared in Python" but it is a valid
> discussion and thing to consider...

I just want to remain realistic and acknowledge that positional arguments
have their place.

> if we go that route, could it be possible to implement range([start=0, ]
> stop[, step=1]) such that they are positional-only but multiple
> arguments are treated differently than strictly sequential assignment,
> without writing conditional code in Python to figure out each one's
> meaning at runtime?

Eew, I don't think this pattern is useful enough to support in syntax,
even if one of the most popular builtins (but only one!) uses it.

> speaking of range... I think start and stop are plenty obvious, but I'd
> like to allow step to be specified as a keyword.

That's fine, range() is not overloadable anyway.

--
--Guido van Rossum (python.org/~guido)

From tjreedy at udel.edu  Fri Mar 2 23:23:59 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 02 Mar 2012 17:23:59 -0500
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To: <4F512E55.6050305@stoneleaf.us>
References: <4F508819.7000809@pearwood.info> <4F512E55.6050305@stoneleaf.us>
Message-ID:

On 3/2/2012 3:32 PM, Ethan Furman wrote:
> Guido van Rossum wrote:
>> I would actually like to see a syntactic feature to state that an
>> argument *cannot* be given as a keyword argument (just as we already
>> added syntax to state that it *must* be a keyword).

I think this is what we need. I see the problem as being that a) C and
Python functions work differently, and b) the doc does not -- and should
not -- specify the implementation. One solution is to make all C functions
work like Python functions. The other is to allow Python functions to work
like C functions. Given the reasonable opposition to the first, we need
the second.

> So something like:
>
>     def ord(char, ?):
>
>     def split(self, char, ?, count)
>
>     def canary(breed, ?, color, wingspan, *, name)

That is probably better than using '$' or directly tagging the names.

--
Terry Jan Reedy

From greg at krypto.org  Fri Mar 2 23:26:47 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Fri, 2 Mar 2012 14:26:47 -0800
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To:
References:
Message-ID:

On Wed, Feb 29, 2012 at 6:24 AM, Michael Foord wrote:
>
> On 29 February 2012 08:23, Nick Coghlan wrote:
>
>> One way to handle this case is to use a separate if statement to make
>> the flow control clear.
>>
>>     with cm() as run_body:
>>         if run_body:
>>             # Do stuff
>>
>> Depending on the use case, the return value from __enter__ may be a
>> simple flag as shown, or it may be a more complex object.
>
> The trouble with this is it indents all your code an extra level. One
> possibility would be allowing continue in a with statement as an early
> exit:
>
>     with cm() as run_body:
>         if not run_body:
>             continue

-1 on this as an early __exit__.
It would be context dependent. For with statements within a loop body, a
continue today continues to the next loop iteration. Introducing this
syntax would call into question what continue does... exit the with
statement within the loop body? or continue the loop (also exiting the
with statement but skipping all other code in the loop body)?

    for x in range(5):
        with y() as run_body:
            if not run_body:
                continue
            print x, run_body

Changing the existing continue semantics would break existing code, and
adding continue semantics to exit a context manager that care whether a
with is within a loop body or not seems very unwise and surprising.

-gps

From greg at krypto.org  Fri Mar 2 23:36:42 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Fri, 2 Mar 2012 14:36:42 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info>
Message-ID:

On Fri, Mar 2, 2012 at 2:18 PM, Guido van Rossum wrote:
> On Fri, Mar 2, 2012 at 2:01 PM, Gregory P. Smith wrote:
>>
>> On Fri, Mar 2, 2012 at 12:00 PM, Guido van Rossum wrote:
>>>
>>> I've written such decorators too, but they've got quite a bit of
>>> overhead...
>>
>> yeah those fall into the gross hacks I alluded to in my original
>> post. ;)
>>
>> I intentionally decided to leave out discussion of "should we allow
>> positional-only arguments to be declared in Python" but it is a valid
>> discussion and thing to consider...
>
> I just want to remain realistic and acknowledge that positional
> arguments have their place.

+1

>> if we go that route, could it be possible to implement range([start=0, ]
>> stop[, step=1]) such that they are positional-only but multiple
>> arguments are treated differently than strictly sequential assignment,
>> without writing conditional code in Python to figure out each one's
>> meaning at runtime?
>
> Eew, I don't think this pattern is useful enough to support in syntax,
> even if one of the most popular builtins (but only one!) uses it.

Technically more than one, if you consider slice() separate from
range()... but they are related so I'm willing to consider them "one" ;)

Anyways, agreed. Keeping it simple makes sense. Though the syntax
proposals so far aren't looking great to me. I need to stare at them
longer.

-gps

From ethan at stoneleaf.us  Fri Mar 2 23:49:08 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 02 Mar 2012 14:49:08 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info> <4F512E55.6050305@stoneleaf.us>
Message-ID: <4F514E64.2040406@stoneleaf.us>

Terry Reedy wrote:
> On 3/2/2012 3:32 PM, Ethan Furman wrote:
>> Guido van Rossum wrote:
>>> I would actually like to see a syntactic feature to state that an
>>> argument *cannot* be given as a keyword argument (just as we already
>>> added syntax to state that it *must* be a keyword).
>
> I think this is what we need. I see the problem as being that a) C and
> Python functions work differently, and b) the doc does not -- and should
> not -- specify the implementation. One solution is to make all C
> functions work like Python functions. The other is to allow Python
> functions to work like C functions. Given the reasonable opposition to
> the first, we need the second.
> >> So something like:
> >>
> >>     def ord(char, ?):
> >>
> >>     def split(self, char, ?, count)
> >>
> >>     def canary(breed, ?, color, wingspan, *, name)
>
> That is probably better than using '$' or directly tagging the names.

I chose '?' because it has some similarity to an incompletely-drawn 'p',
and also because it suggests a sort of vagueness, as in not being able to
specify the name of the argument.

I do not know if it is the best possible way, and am looking forward to
other ideas.

~Ethan~

From guido at python.org  Fri Mar 2 23:46:43 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 2 Mar 2012 14:46:43 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To: <4F514E64.2040406@stoneleaf.us>
References: <4F508819.7000809@pearwood.info> <4F512E55.6050305@stoneleaf.us>
	<4F514E64.2040406@stoneleaf.us>
Message-ID:

On Fri, Mar 2, 2012 at 2:49 PM, Ethan Furman wrote:
> Terry Reedy wrote:
>> On 3/2/2012 3:32 PM, Ethan Furman wrote:
>>> Guido van Rossum wrote:
>>>> I would actually like to see a syntactic feature to state that an
>>>> argument *cannot* be given as a keyword argument (just as we already
>>>> added syntax to state that it *must* be a keyword).
>>
>> I think this is what we need. I see the problem as being that a) C and
>> Python functions work differently, and b) the doc does not -- and
>> should not -- specify the implementation. One solution is to make all C
>> functions work like Python functions. The other is to allow Python
>> functions to work like C functions. Given the reasonable opposition to
>> the first, we need the second.
>>
>>> So something like:
>>>
>>>     def ord(char, ?):
>>>
>>>     def split(self, char, ?, count)
>>>
>>>     def canary(breed, ?, color, wingspan, *, name)
>>
>> That is probably better than using '$' or directly tagging the names.
>
> I chose '?' because it has some similarity to an incompletely-drawn 'p',
> and also because it suggests a sort of vagueness, as in not being able
> to specify the name of the argument.
>
> I do not know if it is the best possible way, and am looking forward to
> other ideas.

I'd rather not start using a new punctuation character for this one very
limited purpose; it might prevent us from using ? for some other more
generic purpose in the future.

Alternative proposal: how about using '/' ? It's kind of the opposite of
'*' which means "keyword argument", and '/' is not a new character.

--
--Guido van Rossum (python.org/~guido)

From yselivanov.ml at gmail.com  Fri Mar 2 23:55:50 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Fri, 2 Mar 2012 17:55:50 -0500
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info> <4F512E55.6050305@stoneleaf.us>
	<4F514E64.2040406@stoneleaf.us>
Message-ID: <5AF0E200-52CA-4A36-AC34-AA9657A122A9@gmail.com>

On 2012-03-02, at 5:46 PM, Guido van Rossum wrote:
> Alternative proposal: how about using '/' ? It's kind of the opposite
> of '*' which means "keyword argument", and '/' is not a new character.

How about ';'? Is it possible to re-use it in this context?

    def (a; b, *, c)
    def (; b)

- Yury

P.S. Sometimes I feel nostalgic for the moratorium...
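For the record, the stand-alone '/' marker floated here is essentially the
syntax Python eventually adopted as PEP 570 in Python 3.8; no such syntax
existed when this thread took place. Under that later syntax, Ethan's
examples can be written and enforced directly:

    # Valid in Python 3.8+ only (PEP 570):
    def split(self, char, /, count):
        ...

    def canary(breed, /, color, wingspan, *, name):
        ...

    # canary('finch', 'yellow', 20, name='Pip') is fine;
    # canary(breed='finch', ...) raises TypeError, because breed is
    # positional-only.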
From ethan at stoneleaf.us  Sat Mar 3 00:12:37 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 02 Mar 2012 15:12:37 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To: <5AF0E200-52CA-4A36-AC34-AA9657A122A9@gmail.com>
References: <4F508819.7000809@pearwood.info> <4F512E55.6050305@stoneleaf.us>
	<4F514E64.2040406@stoneleaf.us>
	<5AF0E200-52CA-4A36-AC34-AA9657A122A9@gmail.com>
Message-ID: <4F5153E5.9040503@stoneleaf.us>

Yury Selivanov wrote:
> On 2012-03-02, at 5:46 PM, Guido van Rossum wrote:
>> Alternative proposal: how about using '/' ? It's kind of the opposite
>> of '*' which means "keyword argument", and '/' is not a new character.
>
> How about ';'? Is it possible to re-use it in this context?
>
>     def (a; b, *, c)
>     def (; b)

Hmm -- not sure that is obvious enough. Also, your second example doesn't
need the semi-colon at all.

~Ethan~

From ethan at stoneleaf.us  Sat Mar 3 00:09:19 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 02 Mar 2012 15:09:19 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info> <4F512E55.6050305@stoneleaf.us>
	<4F514E64.2040406@stoneleaf.us>
Message-ID: <4F51531F.6040404@stoneleaf.us>

Guido van Rossum wrote:
> On Fri, Mar 2, 2012 at 2:49 PM, Ethan Furman wrote:
>> Terry Reedy wrote:
>>> On 3/2/2012 3:32 PM, Ethan Furman wrote:
>>>
>>>> So something like:
>>>>
>>>>     def ord(char, ?):
>>>>
>>>>     def split(self, char, ?, count)
>>>>
>>>>     def canary(breed, ?, color, wingspan, *, name)
>>>
>>> That is probably better than using '$' or directly tagging the names.
>
> I'd rather not start using a new punctuation character for this one
> very limited purpose; it might prevent us from using ? for some other
> more generic purpose in the future.
>
> Alternative proposal: how about using '/' ? It's kind of the opposite
> of '*' which means "keyword argument", and '/' is not a new character.

So it would look like:

    def ord(char, /):

    def split(self, char, /, count)

    def canary(breed, /, color, wingspan, *, name)

I think I like that better -- it stands out, and it looks like a barrier
between the positional-only and the positional-keyword arguments.

~Ethan~

From ncoghlan at gmail.com  Sat Mar 3 00:42:01 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 3 Mar 2012 09:42:01 +1000
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info>
Message-ID:

On Sat, Mar 3, 2012 at 5:28 AM, Guido van Rossum wrote:
>> So +1 on declaring "make X support keyword arguments"
>> non-controversial for multi-argument functions, +0 on also doing so
>> for single argument functions, but -0 on attempting to boil the ocean
>> and fix them wholesale.
>
> Hm. I think for many (most?) 1-arg and selected 2-arg functions (and
> rarely 3+-arg functions) this would reduce readability, as the example
> of ord(char=x) showed.

Yeah, on reflection, I'm actually -0 on adding keyword arg support to
1-arg functions.

> I would actually like to see a syntactic feature to state that an
> argument *cannot* be given as a keyword argument (just as we already
> added syntax to state that it *must* be a keyword).
I currently write such code as:

    def f(*args):
        arg1, arg2, arg3 = args

This gives rubbish error messages when the caller makes a mistake, but it
works.

The obvious syntactic alternative is allowing tuple expansion specifically
for *args:

    def f(*(arg1, arg2, arg3)):
        pass

Then the interpreter would have enough info to still generate nice error
messages, and we don't have to invent much in the way of new syntax.

> One area where I think adding keyword args is outright wrong: methods
> of built-in types or ABCs that are overridable. E.g. consider the
> pop() method on dict. Since the argument name is currently
> undocumented, if someone subclasses dict and overrides this method, or
> if they create another mutable mapping class that tries to emulate
> dict using duck typing, it doesn't matter what the argument name is --
> all the callers (expecting a dict, a dict subclass, or a dict-like
> duck) will be using positional arguments in the call. But if we were
> to document the argument names for pop(), and users started to use
> these, then most dict subclasses and ducks would suddenly be broken
> (except if by luck they happened to pick the same name).

Good point. The other use case is APIs like the dict constructor and
dict.update which are designed to accept arbitrary keyword arguments, so
you don't want to reserve particular names in the calling argument
namespace for your positional arguments.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From tjreedy at udel.edu  Sat Mar 3 00:45:29 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 02 Mar 2012 18:45:29 -0500
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info> <4F512E55.6050305@stoneleaf.us>
	<4F514E64.2040406@stoneleaf.us>
Message-ID:

On 3/2/2012 5:46 PM, Guido van Rossum wrote:
> Alternative proposal: how about using '/' ? It's kind of the opposite
> of '*' which means "keyword argument", and '/' is not a new character.

It took me a moment to get the pun on div / being the inverse of mul *. I
like it. Very clever -- and memorable!

--
Terry Jan Reedy

From pyideas at rebertia.com  Sat Mar 3 01:01:08 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Fri, 2 Mar 2012 16:01:08 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info>
Message-ID:

On Fri, Mar 2, 2012 at 3:42 PM, Nick Coghlan wrote:
> On Sat, Mar 3, 2012 at 5:28 AM, Guido van Rossum wrote:
>> I would actually like to see a syntactic feature to state that an
>> argument *cannot* be given as a keyword argument (just as we already
>> added syntax to state that it *must* be a keyword).
>
> I currently write such code as:
>
>     def f(*args):
>         arg1, arg2, arg3 = args
>
> This gives rubbish error messages when the caller makes a mistake, but
> it works.
>
> The obvious syntactic alternative is allowing tuple expansion
> specifically for *args:
>
>     def f(*(arg1, arg2, arg3)):
>         pass
>
> Then the interpreter would have enough info to still generate nice
> error messages, and we don't have to invent much in the way of new
> syntax.

Kinda incongruous with PEP 3113 though.

- Chris
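The "rubbish error messages" from Nick's current workaround, and the
hand-rolled check that syntax support would make unnecessary, look like
this (an illustrative sketch only):

    def f(*args):
        # Misuse produces a generic unpacking ValueError that never
        # mentions f or its parameter names, e.g. f(1, 2) fails with
        # "ValueError: ... values to unpack" instead of a TypeError.
        arg1, arg2, arg3 = args
        return arg1, arg2, arg3

    def g(*args):
        # Checking by hand restores a useful message, at the cost of
        # boilerplate in every such function:
        if len(args) != 3:
            raise TypeError('g() takes exactly 3 positional arguments '
                            '(%d given)' % len(args))
        arg1, arg2, arg3 = args
        return arg1, arg2, arg3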
From ethan at stoneleaf.us  Sat Mar 3 00:56:48 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 02 Mar 2012 15:56:48 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info>
Message-ID: <4F515E40.80006@stoneleaf.us>

Nick Coghlan wrote:
> On Sat, Mar 3, 2012 at 5:28 AM, Guido van Rossum wrote:
>> I would actually like to see a syntactic feature to state that an
>> argument *cannot* be given as a keyword argument (just as we already
>> added syntax to state that it *must* be a keyword).
>
> I currently write such code as:
>
>     def f(*args):
>         arg1, arg2, arg3 = args
>
> This gives rubbish error messages when the caller makes a mistake, but
> it works.
>
> The obvious syntactic alternative is allowing tuple expansion
> specifically for *args:
>
>     def f(*(arg1, arg2, arg3)):
>         pass

The problem with that is we then have '*' doing double duty as both tuple
unpacking and keyword-only in the function signature:

    def herd(*(size, location), *, breed)

~Ethan~

From mikegraham at gmail.com  Sat Mar 3 01:56:35 2012
From: mikegraham at gmail.com (Mike Graham)
Date: Fri, 2 Mar 2012 19:56:35 -0500
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To: <4F515E40.80006@stoneleaf.us>
References: <4F508819.7000809@pearwood.info> <4F515E40.80006@stoneleaf.us>
Message-ID:

On Fri, Mar 2, 2012 at 6:56 PM, Ethan Furman wrote:
> The problem with that is we then have '*' doing double duty as both
> tuple unpacking and keyword-only in the function signature:
>
>     def herd(*(size, location), *, breed)

It's not `def herd(*args, *, breed)`, so I don't see why it would be
`def herd(*(size, location), *, breed)`. I think Nick's syntax is the
right one, although adding the feature to Python is probably not a good
idea.

Mike

From raymond.hettinger at gmail.com  Sat Mar 3 03:22:32 2012
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Fri, 2 Mar 2012 18:22:32 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info>
Message-ID:

On Mar 2, 2012, at 2:01 PM, Gregory P. Smith wrote:
> speaking of range... I think start and stop are plenty obvious, but I'd
> like to allow step to be specified as a keyword.

range() has been around 20+ years and this has never been requested. In my
teaching of Python, it has never arisen as an issue. AFAICT, there isn't
any code that would be better if step were written as a keyword. The only
expressed motivation for the change is "I'd like" it. There should be a
higher bar for changing builtins.

Many of the proposals in this thread are gratuitous and will create
unnecessary work for other people: people who have to change anything that
purports to have a range-like interface, people who have to change the
other Python implementations, folks who have to remember which version of
Python supports it and which other slice-like functions would also take
the argument, etc.

ISTM that having a ton of tiny nit changes to the language doesn't make it
better. Instead, effort should be directed at substantive changes (better
https support, completing xmlrpc, etc). Micro rearrangements of the
language are a real PITA for folks who have to go back and forth between
different versions of Python.

So, we should raise the bar to something higher than "I'd like feature X"
and ask for examples of code that would be better, or for user requests,
or some actual demonstration of need.

ISTM that 20 years of history with range() suggests that no one needs this
(nor have I seen a need in any other language with functions that take a
start/stop/step).

Raymond
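Raymond's "range-like interface" worry is the same one Guido raised
earlier with dict.pop: documenting a parameter name puts every override
and duck type on the hook for it. A hedged sketch of the breakage, using a
hypothetical subclass:

    class MyDict(dict):
        # A reasonable override today, precisely because dict.pop's
        # parameter names are undocumented and callers pass positionally:
        def pop(self, k, default=None):
            return dict.pop(self, k, default)

    d = MyDict(a=1)
    d.pop('a')  # works for dict, MyDict, and any dict-like duck

    # If the base signature were documented as pop(key, default) and
    # callers began writing d.pop(key='a'), MyDict would break:
    #     TypeError: pop() got an unexpected keyword argument 'key'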
From dreamingforward at gmail.com  Sat Mar 3 03:26:29 2012
From: dreamingforward at gmail.com (Mark Janssen)
Date: Fri, 2 Mar 2012 19:26:29 -0700
Subject: [Python-ideas] doctest
In-Reply-To:
References: <4F5042F3.8000507@pearwood.info> <4F505C79.4060501@pearwood.info>
	<4F50DFF0.8040400@pearwood.info>
Message-ID:

On Fri, Mar 2, 2012 at 11:14 AM, Guido van Rossum wrote:
> On Fri, Mar 2, 2012 at 9:59 AM, Devin Jeanpierre wrote:
>> Was there something else I was supposed to write, other than the
>> solution I advocated? ;)
>>
>> https://bitbucket.org/devin.jeanpierre/doctest2/src/e084a682ccbc/doctest2/constants.py#cl-122
>
> It's not a solution. It's a hack that only works in the simplest cases

With all respect to Guido, who has mentioned probably the best solution so
far (using sys.displayhook()), on Devin's behalf I must say that for those
of us dedicated to TDD using doctest, we tend to write code in tandem with
the ideal that doctest engenders; generally speaking, everything being
written is so fine-grained that the problems most people are speaking of
never arise. E.g., the tests are small and uncomplicated because they are
written right at the core of the source, there is a semi-rigid protocol
for __repr__ in line with being eval'able, and code is broken apart so
that when the tests get too complicated, it's a sign that the code is not
modular or fine-grained enough.

Just as OOP eventually evolved into the ideal of the "abstract base class"
as a way to match the notion of physical objects with programmatic ones,
Python + doctest approaches a different apex, catching two birds with one
stone: documentation + TDD.

Just my small input...

mark

From steve at pearwood.info  Sat Mar 3 03:57:52 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 03 Mar 2012 13:57:52 +1100
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To: <4F51531F.6040404@stoneleaf.us>
References: <4F508819.7000809@pearwood.info> <4F512E55.6050305@stoneleaf.us>
	<4F514E64.2040406@stoneleaf.us> <4F51531F.6040404@stoneleaf.us>
Message-ID: <4F5188B0.30601@pearwood.info>

Ethan Furman wrote:
> Guido van Rossum wrote:
>> On Fri, Mar 2, 2012 at 2:49 PM, Ethan Furman wrote:
>>> Terry Reedy wrote:
>>>> On 3/2/2012 3:32 PM, Ethan Furman wrote:
>>>>
>>>>> So something like:
>>>>>
>>>>>     def ord(char, ?):
>>>>>
>>>>>     def split(self, char, ?, count)
>>>>>
>>>>>     def canary(breed, ?, color, wingspan, *, name)
>>>>
>>>> That is probably better than using '$' or directly tagging the names.
>>>
>>> I chose '?' because it has some similarity to an incompletely-drawn
>>> 'p', and also because it suggests a sort of vagueness, as in not being
>>> able to specify the name of the argument.
>>
>> I'd rather not start using a new punctuation character for this one
>> very limited purpose; it might prevent us from using ? for some other
>> more generic purpose in the future.
>>
>> Alternative proposal: how about using '/' ? It's kind of the opposite
>> of '*' which means "keyword argument", and '/' is not a new character.
>
> So it would look like:
>
>     def ord(char, /):
>
>     def split(self, char, /, count)
>
>     def canary(breed, /, color, wingspan, *, name)
>
> I think I like that better -- it stands out, and it looks like a barrier
> between the positional-only and the positional-keyword arguments.

Urrggg, ugly and hard to read. Imagine, if you will:

    def spam(x, /, y, /, z, /, a=2/3, /):
        ...

Placing the tag after the argument as an extra parameter is not the right
approach in my opinion. It's excessively verbose, and it puts the tag in
the wrong place: as you read from left-to-right, you see "normal argument,
no, wait, it's positional only". The tag should prefix the name.

With keyword-only arguments, the * parameter is special because it flags a
point in the parameter list, not an individual parameter: you read "normal
arg, normal arg, start keyword-only args, keyword-only arg, ...".

I believe that the right place to tag the parameter is in the parameter
itself, not by adding an extra parameter after it. Hence, something like
this:

    def spam(~x, ~y, ~z, ~a=2/3):
        ...

where ~name means that name cannot be specified by keyword. I read it as
"not name", as in, the caller can't use the name.

Or if you prefer Guido's pun:

    def spam(/x, /y, /z, /a=2/3):
        ...

Much less line-noise than spam(x, /, y, /, z, /, a=2/3, /).

Personally, I think this is somewhat of an anti-feature. Keyword arguments
are a Good Thing, and while I don't believe it is good enough to *force*
all C functions to support them, I certainly don't want to discourage
Python functions from supporting them.

--
Steven

From steve at pearwood.info  Sat Mar 3 04:07:02 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 03 Mar 2012 14:07:02 +1100
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info>
Message-ID: <4F518AD6.7000307@pearwood.info>

Guido van Rossum wrote:
>> if we go that route, could it be possible to implement range([start=0, ]
>> stop[, step=1]) such that they are positional-only but multiple
>> arguments are treated differently than strictly sequential assignment,
>> without writing conditional code in Python to figure out each one's
>> meaning at runtime?
>
> Eew, I don't think this pattern is useful enough to support in syntax,
> even if one of the most popular builtins (but only one!) uses it.

I read this at first as saying that you didn't approve of the range API. I
agree that the API is too specialized to take syntactical support, but I'd
just like to put my hand up and say I like range's API and have very
occasionally used it for my own functions. I see no point in having
special syntax for it: this is a specialized enough case that I don't mind
handling it manually.

--
Steven

From steve at pearwood.info  Sat Mar 3 04:20:09 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 03 Mar 2012 14:20:09 +1100
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info>
Message-ID: <4F518DE9.6070600@pearwood.info>

Raymond Hettinger wrote:
> On Mar 2, 2012, at 2:01 PM, Gregory P. Smith wrote:
>> speaking of range... I think start and stop are plenty obvious, but
>> I'd like to allow step to be specified as a keyword.
>
> range() has been around 20+ years and this has never been requested.

I have frequently felt that range(start=5, stop=27, step=2) reads better
than range(5, 27, 2), but not better *enough* to bother requesting a
change, particularly given the conservative nature of the Python devs
toward such minor interface changes (and rightly so).

If it came to a vote, I'd vote 0 on the status quo, +0 to allow all three
arguments to be optionally given by keyword, and -1 for singling step out
as the only one that can be a keyword. That's just silly (sorry Gregory!)
-- can you imagine explaining to a beginner why they can write
range(0, 50, 2) or range(0, 50, step=2) but not range(0, stop=50, step=2)?

--
Steven

From guido at python.org  Sat Mar 3 04:31:06 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 2 Mar 2012 19:31:06 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To: <4F5188B0.30601@pearwood.info>
References: <4F508819.7000809@pearwood.info> <4F512E55.6050305@stoneleaf.us>
	<4F514E64.2040406@stoneleaf.us> <4F51531F.6040404@stoneleaf.us>
	<4F5188B0.30601@pearwood.info>
Message-ID:

On Fri, Mar 2, 2012 at 6:57 PM, Steven D'Aprano wrote:
> I believe that the right place to tag the parameter is in the parameter
> itself, not by adding an extra parameter after it. Hence, something like
> this:
>
>     def spam(~x, ~y, ~z, ~a=2/3):
>         ...
>
> where ~name means that name cannot be specified by keyword. I read it as
> "not name", as in, the caller can't use the name.
>
> Or if you prefer Guido's pun:
>
>     def spam(/x, /y, /z, /a=2/3):
>         ...
>
> Much less line-noise than spam(x, /, y, /, z, /, a=2/3, /).

That can't be right -- if a parameter is positional, surely all parameters
before it are also positional, so it would be redundant to have to mark
all of them up. Also ~name looks too much like an expression, and /name
looks just weird (I think DOS command line flags used to look like this).

> Personally, I think this is somewhat of an anti-feature. Keyword
> arguments are a Good Thing, and while I don't believe it is good enough
> to *force* all C functions to support them, I certainly don't want to
> discourage Python functions from supporting them.

And yet people invent decorators and other hacks to insist on positional
parameters all the time. You *can* have Too Much of a Good Thing, and for
readability it's better if calls are consistent. If most calls to a
function use positional arguments (at least for the first N positions),
it's better to force *all* calls to use positional arguments 1-N: the
reader may be unfamiliar with the parameter names. Also remember the
subclassing issue I brought up before.

That said, I can't come up with a syntax that I really like. Here's my
best attempt, but I'm at most -0 on it: Have a stand-alone '/' indicate
"all parameters to my left must be positional", just like a stand-alone
'*' means "all parameters to my right must be keywords". If there's no
stand-alone '*' it is assumed to be all the way on the right; so if
there's no '/' it is assumed to be all the way on the left. The following
should all be valid:

    def foo(/, a, b): ...  # edge case, same as def foo(a, b): ...

    def foo(a, b, /): ...  # all positional

    def foo(a, b=1, /): ...  # all positional, b optional

    def foo(a, b, /, c, d): ...  # a, b positional; c, d required and either positional or keyword

    def foo(a, b, /, c, d=1): ...  # a, b positional; c required, d optional; c, d either positional or keyword
    def foo(a, b, /, c, *, d): ...  # a, b positional; c required and either positional or keyword; d required keyword

    def foo(a, b=1, /, c=1, *, d=1): ...  # same, but b, c, d optional

That about covers it. But agreed it's no thing of beauty, so let's abandon
it.

--
--Guido van Rossum (python.org/~guido)

From pyideas at rebertia.com  Sat Mar 3 04:31:56 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Fri, 2 Mar 2012 19:31:56 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To: <4F5188B0.30601@pearwood.info>
References: <4F508819.7000809@pearwood.info> <4F512E55.6050305@stoneleaf.us>
	<4F514E64.2040406@stoneleaf.us> <4F51531F.6040404@stoneleaf.us>
	<4F5188B0.30601@pearwood.info>
Message-ID:

On Fri, Mar 2, 2012 at 6:57 PM, Steven D'Aprano wrote:
> Ethan Furman wrote:
>> Guido van Rossum wrote:
>>> On Fri, Mar 2, 2012 at 2:49 PM, Ethan Furman wrote:
>>>> Terry Reedy wrote:
>>>>> On 3/2/2012 3:32 PM, Ethan Furman wrote:
>>>>>
>>>>>> So something like:
>>>>>>
>>>>>>     def ord(char, ?):
>>>>>>
>>>>>>     def split(self, char, ?, count)
>>>>>>
>>>>>>     def canary(breed, ?, color, wingspan, *, name)
>>>>>
>>>>> That is probably better than using '$' or directly tagging the names.
>>>>
>>>> I chose '?' because it has some similarity to an incompletely-drawn
>>>> 'p', and also because it suggests a sort of vagueness, as in not
>>>> being able to specify the name of the argument.
>>>
>>> I'd rather not start using a new punctuation character for this one
>>> very limited purpose; it might prevent us from using ? for some other
>>> more generic purpose in the future.
>>>
>>> Alternative proposal: how about using '/' ? It's kind of the opposite
>>> of '*' which means "keyword argument", and '/' is not a new character.
>>
>> So it would look like:
>>
>>     def ord(char, /):
>>
>>     def split(self, char, /, count)
>>
>>     def canary(breed, /, color, wingspan, *, name)
>>
>> I think I like that better -- it stands out, and it looks like a
>> barrier between the positional-only and the positional-keyword
>> arguments.
>
> Urrggg, ugly and hard to read. Imagine, if you will:
>
>     def spam(x, /, y, /, z, /, a=2/3, /):
>         ...
>
> Placing the tag after the argument as an extra parameter is not the
> right approach in my opinion.

I don't believe that's what the proposal is anyway. Note Ethan's
"barrier" comment.

> It's excessively verbose, and it puts the tag in the wrong place: as
> you read from left-to-right, you see "normal argument, no, wait, it's
> positional only". The tag should prefix the name.
>
> With keyword-only arguments, the * parameter is special because it
> flags a point in the parameter list, not an individual parameter: you
> read "normal arg, normal arg, start keyword-only args, keyword-only
> arg, ...".

That's in the same vein as what I understand the proposal to be. "/" would
flag "end of positional-only args"; there's effectively an implicit
leading "/" if you don't use one in a function's definition.

> Personally, I think this is somewhat of an anti-feature. Keyword
> arguments are a Good Thing, and while I don't believe it is good enough
> to *force* all C functions to support them, I certainly don't want to
> discourage Python functions from supporting them.

+1

I can see not modifying every single C implementation as it's for little
gain, and in a bunch of cases concerns 1-or-2-argument functions which are
arguably worth special-casing. But making the function definition syntax
even more complicated (we have annotations now, remember?)
just to allow forcing calls to be (slightly) more restrictive/opaque? Hard to get on board with that. Cheers, Chris From guido at python.org Sat Mar 3 04:36:39 2012 From: guido at python.org (Guido van Rossum) Date: Fri, 2 Mar 2012 19:36:39 -0800 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <4F518DE9.6070600@pearwood.info> References: <4F508819.7000809@pearwood.info> <4F518DE9.6070600@pearwood.info> Message-ID: On Fri, Mar 2, 2012 at 7:20 PM, Steven D'Aprano wrote: > Raymond Hettinger wrote: >> >> On Mar 2, 2012, at 2:01 PM, Gregory P. Smith wrote: >> >>> speaking of range... I think start and stop are plenty obvious, but I'd >>> like to allow step to be specified as a keyword. >> >> >> range() has been around 20+ years and this has never been requested. > > > I have frequently felt that range(start=5, stop=27, step=2) reads better > than range(5, 27, 2), but not better *enough* to bother requesting a change, > particular given the conservative nature of the Python devs to such minor > interface changes (and rightly so). > > If it came to a vote, I'd vote 0 on the status quo, +0 to allow all three > arguments to be optionally given by keyword, and -1 for singling step out as > the only one that can be a keyword. That's just silly (sorry Gregory!) -- > can you imagine explaining to a beginner why they can write range(0, 50, 2) > or range(0, 50, step=2) but not range(0, stop=50, step=2)? There's no need to explain anything to beginners, they just accept whatever rules you give them. It's the people who are no longer beginners but not quite experts you have to deal with. But a true zen master, even a zen-of-Python master, would just hit them over the head with a wooden plank. (Seriously, there are plenty of APIs that have some positional parameters and some keyword parameters, and I see nothing wrong with that. You seem to be too in love with keyword parameters to see clearly. :-) Still, I can't think of a reason why we should upset Raymond over such a minor thing, so let's forget about "fixing" range. And that's final. -- --Guido van Rossum (python.org/~guido) From ethan at stoneleaf.us Sat Mar 3 08:54:58 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 02 Mar 2012 23:54:58 -0800 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> <4F512E55.6050305@stoneleaf.us> <4F514E64.2040406@stoneleaf.us> <4F51531F.6040404@stoneleaf.us> <4F5188B0.30601@pearwood.info> Message-ID: <4F51CE52.9050503@stoneleaf.us> Guido van Rossum wrote: > On Fri, Mar 2, 2012 at 6:57 PM, Steven D'Aprano wrote: >> Personally, I think this is somewhat of an anti-feature. Keyword arguments >> are a Good Thing, and while I don't believe it is good enough to *force* all >> C functions to support them, I certainly don't want to discourage Python >> functions from supporting them. > > And yet people invent decorators and other hacks to insist on > positional parameters all the time. You *can* have Too Much of a Good > Thing, and for readability it's better if calls are consistent. If > most calls to a function use positional arguments (at least for the > first N positions), it's better to force *all* calls to use positional > arguments 1-N: the reader may be unfamiliar with the parameter names. > Also remember the subclassing issue I brought up before. > > That said, I can't come up with a syntax that I really like. 
Here's my > best attempt, but I'm at most -0 on it: Have a stand-alone '/' > indicate "all parameters to my left must be positional", just like a > stand-alone '*' means "all parameters to my right must be keywords". > If there's no stand-alone '*' it is assumed to be all the way on the > right; so if there's no '/' it is assumed to be all the way on the > left. The following should all be valid: > > def foo(/, a, b): ... # edge case, same as def foo(a, b): ... > > def foo(a, b, /): ... # all positional > > def foo(a, b=1, /): ... # all positional, b optional > > def foo(a, b, /, c, d): ... # a, b positional; c, d required and > either positional or keyword > > def foo(a, b, /, c, d=1): ... # a, b positional; c required, d > optional; c, d either positional or keyword > > def foo(a, b, /, c, *, d): ... # a, b positional; c required and > either positional or keyword; d required keyword > > def foo(a, b=1, /, c=1, *, d=1): ... # same, but b, c, d optional > > That about covers it. But agreed it's no thing of beauty, so let's abandon it. > And I was just starting to like it, too. :( Personally, I don't see it as any uglier than having the lone '*' in the signature; although, I don't see lone '*'s all that much, whereas the '/' could be quite prevalent. Is this something we want? We could have a built-in decorator, like property or staticmethod, to make the changes for us (each implementation would have to supply their own, of course): @positional(2) def foo(a, b) Or we could use brackets or more parentheses: def foo([a, b]) def foo((a, b)) That doesn't seem too bad... def foo((a, b=1), c=2, *, d=3) Since tuple-unpacking no longer happens in the definition signature, we don't need the leading * before the parentheses. Just my last two cents (unless the idea garners encouragement, of course ;). ~Ethan~ From storchaka at gmail.com Sat Mar 3 10:21:22 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 03 Mar 2012 11:21:22 +0200 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> Message-ID: 03.03.12 00:18, Guido van Rossum ???????(??): > On Fri, Mar 2, 2012 at 2:01 PM, Gregory P. Smith wrote: >> > Eew, I don't think this pattern is useful enough to support in syntax, > even if one of the most popular builtins (but only one!) uses it. range([start,] stop[, step]) slice([start,] stop[, step]) itertools.islice(iterable, [start,] stop [, step]) random.randrange([start,] stop[, step]) syslog.syslog([priority,] message) curses.newwin([nlines, ncols,] begin_y, begin_x) curses.window.addch([y, x,] ch[, attr]) curses.window.addnstr([y, x,] str, n[, attr]) curses.window.addstr([y, x,] str[, attr]) curses.window.chgat([y, x, ] [num,] attr) curses.window.derwin([nlines, ncols,] begin_y, begin_x) curses.window.hline([y, x,] ch, n) curses.window.insch([y, x,] ch[, attr]) curses.window.insnstr([y, x,] str, n [, attr]) curses.window.subpad([nlines, ncols,] begin_y, begin_x) curses.window.subwin([nlines, ncols,] begin_y, begin_x) curses.window.vline([y, x,] ch, n) >> speaking of range... I think start and stop are plenty obvious, but I'd like >> to allow step to be specified as a keyword. > That's fine, range() is not overloadable anyway. There are number of range-like functions in third-party modules. 
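For readers unfamiliar with the idiom behind Serhiy's list: a ([start,]
stop[, step]) signature can only be reproduced in pure Python by
dispatching on the argument count, which is exactly why none of the
leading parameters has a stable keyword name. A minimal sketch (irange
is a made-up name, not a real stdlib function; details are invented for
illustration):

    def irange(*args):
        # Interpret the positional slots the way range() does: one
        # argument means stop; two mean (start, stop); three mean
        # (start, stop, step).
        if not 1 <= len(args) <= 3:
            raise TypeError(
                "irange expected 1 to 3 arguments, got %d" % len(args))
        start, stop, step = 0, None, 1
        if len(args) == 1:
            stop = args[0]
        elif len(args) == 2:
            start, stop = args
        else:
            start, stop, step = args
        if step == 0:
            raise ValueError("irange() arg 3 must not be zero")
        while (step > 0 and start < stop) or (step < 0 and start > stop):
            yield start
            start += step

    list(irange(3))          # [0, 1, 2]
    list(irange(5, 27, 2))   # [5, 7, 9, ..., 25]

Because the first positional slot changes meaning depending on how many
arguments follow it, there is no single name that could honestly be
given to it as a keyword.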
From storchaka at gmail.com Sat Mar 3 10:29:09 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 03 Mar 2012 11:29:09 +0200 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <4F518DE9.6070600@pearwood.info> References: <4F508819.7000809@pearwood.info> <4F518DE9.6070600@pearwood.info> Message-ID: 03.03.12 05:20, Steven D'Aprano ???????(??): > That's just silly (sorry > Gregory!) -- can you imagine explaining to a beginner why they can write > range(0, 50, 2) or range(0, 50, step=2) but not range(0, stop=50, step=2)? You can write print(1, 2, 3, end='') but can't write... what? From storchaka at gmail.com Sat Mar 3 10:53:08 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 03 Mar 2012 11:53:08 +0200 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> <4F512E55.6050305@stoneleaf.us> <4F514E64.2040406@stoneleaf.us> Message-ID: 03.03.12 00:46, Guido van Rossum ???????(??): > Alternative proposal: how about using '/' ? It's kind of the opposite > of '*' which means "keyword argument", and '/' is not a new character. How about using '**' (and left '/' for some purpose in the future)? From storchaka at gmail.com Sat Mar 3 10:53:59 2012 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sat, 03 Mar 2012 11:53:59 +0200 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <5AF0E200-52CA-4A36-AC34-AA9657A122A9@gmail.com> References: <4F508819.7000809@pearwood.info> <4F512E55.6050305@stoneleaf.us> <4F514E64.2040406@stoneleaf.us> <5AF0E200-52CA-4A36-AC34-AA9657A122A9@gmail.com> Message-ID: 03.03.12 00:55, Yury Selivanov ???????(??): > How about ';'? Is it possible to re-use it in this context? > > def (a; b, *, c) > def (; b) How about lambdas? foo = lambda a; b: return a From arnodel at gmail.com Sat Mar 3 14:51:12 2012 From: arnodel at gmail.com (Arnaud Delobelle) Date: Sat, 3 Mar 2012 13:51:12 +0000 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> Message-ID: On 2 March 2012 21:48, Guido van Rossum wrote: > On Fri, Mar 2, 2012 at 1:09 PM, Arnaud Delobelle wrote: >> On 2 March 2012 20:00, Guido van Rossum wrote: >>> On Mar 2, 2012 11:43 AM, "Arnaud Delobelle" wrote: >>>> On 2 March 2012 19:28, Guido van Rossum wrote: >> >>>> > I would actually like to see a syntactic feature to state that an >>>> > argument *cannot* be given as a keyword argument (just as we already >>>> > added syntax to state that it *must* be a keyword). >>>> >>>> There was a discussion about this on this list in 2007. ?I wrote some >>>> decorators to implement it this functionality. ?Here's one at >>>> >>>> >>>> http://code.activestate.com/recipes/521874-functions-with-positional-only-arguments/?in=user-4059385 >>>> >>>> (note that it didn't attract a lot of attention !). ?The recipe also >>>> refers to the original discussion. >>> >>> I've written such decorators too, but they've got quite a bit of overhead... >> >> The one in the above recipe (which is for 2.X) doesn't incur any >> runtime overhead - although it is a bit hackish as it changes the >> 'co_varnames' attribute of the function's code object. > > So it's not written in Python -- it uses CPython specific hacks. True. 
OTOH if you decided to put such a decorator in CPython's standard library (and I'm not talking about this specific implementation of the decorator), then implementors of other Pythons would have to provide the same functionality. We would then get the ability to have positional only arguments free of overhead without having to make the syntax of function signatures even more complex. Also, a decorator would be a signal to users that positional only argument are not often necessary, whereas offering syntactical support for them may encourage over use of the feature. -- Arnaud From ron3200 at gmail.com Sat Mar 3 16:20:13 2012 From: ron3200 at gmail.com (Ron Adam) Date: Sat, 03 Mar 2012 09:20:13 -0600 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> Message-ID: <1330788013.30181.27.camel@Gutsy> On Fri, 2012-03-02 at 18:22 -0800, Raymond Hettinger wrote: > > On Mar 2, 2012, at 2:01 PM, Gregory P. Smith wrote: > > > speaking of range... I think start and stop are plenty obvious, but > > I'd like to allow step to be specified as a keyword. > > > range() has been around 20+ years and this has never been requested. > In my teaching of Python, it is never arisen as an issue. > AFAICT, there isn't any code that would be better if step were written > as a keyword. > > > The only expressed motivation for the change is "I'd like" it. > There should be a higher bar for changing builtins. > > > Many of the proposals in this thread are gratuitous and will create > unnecessary work for other people who have to change anything > that purports to have a range-like interface, people who have to > change the other Python implementations, folks who who have > to remember which version of Python supports it and which other > slice-like functions would also take the argument etc. > > > ISTM that having a ton of tiny nit changes to the language > doesn't make it better. Instead, effort should be directed > as substantive changes (better https support, completing xmlrpc, etc). +1 > Micro rearrangements of the language and a real PITA for folks > who have to go back-and-forth between different versions of Python. > So, we should raise the bar to something higher than "I'd like feature > X" > and ask for examples of code that would be better or for user requests > or some actual demonstration of need. ISTM that 20 years of history > with range() suggests that no one needs this (nor have I seen a need > in any other language with functions that take a start/stop/step). On a more general note... It seems to me that sometimes the writer of functions wish to have more control of how the function is called, but I think it is better that the user of a function can select the calling form that perhaps matches the data and/or style they are using more closely. I hope in the future that we find ways to simplify function signatures in a way that make them both easier to use and more efficient for the function user, rather than continue adding specific little tweaks that give the function designer more control over how the function user calls it. 
My 2cents, Ron From simon.sapin at kozea.fr Sat Mar 3 16:47:21 2012 From: simon.sapin at kozea.fr (Simon Sapin) Date: Sat, 03 Mar 2012 16:47:21 +0100 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <1330788013.30181.27.camel@Gutsy> References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> Message-ID: <4F523D09.6060901@kozea.fr> Le 03/03/2012 16:20, Ron Adam a ?crit : > It seems to me that sometimes the writer of functions wish to have more > control of how the function is called, but I think it is better that the > user of a function can select the calling form that perhaps matches the > data and/or style they are using more closely. I agree with that, but it can still make sense to have positional-only arguments. For example, we want d.update(self=4) to update the 'self' key on any Mapping, so the default implementation update on the ABC has to accept *args, **kwargs and have some code to extract self: http://hg.python.org/cpython/file/e67b3a9bd2dc/Lib/collections/abc.py#l511 Without this "hack", passing self=4 would give TypeError: got multiple values for keyword argument 'self'. It would be so much nicer to be able to declare self and other positional-only in "def update(self, other=(), **kwargs):" Regards, -- Simon Sapin From guido at python.org Sat Mar 3 18:54:03 2012 From: guido at python.org (Guido van Rossum) Date: Sat, 3 Mar 2012 09:54:03 -0800 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <1330788013.30181.27.camel@Gutsy> References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> Message-ID: On Sat, Mar 3, 2012 at 7:20 AM, Ron Adam wrote: > It seems to me that sometimes the writer of functions wish to have more > control of how the function is called, but I think it is better that the > user of a function can select the calling form that perhaps matches the > data and/or style they are using more closely. Um, the function author chooses the signature. If you disagree with that signature, tough luck. > I hope in the future that we find ways to simplify function signatures > in a way that make them both easier to use and more efficient for the > function user, rather than continue adding specific little tweaks that > give the function designer more control over how the function user calls > it. You seem to forget that API design is an art and that it is the function author's prerogative to design an API that minimizes mistakes for all users of the function. Sometimes that includes requiring that everyone uses positional arguments for a certain situation. Anyway, I now think that adding a built-in @positional(N) decorator makes the most sense since it doesn't require changes to the parser. The built-in can be implemented efficiently. This should be an easy patch for someone who wants to contribute some C code. -- --Guido van Rossum (python.org/~guido) From greg at krypto.org Sat Mar 3 20:54:54 2012 From: greg at krypto.org (Gregory P. Smith) Date: Sat, 3 Mar 2012 11:54:54 -0800 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> Message-ID: On Sat, Mar 3, 2012 at 5:51 AM, Arnaud Delobelle wrote: > > Also, a decorator would be a signal to users that positional only > argument are not often necessary, whereas offering syntactical support > for them may encourage over use of the feature. > > +1 Agreed. -gps -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tjreedy at udel.edu Sat Mar 3 21:01:39 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 03 Mar 2012 15:01:39 -0500 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> Message-ID: On 3/3/2012 8:51 AM, Arnaud Delobelle wrote: > True. OTOH if you decided to put such a decorator in CPython's > standard library (and I'm not talking about this specific > implementation of the decorator), then implementors of other Pythons > would have to provide the same functionality. We would then get the > ability to have positional only arguments free of overhead without > having to make the syntax of function signatures even more complex. > > Also, a decorator would be a signal to users that positional only > argument are not often necessary, whereas offering syntactical support > for them may encourage over use of the feature. A decorator does not solve the problem of *documenting* position-only args, unless you propose to put them in the doc also. -- Terry Jan Reedy From tjreedy at udel.edu Sat Mar 3 21:11:14 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sat, 03 Mar 2012 15:11:14 -0500 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> Message-ID: On 3/3/2012 12:54 PM, Guido van Rossum wrote: > Anyway, I now think that adding a built-in @positional(N) decorator > makes the most sense since it doesn't require changes to the parser. > The built-in can be implemented efficiently. This should be an easy > patch for someone who wants to contribute some C code. Would you then be okay with using that in documentation? @positional(1) ord(char) Return the integer code for char If you prefer that to ord(char, /) Return the integer code for char fine with me. I care more about being able to document existing apis for C-implemented functions than about being able to limit Python functions I write. (Of course, being able to make C and Python versions of stdlib modules match would also be great!) Currently, one may need to experiment before using name-passing to be sure it will work, which tends to discourage name-passing of args even when it would be more readable. -- Terry Jan Reedy From guido at python.org Sat Mar 3 21:39:32 2012 From: guido at python.org (Guido van Rossum) Date: Sat, 3 Mar 2012 12:39:32 -0800 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> Message-ID: On Sat, Mar 3, 2012 at 12:11 PM, Terry Reedy wrote: > On 3/3/2012 12:54 PM, Guido van Rossum wrote: > >> Anyway, I now think that adding a built-in @positional(N) decorator >> makes the most sense since it doesn't require changes to the parser. >> The built-in can be implemented efficiently. This should be an easy >> patch for someone who wants to contribute some C code. > > > Would you then be okay with using that in documentation? > > @positional(1) > ord(char) > Return the integer code for char > > If you prefer that to > > ord(char, /) > Return the integer code for char > > fine with me. The @positional(1) form looks like it would be easier to understand if you aren't familiar with it than the / form. > I care more about being able to document existing apis for > C-implemented functions than about being able to limit Python functions I > write. 
(Of course, being able to make C and Python versions of stdlib > modules match would also be great!) Currently, one may need to experiment > before using name-passing to be sure it will work, which tends to discourage > name-passing of args even when it would be more readable. Yeah, so it does make sense to standardize on a solution for this. Let it be @positional(N). Can you file an issue? -- --Guido van Rossum (python.org/~guido) From simon.sapin at kozea.fr Sat Mar 3 22:03:47 2012 From: simon.sapin at kozea.fr (Simon Sapin) Date: Sat, 03 Mar 2012 22:03:47 +0100 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> Message-ID: <4F528733.6070103@kozea.fr> Le 03/03/2012 21:39, Guido van Rossum a ?crit : > Yeah, so it does make sense to standardize on a solution for this. Let > it be @positional(N). Is the N in positional(N) positional-only itself? ;) More seriously, is N required or can we omit it when all arguments are to be positional? Regards, -- Simon Sapin From g.brandl at gmx.net Sat Mar 3 22:29:27 2012 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 03 Mar 2012 22:29:27 +0100 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> Message-ID: On 03.03.2012 21:11, Terry Reedy wrote: > On 3/3/2012 12:54 PM, Guido van Rossum wrote: > >> Anyway, I now think that adding a built-in @positional(N) decorator >> makes the most sense since it doesn't require changes to the parser. >> The built-in can be implemented efficiently. This should be an easy >> patch for someone who wants to contribute some C code. > > Would you then be okay with using that in documentation? > > @positional(1) > ord(char) > Return the integer code for char I don't think that is a good idea. We currently put argument default values in the function signatures in Python syntax, but only because that also makes sense from a documentation PoV. We also wouldn't write @property name(self) just because that's (one) way for creating properties from Python. Georg (I am -0 on @positional myself: IMO such a completely different way of declaring positional-only and keyword-only arguments lacks grace.) From cs at zip.com.au Sat Mar 3 22:40:11 2012 From: cs at zip.com.au (Cameron Simpson) Date: Sun, 4 Mar 2012 08:40:11 +1100 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: Message-ID: <20120303214010.GA28795@cskk.homeip.net> On 03Mar2012 11:53, Serhiy Storchaka wrote: | 03.03.12 00:46, Guido van Rossum ???????(??): | > Alternative proposal: how about using '/' ? It's kind of the opposite | > of '*' which means "keyword argument", and '/' is not a new character. | | How about using '**' (and left '/' for some purpose in the future)? -1 from me; too much overlap with **kwargs keyword argument insertion in calls. We'd have ** in calls for keyword arguments and ** in definitions for not keyword arguments. -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ What if there were no hypothetical situations? 
- Jeff Sauder

From storchaka at gmail.com  Sat Mar  3 22:54:39 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 03 Mar 2012 23:54:39 +0200
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To: <20120303214010.GA28795@cskk.homeip.net>
References: <20120303214010.GA28795@cskk.homeip.net>
Message-ID: 

03.03.12 23:40, Cameron Simpson wrote:
> On 03Mar2012 11:53, Serhiy Storchaka wrote:
> | 03.03.12 00:46, Guido van Rossum wrote:
> |> Alternative proposal: how about using '/' ? It's kind of the opposite
> |> of '*' which means "keyword argument", and '/' is not a new character.
> |
> | How about using '**' (and left '/' for some purpose in the future)?
>
> -1 from me; too much overlap with **kwargs keyword argument insertion in
> calls. We'd have ** in calls for keyword arguments and ** in definitions
> for not keyword arguments.

"*identifier" is a tuple receiving any excess positional parameters.
"**identifier" is a dictionary receiving any excess keyword arguments.
Parameters after "*" are keyword-only parameters. By that symmetry,
parameters before "**" are positional-only parameters.

From cs at zip.com.au  Sat Mar  3 23:17:22 2012
From: cs at zip.com.au (Cameron Simpson)
Date: Sun, 4 Mar 2012 09:17:22 +1100
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To: 
References: 
Message-ID: <20120303221722.GA31871@cskk.homeip.net>

On 02Mar2012 14:01, Gregory P. Smith wrote:
| On Fri, Mar 2, 2012 at 12:00 PM, Guido van Rossum wrote:
| > I've written such decorators too, but they've got quite a bit of
| > overhead...
|
| yeah those fall into the gross hacks I alluded to in my original post. ;)
|
| I intentionally decided to leave out discussion of "should we allow
| positional-only arguments to be declared in Python" but it is a valid
| discussion and thing to consider...
|
| if we go that route, could it be possible to implement range([start=0, ]
| stop[, step=1]) such that they are positional only but multiple arguments
| are treated differently than strictly sequential, without writing
| conditional code in Python to figure out each one's meaning at runtime.

More excitingly, one could embed a slice in the hypothetical
positional-only syntax to say the 0th, 2nd and 4th parameters are
positional-only. Or an arbitrary sequence! Hmm, a callable that returns
an iterable at function call time!

Sorry, too much coffee...
--
Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/

Give me the luxuries of life and I will willingly do without the
necessities. - Frank Lloyd Wright

From ethan at stoneleaf.us  Sun Mar  4 00:10:39 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sat, 03 Mar 2012 15:10:39 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To: 
References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy>
Message-ID: <4F52A4EF.4080303@stoneleaf.us>

Georg Brandl wrote:
> On 03.03.2012 21:11, Terry Reedy wrote:
>> On 3/3/2012 12:54 PM, Guido van Rossum wrote:
>>
>>> Anyway, I now think that adding a built-in @positional(N) decorator
>>> makes the most sense since it doesn't require changes to the parser.
>>> The built-in can be implemented efficiently. This should be an easy
>>> patch for someone who wants to contribute some C code.
>>
>> Would you then be okay with using that in documentation?
>>
>> @positional(1)
>> ord(char)
>> Return the integer code for char
>
> I don't think that is a good idea.
> We currently put argument default values in the function signatures in
> Python syntax, but only because that also makes sense from a
> documentation PoV.
>
> We also wouldn't write
>
> @property
> name(self)
>
> just because that's (one) way for creating properties from Python.
>
> Georg
>
> (I am -0 on @positional myself: IMO such a completely different way
> of declaring positional-only and keyword-only arguments lacks grace.)

Also -0 for the same reason.

~Ethan~

From ron3200 at gmail.com  Sun Mar  4 01:15:48 2012
From: ron3200 at gmail.com (Ron Adam)
Date: Sat, 03 Mar 2012 18:15:48 -0600
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To: 
References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy>
Message-ID: <1330820148.30507.149.camel@Gutsy>

On Sat, 2012-03-03 at 09:54 -0800, Guido van Rossum wrote:
> On Sat, Mar 3, 2012 at 7:20 AM, Ron Adam wrote:
> > It seems to me that sometimes the writer of functions wish to have more
> > control of how the function is called, but I think it is better that the
> > user of a function can select the calling form that perhaps matches the
> > data and/or style they are using more closely.
>
> Um, the function author chooses the signature. If you disagree with
> that signature, tough luck.

Yes, except the caller does have the option to use the *args and/or
**kwds and other variations of packing and unpacking when it's
convenient. And yes, I realize it doesn't really change the signature
itself, but it is a different way to spell a function call, and it is
sometimes very helpful when the data can be matched to the signature.

For example, by not specifying any arguments as keyword only, or
position only, both of the following examples work, and the function
caller has these options.

>>> def foo(a, b):
...     return a, b
...
>>> kwds = dict(a=1, b=2)
>>> foo(**kwds)
(1, 2)
>>> args = (1, 2)
>>> foo(*args)
(1, 2)

But by specifying an argument as keyword only, you remove the *args
option. And if an argument is specified as position only, then the
**kwds spelling won't work.

I'm not suggesting there isn't sometimes a need for being more
specific, but that quite often it's nicer to let the caller have those
options rather than limit the API too narrowly.

> > I hope in the future that we find ways to simplify function signatures
> > in a way that make them both easier to use and more efficient for the
> > function user, rather than continue adding specific little tweaks that
> > give the function designer more control over how the function user calls
> > it.
>
> You seem to forget that API design is an art and that it is the
> function author's prerogative to design an API that minimizes mistakes
> for all users of the function. Sometimes that includes requiring that
> everyone uses positional arguments for a certain situation.

I was trying to make a more general point, which is why I preceded my
comments with "On a more general note..." (which was left out of the
reply).

Yes, it most definitely is an ART to create a good API. And also yes,
sometimes minimizing errors takes priority, especially when those
errors can be very costly.

It seems to me that binding an object by name is less likely to be
wrong than binding an object by its position. The advantage of 'by
position' is that many objects are stored in ordered lists, e.g.
[start, stop, step] or [x, y, z], so it's both easier and more
efficient to use the position rather than a name, especially if the
object can be *unpacked directly into the signature.
I just can't think of a good case where I would want to prohibit
setting an argument by name on purpose. But I suppose if I encountered
a certain error that may have been caught by doing so, I may think
about doing that.

Cheers,
    Ron

> Anyway, I now think that adding a built-in @positional(N) decorator
> makes the most sense since it doesn't require changes to the parser.
> The built-in can be implemented efficiently. This should be an easy
> patch for someone who wants to contribute some C code.

From guido at python.org  Sun Mar  4 01:46:29 2012
From: guido at python.org (Guido van Rossum)
Date: Sat, 3 Mar 2012 16:46:29 -0800
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To: <1330820148.30507.149.camel@Gutsy>
References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy>
	<1330820148.30507.149.camel@Gutsy>
Message-ID: 

On Sat, Mar 3, 2012 at 4:15 PM, Ron Adam wrote:
> I just can't think of a good case where I would want to prohibit setting
> an argument by name on purpose. But I suppose if I encountered a
> certain error that may have been caught by doing so, I may think about
> doing that.

Apparently you skipped part of the thread.

--
--Guido van Rossum (python.org/~guido)

From solipsis at pitrou.net  Sun Mar  4 01:55:02 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 4 Mar 2012 01:55:02 +0100
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy>
Message-ID: <20120304015502.76ac34fc@pitrou.net>

On Sat, 3 Mar 2012 09:54:03 -0800
Guido van Rossum wrote:
> > I hope in the future that we find ways to simplify function signatures
> > in a way that make them both easier to use and more efficient for the
> > function user, rather than continue adding specific little tweaks that
> > give the function designer more control over how the function user calls
> > it.
>
> You seem to forget that API design is an art and that it is the
> function author's prerogative to design an API that minimizes mistakes
> for all users of the function. Sometimes that includes requiring that
> everyone uses positional arguments for a certain situation.

Those situations are probably very rare. AFAIK we haven't seen anyone
mention a serious use case. I think concerns of built-in functions
shadowed by Python functions or the reverse are mostly academic, since
we don't see anyone complaining about dict-alikes accepting keyword
args.

(besides, what happened to "consenting adults"? :-))

> Anyway, I now think that adding a built-in @positional(N) decorator
> makes the most sense since it doesn't require changes to the parser.

-1 on a built-in for that. The functools module would probably be a
good recipient (assuming the decorator is useful at all, of course).

Regards

Antoine.
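For concreteness, here is roughly what the @positional(N) decorator
under debate could look like if prototyped in pure Python. This is a
sketch only: it peeks at CPython's code object, it adds the wrapper
overhead the proposed C built-in would avoid, and the error message
wording is invented.

    import functools

    def positional(n):
        """Reject keyword use of a function's first n parameters."""
        def decorator(func):
            # The first n parameter names, read off the code object.
            names = func.__code__.co_varnames[:n]
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                # Any of those names not already bound positionally
                # must not show up among the keyword arguments.
                for name in names[len(args):]:
                    if name in kwargs:
                        raise TypeError(
                            "%s() argument %r is positional only"
                            % (func.__name__, name))
                return func(*args, **kwargs)
            return wrapper
        return decorator

    @positional(2)
    def find(text, sub, start=0):
        return text.find(sub, start)

    find("spam", "a")        # -> 2
    find("spam", sub="a")    # TypeError: find() argument 'sub' is positional only

Note that, as Nick points out below, a wrapper of this kind cannot
handle the dict.update() case, where the positional-only names must
stay available for use as arbitrary **kwargs keys.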
From guido at python.org Sun Mar 4 03:18:52 2012 From: guido at python.org (Guido van Rossum) Date: Sat, 3 Mar 2012 18:18:52 -0800 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <20120304015502.76ac34fc@pitrou.net> References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> <20120304015502.76ac34fc@pitrou.net> Message-ID: On Sat, Mar 3, 2012 at 4:55 PM, Antoine Pitrou wrote: > On Sat, 3 Mar 2012 09:54:03 -0800 > Guido van Rossum wrote: >> >> > I hope in the future that we find ways to simplify function signatures >> > in a way that make them both easier to use and more efficient for the >> > function user, rather than continue adding specific little tweaks that >> > give the function designer more control over how the function user calls >> > it. >> >> You seem to forget that API design is an art and that it is the >> function author's prerogative to design an API that minimizes mistakes >> for all users of the function. Sometimes that includes requiring that >> everyone uses positional arguments for a certain situation. > > Those situations are probably very rare. AFAIK we haven't seen anyone > mention a serious use case. I think concerns of built-in functions > shadowed by Python functions or the reverse are mostly academic, since > we don't see anyone complaining about dict-alikes accepting keyword > args. No, because the base class's insistence on positional args makes it a non-starter to use keyword args. But APIs that are implemented in Python don't have this nudge. Given that some folks here have expressed a desire to use keyword args *everywhere*, which I consider going way overboard, as a readability and consistency advocate I want to be able to remind them strongly in some cases not to do that. > (besides, what happened to "consenting adults"? :-)) We used to say that about the lone star feature too. But it's turned out quite useful, both for readability (require callers to name the options they're passing in) and for allowing evolution of a signature by leaving the option to add another positional argument in the future. Some folks seem to believe that keywords are always better. I want to send a strong message that I disagree. >> Anyway, I now think that adding a built-in @positional(N) decorator >> makes the most sense since it doesn't require changes to the parser. > > -1 on a built-in for that. The functools module would probably be a > good recipient (assuming the decorator is useful at all, of course). TBH, I've never gotten the hang of functools. It seems mostly a refuge for things I don't like; so @positional() doesn't belong there. :-) -- --Guido van Rossum (python.org/~guido) From steve at pearwood.info Sun Mar 4 04:42:30 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 04 Mar 2012 14:42:30 +1100 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> <20120304015502.76ac34fc@pitrou.net> Message-ID: <4F52E4A6.30508@pearwood.info> Guido van Rossum wrote: > On Sat, Mar 3, 2012 at 4:55 PM, Antoine Pitrou wrote: >> On Sat, 3 Mar 2012 09:54:03 -0800 >> Guido van Rossum wrote: >>>> I hope in the future that we find ways to simplify function signatures >>>> in a way that make them both easier to use and more efficient for the >>>> function user, rather than continue adding specific little tweaks that >>>> give the function designer more control over how the function user calls >>>> it. 
>>> You seem to forget that API design is an art and that it is the >>> function author's prerogative to design an API that minimizes mistakes >>> for all users of the function. Sometimes that includes requiring that >>> everyone uses positional arguments for a certain situation. >> Those situations are probably very rare. AFAIK we haven't seen anyone >> mention a serious use case. I think concerns of built-in functions >> shadowed by Python functions or the reverse are mostly academic, since >> we don't see anyone complaining about dict-alikes accepting keyword >> args. > > No, because the base class's insistence on positional args makes it a > non-starter to use keyword args. > > But APIs that are implemented in Python don't have this nudge. Given > that some folks here have expressed a desire to use keyword args > *everywhere*, which I consider going way overboard, as a readability > and consistency advocate I want to be able to remind them strongly in > some cases not to do that. I think you're reading too much into what has been a pretty luke-warm response to Gregory's suggestion. As far as I can see, I've been the least negative about the idea, and that was a half-hearted +0. I described it as "nice to have" specifically on the basis of consistency, to minimize the differences between pure-Python functions and built-ins. On reflection, your argument about subclassing built-ins has convinced me to drop that to a -1: if we were designing Python from scratch, it would be a nice-to-have for built-ins to take named arguments, but since they don't, it would be too disruptive to add them en-mass. I'm still +1 on adding named arguments to built-ins where needed, e.g. >>> "spam".find('a', end=1) Traceback (most recent call last): File "", line 1, in TypeError: find() takes no keyword arguments but I hope that would be uncontroversial! >> (besides, what happened to "consenting adults"? :-)) > > We used to say that about the lone star feature too. But it's turned > out quite useful, both for readability (require callers to name the > options they're passing in) and for allowing evolution of a signature > by leaving the option to add another positional argument in the > future. > > Some folks seem to believe that keywords are always better. I want to > send a strong message that I disagree. Positional-only arguments should be considered a new feature that requires justification, not just to send a message against mass re-engineering of built-ins. Some arguments in favour: * Consistency with built-ins. * Functions that take a single argument don't need to be called by keyword. (But is that a reason to prohibit it?) * It gives the API developer another choice when designing their function API. Maybe some people just don't like keyword args. (But is "I don't like them" a good reason to prohibit them? I'd feel more positive about this argument if I knew a good use-case for designing a new function with positional-only arguments.) Arguments against: * YAGNI. The subclass issue is hardly new, and as far as I know, has never been an actual problem in practice. Since people aren't calling subclass methods using keywords, why try to enforce it? * It's another thing to learn about functions. "Does this function take keyword arguments or not?" So far, I'm +0 on this. >>> Anyway, I now think that adding a built-in @positional(N) decorator >>> makes the most sense since it doesn't require changes to the parser. >> -1 on a built-in for that. 
The functools module would probably be a >> good recipient (assuming the decorator is useful at all, of course). > > TBH, I've never gotten the hang of functools. It seems mostly a refuge > for things I don't like; so @positional() doesn't belong there. :-) What, you don't like @wraps? Astonishing! :-) -- Steven From ncoghlan at gmail.com Sun Mar 4 04:46:33 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 4 Mar 2012 13:46:33 +1000 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> Message-ID: On Sun, Mar 4, 2012 at 6:39 AM, Guido van Rossum wrote: > Yeah, so it does make sense to standardize on a solution for this. Let > it be @positional(N). Can you file an issue? How could that even work? Consider the following subset of the Mapping API: class C: @positional(2): def __init__(self, data=None, **kwds): self._stored_data = stored = {} if data is not None: stored.update(data) stored.update(kwds) @positional(2): def update(self, data=None, **kwds): stored = self._stored_data if data is not None: stored.update(data) stored.update(kwds) Without gross hacking of the function internals, there's no way for a decorator to make the following two calls work properly: x = C(self=5, data=10) x.update(self=10, data=5) Both will complain about duplicate "self" and "data" arguments, unless the "positional" decorator truly rips the function definition apart and creates a new one that alters how the interpreter maps arguments to parameters. As Simon Sapin pointed out, the most correct way to write such code currently is to accept *args and unpack it manually, which is indeed exactly how the Mapping ABC implementation currently works [1]. While the Mapping implementation doesn't currently use it, one simple way to write such code is to use a *second* parameter binding step like this: class C: def _unpack_args(self, data=None): return self, data def __init__(*args, **kwds): self, data = C._unpack_args(*args) self._stored_data = stored = {} if data: stored.update(data) stored.update(kwds) def update(*args, **kwds): self, data = C._unpack_args(*args) stored = self._stored_data if data is not None: stored.update(data) stored.update(kwds) The downside, of course, is that the error messages that come out of such a binding operation may be rather cryptic (which is why the Mapping ABC instead uses manual unpacking - so it can generate nice error messages) The difficulty of implementing the Mapping ABC correctly in pure Python is the poster child for why the lack of positional-only argument syntax is a language wart - we define APIs (in C) that work that way, which people then have to reconstruct manually in Python. My proposal is that we simply added a *third* alternative for "*args": a full function parameter specification to be used to bind the positional-only arguments. That is: 1. '*args' collects the additional positional arguments and places them in a tuple 2. '*' disallows any further positional arguments. 3. '*(SPEC)' binds the additional positional arguments according to the parameter specification. In all 3 cases, any subsequent parameter defintions are keyword only. The one restriction placed on the nested SPEC is that it would only allow "*args" at the end. The keyword only argument and positional only argument forms would not be allowed, since they would make no sense (as all arguments to the inner parameter binding operation are positional by design). 
Then the "_unpack_args" hack above would be unnecessary, and you could
just write:

    class C:

        def __init__(*(self, data=None), **kwds):
            self._stored_data = stored = {}
            if data:
                stored.update(data)
            stored.update(kwds)

        def update(*(self, data=None), **kwds):
            stored = self._stored_data
            if data is not None:
                stored.update(data)
            stored.update(kwds)

The objection was raised that this runs counter to the philosophy
behind PEP 3113 (which removed tuple unpacking from function
signatures). I disagree:

- this is not tuple unpacking, it is parameter binding
- it does not apply to arbitrary arguments, solely to the "extra
arguments" parameter, which is guaranteed to be a tuple
- it allows positional-only arguments to be clearly expressed in the
function signature, allowing the *interpreter* to deal with the
creation of nice error messages
- it *improves* introspection, since the binding of positional only
arguments is now expressed clearly in the function header (and
associated additional metadata on the function object), rather than
being hidden inside the function implementation

Regards,
Nick.

[1] http://hg.python.org/cpython/file/e67b3a9bd2dc/Lib/collections/abc.py#l511

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ron3200 at gmail.com  Sun Mar  4 05:16:29 2012
From: ron3200 at gmail.com (Ron Adam)
Date: Sat, 03 Mar 2012 22:16:29 -0600
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To: 
References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy>
	<1330820148.30507.149.camel@Gutsy>
Message-ID: <1330834589.31666.102.camel@Gutsy>

On Sat, 2012-03-03 at 16:46 -0800, Guido van Rossum wrote:
> On Sat, Mar 3, 2012 at 4:15 PM, Ron Adam wrote:
> > I just can't think of a good case where I would want to prohibit setting
> > an argument by name on purpose. But I suppose if I encountered a
> > certain error that may have been caught by doing so, I may think about
> > doing that.
>
> Apparently you skipped part of the thread.

Yep, I missed Nick's message where he points out...

> The other use case is APIs like the dict constructor and dict.update
> which are designed to accept arbitrary keyword arguments, so you don't
> want to reserve particular names in the calling argument namespace for
> your positional arguments.

>>> def dct(a, **kwds):
...     return a, kwds

>>> dct(42, a=2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: dct() got multiple values for keyword argument 'a'

Would the positional decorator fix this particular case? It seems like
it would work for forcing an error, but not for multiple values with
the same name.

The way to currently get around this is to use *args along with **kwds.

>>> def dct(*args, **kwds):
...     (n,) = args    # errors here if incorrect number of args.
...     return n, kwds
...
>>> dct(3, n=7, args=42, kwds='ok')
(3, {'args': 42, 'kwds': 'ok', 'n': 7})

The names used with '*' and '**' are already anonymous as far as the
dct signature is concerned, so you can use args or kwds as keywords
without a problem. I'm not sure what the positional decorator would
gain over this.

The other use case mentioned is the one where you point out overriding
an undocumented variable name. Seems like this is a question of whether
it is better to make Python behavior match the C builtins' behavior, or
to make the C builtins' behavior match Python behavior.
Cheers, Ron From aquavitae69 at gmail.com Sun Mar 4 08:16:32 2012 From: aquavitae69 at gmail.com (David Townshend) Date: Sun, 4 Mar 2012 09:16:32 +0200 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <1330834589.31666.102.camel@Gutsy> References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> <1330820148.30507.149.camel@Gutsy> <1330834589.31666.102.camel@Gutsy> Message-ID: On Sun, Mar 4, 2012 at 6:16 AM, Ron Adam wrote: > On Sat, 2012-03-03 at 16:46 -0800, Guido van Rossum wrote: > > On Sat, Mar 3, 2012 at 4:15 PM, Ron Adam wrote: > > > I just can't think of a good case where I would want to prohibit > setting > > > an argument by name on on purpose. But I suppose if I encountered a > > > certain error that may have been caught by doing so, I may think about > > > doing that. > > > > Apparently you skipped part of the thread. > > Yep, I missed Nicks message where he points out... > > > The other use case is APIs like the dict constructor and dict.update > > which are designed to accept arbitrary keyword arguments, so you don't > > want to reserve particular names in the calling argument namespace for > > your positional arguments. > > > >>> def dct(a, **kwds): > ... return a, kwds > > >>> dct(42, a=2) > Traceback (most recent call last): > File "", line 1, in > TypeError: dct() got multiple values for keyword argument 'a' > > Would the positional decorator fix this particular case? It seems like > it would work for forcing an error, but not for multiple values with the > same name. > > > The way to currently get around this is to use *args along with **kwds. > > >>> def dct(*args, **kwds): > ... (n,) = args # errors here if incorrect number of args. > ... return n, kwds > ... > >>> dct(3, n=7, args=42, kwds='ok') > (3, {'args': 42, 'kwds': 'ok', 'n': 7}) > > The names used with '*' and '**' are already anonymous as far as the foo > signature is concerned, so you can use args or kwds as keywords without > a problem. > > I'm not sure what the positional decorator would gains over this. > > > The other use case mentioned is the one where you point out overriding > an undocumented variable name. Seems like this is a question of weather > or not it is better to make python behavior match the C builtins > behavior, vs making the C builtins behavior match python behavior. > > > Cheers, > Ron > > > > > > > > > > > > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > There are two issues being discussed here: 1. A new syntax for positional-only arguments. I don't really see any good use case for this which can't already be dealt with quite easily using *args. True, it means a bit more work on the documentation, but is it really worth adding new syntax (or even a built-in decorator) just for that? 2. How to avoid the problems with name-binding to an intended positional only argument. Once again, this can be dealt with using *args. In both cases it would be nice to be able to avoid having to manually parse *args and **kwargs, but I haven't really seen anything better that the status quo for dealing with them. The only way I see this really working is to somehow bind positional-only arguments without binding each them to a specific name, and the only way I can think of to do that is to store them in a tuple. 
Perhaps, then, the syntax should reflect a C-style array: # pos(2) indicates 2 positional arguments def f(pos(2), arg1, *args, **kwargs): print(pos) print(arg1) print(args) print(kwargs) >>> f(1, 2, 'other', 'args', pos='arguments') (1, 2) 'other' ('args',) {'pos': 'arguments'} I'm +0 on the whole idea, but only if it deals with both issues. David -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sun Mar 4 15:00:59 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 4 Mar 2012 15:00:59 +0100 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> <20120304015502.76ac34fc@pitrou.net> <4F52E4A6.30508@pearwood.info> Message-ID: <20120304150059.761c8bb6@pitrou.net> On Sun, 04 Mar 2012 14:42:30 +1100 Steven D'Aprano wrote: > > >>> Anyway, I now think that adding a built-in @positional(N) decorator > >>> makes the most sense since it doesn't require changes to the parser. > >> -1 on a built-in for that. The functools module would probably be a > >> good recipient (assuming the decorator is useful at all, of course). > > > > TBH, I've never gotten the hang of functools. It seems mostly a refuge > > for things I don't like; so @positional() doesn't belong there. :-) > > What, you don't like @wraps? Astonishing! :-) @wraps is actually quite useful. functools contains other decorators such as @lru_cache. I think it's the right place for little-used things like @positional. Regards Antoine. From guido at python.org Sun Mar 4 17:23:23 2012 From: guido at python.org (Guido van Rossum) Date: Sun, 4 Mar 2012 08:23:23 -0800 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <4F52E4A6.30508@pearwood.info> References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> <20120304015502.76ac34fc@pitrou.net> <4F52E4A6.30508@pearwood.info> Message-ID: On Sat, Mar 3, 2012 at 7:42 PM, Steven D'Aprano wrote: > I'm still +1 on adding named arguments to built-ins where needed, e.g. > >>>> "spam".find('a', end=1) > Traceback (most recent call last): > ?File "", line 1, in > TypeError: find() takes no keyword arguments > > but I hope that would be uncontroversial! No, because str may be subclassed, so this change would break backwards compatibility for subclasses that chose a different name instead of 'end'. -- --Guido van Rossum (python.org/~guido) From aquavitae69 at gmail.com Sun Mar 4 17:24:27 2012 From: aquavitae69 at gmail.com (David Townshend) Date: Sun, 4 Mar 2012 18:24:27 +0200 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <4F538B99.8080607@stoneleaf.us> References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> <1330820148.30507.149.camel@Gutsy> <1330834589.31666.102.camel@Gutsy> <4F538B99.8080607@stoneleaf.us> Message-ID: On Sun, Mar 4, 2012 at 5:34 PM, Ethan Furman wrote: > David Townshend wrote: > >> There are two issues being discussed here: >> >> 1. A new syntax for positional-only arguments. I don't really see any >> good use case for this which can't already be dealt with quite easily using >> *args. True, it means a bit more work on the documentation, but is it >> really worth adding new syntax (or even a built-in decorator) just for that? >> > > The problem with *args is that it allows 0-N arguments, when we want, say, > 2. > > > > 2. How to avoid the problems with name-binding to an intended positional >> only argument. 
Once again, this can be dealt with using *args. >> > > Again, the problem is *args accepts a range of possible arguments. > > Agreed, but why not this? def f(*args, **kwargs): assert len(args) == 2 The exception raised could easily be a TypeError too, so it would appear, to a user, the same as defining it in the function signature. There are of course, other limitations (e.g. introspection), but without a specific use case it is difficult to know how important those limitations are. > > > In both cases it would be nice to be able to avoid having to manually >> parse *args and **kwargs, but I haven't really seen anything better that >> the status quo for dealing with them. The only way I see this really >> working is to somehow bind positional-only arguments without binding each >> them to a specific name, and the only way I can think of to do that is to >> store them in a tuple. Perhaps, then, the syntax should reflect a C-style >> array: >> >> # pos(2) indicates 2 positional arguments >> def f(pos(2), arg1, *args, **kwargs): >> print(pos) >> print(arg1) >> print(args) >> print(kwargs) >> > > Not good. The issue is not restricting the author from binding the > positional arguments to names, the issue is restricting the user from > binding the arguments to names -- but even then, the user (and the author!) > need to have those names apparent. > Why does the user need the names? If the user cannot use them as keywords, then it doesn't matter what the names are, so anything can be used in the documentation (which is the only place they would appear). The author doesn't need the names either, just the data they refer to, in this case a tuple. > > For example: > > str.split(pos(2)) > > Quick, what should be supplied for the two positional arguments? > Is that a function call or definition? My suggestion was to use the parenthesis in the definition, not the call. Since str.split only has optional arguments its not a good example, but if you were to redefine str.replace (in python) it would look like this: def replace(pos(3), count=None): self, old, new = pos ... It would still be called in the same way though: >>> 'some string'.replace('s', 't', 1) 'tome string' > We want the arguments bound to names *in the function* -- we don't want > the arguments bound to names *in the function call*. > That is what I proposed, I just suggested binding all of the positinoal-only arguments to a single name. Having given it a bit more thought, though, maybe it would be easier to optionally apply the parenthesis to *args: def replace(self, *args(2), count=None); old, new = args This looks much closer to current situation, and I suppose could be extended to **kwargs, limiting the number of keyword arguments (although I have no idea why this would ever be wanted!) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Sun Mar 4 16:34:49 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 04 Mar 2012 07:34:49 -0800 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> <1330820148.30507.149.camel@Gutsy> <1330834589.31666.102.camel@Gutsy> Message-ID: <4F538B99.8080607@stoneleaf.us> David Townshend wrote: > There are two issues being discussed here: > > 1. A new syntax for positional-only arguments. I don't really see any > good use case for this which can't already be dealt with quite easily > using *args. 
True, it means a bit more work on the documentation, but > is it really worth adding new syntax (or even a built-in decorator) just > for that? The problem with *args is that it allows 0-N arguments, when we want, say, 2. > 2. How to avoid the problems with name-binding to an intended > positional only argument. Once again, this can be dealt with using *args. Again, the problem is *args accepts a range of possible arguments. > In both cases it would be nice to be able to avoid having to manually > parse *args and **kwargs, but I haven't really seen anything better that > the status quo for dealing with them. The only way I see this really > working is to somehow bind positional-only arguments without binding > each them to a specific name, and the only way I can think of to do that > is to store them in a tuple. Perhaps, then, the syntax should reflect a > C-style array: > > # pos(2) indicates 2 positional arguments > def f(pos(2), arg1, *args, **kwargs): > print(pos) > print(arg1) > print(args) > print(kwargs) Not good. The issue is not restricting the author from binding the positional arguments to names, the issue is restricting the user from binding the arguments to names -- but even then, the user (and the author!) need to have those names apparent. For example: str.split(pos(2)) Quick, what should be supplied for the two positional arguments? We want the arguments bound to names *in the function* -- we don't want the arguments bound to names *in the function call*. ~Ethan~ From guido at python.org Sun Mar 4 17:37:27 2012 From: guido at python.org (Guido van Rossum) Date: Sun, 4 Mar 2012 08:37:27 -0800 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> Message-ID: On Sat, Mar 3, 2012 at 7:46 PM, Nick Coghlan wrote: > On Sun, Mar 4, 2012 at 6:39 AM, Guido van Rossum wrote: >> Yeah, so it does make sense to standardize on a solution for this. Let >> it be @positional(N). Can you file an issue? > > How could that even work? > > Consider the following subset of the Mapping API: > > ? ?class C: > > ? ? ? ?@positional(2): > ? ? ? ?def __init__(self, data=None, **kwds): > ? ? ? ? ? ?self._stored_data = stored = {} > ? ? ? ? ? ?if data is not None: > ? ? ? ? ? ? ? ?stored.update(data) > ? ? ? ? ? ?stored.update(kwds) > > ? ? ? ?@positional(2): > ? ? ? ?def update(self, data=None, **kwds): > ? ? ? ? ? ?stored = self._stored_data > ? ? ? ? ? ?if data is not None: > ? ? ? ? ? ? ? ?stored.update(data) > ? ? ? ? ? ?stored.update(kwds) > > Without gross hacking of the function internals, there's no way for a > decorator to make the following two calls work properly: > > ? ?x = C(self=5, data=10) > ? ?x.update(self=10, data=5) I am very well aware of this (it occurs in two different places in the NDB library that I've been developing for Google App Engine). But either I missed some messages in the thread (quite possible) or you're bringing this up for the first time now -- the @positional decorator wasn't meant to solve this case (which only occurs when **kwds is used in this particular way). *If* you want to solve this I agree that some actual new syntax is probably needed. > Both will complain about duplicate "self" and "data" arguments, unless > the "positional" decorator truly rips the function definition apart > and creates a new one that alters how the interpreter maps arguments > to parameters. 
> > As Simon Sapin pointed out, the most correct way to write such code > currently is to accept *args and unpack it manually, which is indeed > exactly how the Mapping ABC implementation currently works [1]. While > the Mapping implementation doesn't currently use it, one simple way to > write such code is to use a *second* parameter binding step like this: > > ? ?class C: > > ? ? ? ?def _unpack_args(self, data=None): > ? ? ? ? ? ?return self, data > > ? ? ? ?def __init__(*args, **kwds): > ? ? ? ? ? ?self, data = C._unpack_args(*args) > ? ? ? ? ? ?self._stored_data = stored = {} > ? ? ? ? ? ?if data: > ? ? ? ? ? ? ? ?stored.update(data) > ? ? ? ? ? ?stored.update(kwds) > > ? ? ? ?def update(*args, **kwds): > ? ? ? ? ? ?self, data = C._unpack_args(*args) > ? ? ? ? ? ?stored = self._stored_data > ? ? ? ? ? ?if data is not None: > ? ? ? ? ? ? ? ?stored.update(data) > ? ? ? ? ? ?stored.update(kwds) Nice; that idiom should be more widely known. > The downside, of course, is that the error messages that come out of > such a binding operation may be rather cryptic (which is why the > Mapping ABC instead uses manual unpacking - so it can generate nice > error messages) Still, a naming convention for the helper function can probably make this fairly painless -- perhaps you'll need a separate helper function for each API function, named in a systematic fashion. > The difficulty of implementing the Mapping ABC correctly in pure > Python is the poster child for why the lack of positional-only > argument syntax is a language wart - we define APIs (in C) that work > that way, which people then have to reconstruct manually in Python. Nobody else seems to have seen the importance of solving *this* particular issue directly in the function signature -- but I personally support trying! > My proposal is that we simply added a *third* alternative for "*args": > a full function parameter specification to be used to bind the > positional-only arguments. > > That is: > > 1. '*args' collects the additional positional arguments and places > them in a tuple > 2. '*' disallows any further positional arguments. > 3. '*(SPEC)' binds the additional positional arguments according to > the parameter specification. > > In all 3 cases, any subsequent parameter definitions are keyword only. > > The one restriction placed on the nested SPEC is that it would only > allow "*args" at the end. The keyword only argument and positional > only argument forms would not be allowed, since they would make no > sense (as all arguments to the inner parameter binding operation are > positional by design). > > Then the "_unpack_args" hack above would be unnecessary, and you could > just write: > > ? ?class C: > > ? ? ? ?def __init__(*(self, data=None), **kwds): > ? ? ? ? ? ?self._stored_data = stored = {} > ? ? ? ? ? ?if data: > ? ? ? ? ? ? ? ?stored.update(data) > ? ? ? ? ? ?stored.update(kwds) > > ? ? ? ?def update(*(self, data=None), **kwds): > ? ? ? ? ? ?stored = self._stored_data > ? ? ? ? ? ?if data is not None: > ? ? ? ? ? ? ? ?stored.update(data) > ? ? ? ? ? ?stored.update(kwds) > > The objection was raised that this runs counter to the philosophy > behind PEP 3113 (which removed tuple unpacking from function > signatures). 
I disagree: > - this is not tuple unpacking, it is parameter binding > - it does not apply to arbitrary arguments, solely to the "extra > arguments" parameter, which is guaranteed to be a tuple > - it allows positional-only arguments to be clearly expressed in the > function signature, allowing the *interpreter* to deal with the > creation of nice error messages > - it *improves* introspection, since the binding of positional only > arguments is now expressed clearly in the function header (and > associated additional metadata on the function object), rather than > being hidden inside the function implementation +1. This is certainly the most thorough solution for both problems at hand (simply requiring some parameters to be positional, and the specific issue when combining this with **kwds). > Regards, > Nick. > > [1] http://hg.python.org/cpython/file/e67b3a9bd2dc/Lib/collections/abc.py#l511 -- --Guido van Rossum (python.org/~guido) From solipsis at pitrou.net Sun Mar 4 17:57:58 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sun, 4 Mar 2012 17:57:58 +0100 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> Message-ID: <20120304175758.58459066@pitrou.net> On Sun, 4 Mar 2012 13:46:33 +1000 Nick Coghlan wrote: > > Then the "_unpack_args" hack above would be unnecessary, and you could > just write: > > class C: > > def __init__(*(self, data=None), **kwds): > self._stored_data = stored = {} > if data: > stored.update(data) > stored.update(kwds) > > def update(*(self, data=None), **kwds): > stored = self._stored_data > if data is not None: > stored.update(data) > stored.update(kwds) > > The objection was raised that this runs counter to the philosophy > behind PEP 3113 (which removed tuple unpacking from function > signatures). I disagree: > - this is not tuple unpacking, it is parameter binding > - it does not apply to arbitrary arguments, solely to the "extra > arguments" parameter, which is guaranteed to be a tuple > - it allows positional-only arguments to be clearly expressed in the > function signature, allowing the *interpreter* to deal with the > creation of nice error messages > - it *improves* introspection, since the binding of positional only > arguments is now expressed clearly in the function header (and > associated additional metadata on the function object), rather than > being hidden inside the function implementation Then please consider also re-introducing parameter tuple unpacking, since that was genuinely useful. Regards Antoine. From guido at python.org Sun Mar 4 18:15:52 2012 From: guido at python.org (Guido van Rossum) Date: Sun, 4 Mar 2012 09:15:52 -0800 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <20120304175758.58459066@pitrou.net> References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> <20120304175758.58459066@pitrou.net> Message-ID: On Sun, Mar 4, 2012 at 8:57 AM, Antoine Pitrou wrote: > On Sun, 4 Mar 2012 13:46:33 +1000 > Nick Coghlan wrote: >> >> Then the "_unpack_args" hack above would be unnecessary, and you could >> just write: >> >> ? ? class C: >> >> ? ? ? ? def __init__(*(self, data=None), **kwds): >> ? ? ? ? ? ? self._stored_data = stored = {} >> ? ? ? ? ? ? if data: >> ? ? ? ? ? ? ? ? stored.update(data) >> ? ? ? ? ? ? stored.update(kwds) >> >> ? ? ? ? def update(*(self, data=None), **kwds): >> ? ? ? ? ? ? stored = self._stored_data >> ? ? ? ? ? ? if data is not None: >> ? ? ? ? ? 
? ? ? stored.update(data) >> ? ? ? ? ? ? stored.update(kwds) >> >> The objection was raised that this runs counter to the philosophy >> behind PEP 3113 (which removed tuple unpacking from function >> signatures). I disagree: >> - this is not tuple unpacking, it is parameter binding >> - it does not apply to arbitrary arguments, solely to the "extra >> arguments" parameter, which is guaranteed to be a tuple >> - it allows positional-only arguments to be clearly expressed in the >> function signature, allowing the *interpreter* to deal with the >> creation of nice error messages >> - it *improves* introspection, since the binding of positional only >> arguments is now expressed clearly in the function header (and >> associated additional metadata on the function object), rather than >> being hidden inside the function implementation > > Then please consider also re-introducing parameter tuple unpacking, > since that was genuinely useful. That's debatable - reread PEP 3113. I added my +1 to Nick's proposal a little hastily, it should have been +0. I think that *if* we want to solve this, my '/' solution should also be on the table. It has the downside of not being obvious, but I don't think that Nick's proposal is all that obvious either to people who encounter it for the first time -- you have to combine a bunch of powerful ideas to "get" it. And the () inside () just *begs* for arbitrary nesting, which we don't want to reintroduce. We don't want this: def foo(*(*(*(a, b), c), d), e): ... :-) -- --Guido van Rossum (python.org/~guido) From ethan at stoneleaf.us Sun Mar 4 18:20:07 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 04 Mar 2012 09:20:07 -0800 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> Message-ID: <4F53A447.1010209@stoneleaf.us> Guido van Rossum wrote: > On Sat, Mar 3, 2012 at 7:46 PM, Nick Coghlan wrote: >> The difficulty of implementing the Mapping ABC correctly in pure >> Python is the poster child for why the lack of positional-only >> argument syntax is a language wart - we define APIs (in C) that work >> that way, which people then have to reconstruct manually in Python. > > Nobody else seems to have seen the importance of solving *this* > particular issue directly in the function signature -- but I > personally support trying! > >> My proposal is that we simply added a *third* alternative for "*args": >> a full function parameter specification to be used to bind the >> positional-only arguments. >> >> That is: >> >> 1. '*args' collects the additional positional arguments and places >> them in a tuple >> 2. '*' disallows any further positional arguments. >> 3. '*(SPEC)' binds the additional positional arguments according to >> the parameter specification. >> >> In all 3 cases, any subsequent parameter definitions are keyword only. >> >> The one restriction placed on the nested SPEC is that it would only >> allow "*args" at the end. The keyword only argument and positional >> only argument forms would not be allowed, since they would make no >> sense (as all arguments to the inner parameter binding operation are >> positional by design). I don't understand -- example of what's allowed and not allowed? > +1. This is certainly the most thorough solution for both problems at > hand (simply requiring some parameters to be positional, and the > specific issue when combining this with **kwds). 
So a (more or less) complete rundown would look like this: def foo(*(a, b)): ... # all positional def foo(*(a, b=1)): ... # all positional, b optional def foo(*(a, b), c, d): ... # a, b positional; c, d required and keyword def foo(*(a, b), c, d=1): ... # a, b positional; c required, d optional; c & d keyword def foo(*(a, b=1), c=1, d=1): ... # same, but b, c, d optional If I understand correctly, there is no way to have positional-only, position-or-keyword, and keyword-only in the same signature? From ethan at stoneleaf.us Sun Mar 4 18:05:07 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 04 Mar 2012 09:05:07 -0800 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <20120304175758.58459066@pitrou.net> References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> <20120304175758.58459066@pitrou.net> Message-ID: <4F53A0C3.3070708@stoneleaf.us> Antoine Pitrou wrote: > Then please consider also re-introducing parameter tuple unpacking, > since that was genuinely useful. It may have been useful, but my understanding is that it was removed because the complications in implementing it were greater, particularly where introspection was concerned. ~Ethan~ From guido at python.org Sun Mar 4 19:15:50 2012 From: guido at python.org (Guido van Rossum) Date: Sun, 4 Mar 2012 10:15:50 -0800 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <4F53A447.1010209@stoneleaf.us> References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> <4F53A447.1010209@stoneleaf.us> Message-ID: On Sun, Mar 4, 2012 at 9:20 AM, Ethan Furman wrote: > If I understand correctly, there is no way to have positional-only, > position-or-keyword, and keyword-only in the same signature? Heh. If that's true, my '/' proposal wins: def foo(pos_only, /, pos_or_kw, *, kw_only): ... Defaults can be added to taste. The restrictions on args-without-defaults being unable to follow args-with-defaults may need to be revisited so we can combine optional positional arguments with required keyword arguments, if we want to support that. Nevertheless all this is pretty esoteric and I wouldn't cry if it wasn't added. There exist solutions for the Mapping-API problem, and a @positional decorator would cover most other cases. -- --Guido van Rossum (python.org/~guido) From tjreedy at udel.edu Sun Mar 4 21:59:56 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 04 Mar 2012 15:59:56 -0500 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> Message-ID: On 3/3/2012 3:39 PM, Guido van Rossum wrote: > Yeah, so it does make sense to standardize on a solution for this Agreed. There are actually two issues. Doc: indicate intent, regardless of how enforced in code. Code: indicate intent to interpreter so it enforces intent rather than programmer doing do with *args, defaults if any, and error messages. > Let it be @positional(N). You seem to have backed off on that. I would like a solution for the docs that Georg can tolerate. > Can you file an issue? When you have settled on one thing for at least a day ;-). Until then, I think it is better to keep discussion in one place, which is here. --- The pos(n) idea does not work because position-only args may still have defaults. For instance, range() takes 1 to 3 args. That proposal did give me this idea: tag positional names with their index. 
In a sense, the index *is* the internal name, while the apparent
alphabetic name is suggestive for human understanding.

For doc purposes, the tag could be either a prefix or suffix. Either
way, it would be a convention that does not conflict with any stdlib
names that I know of.

range(start_0 = 0, stop_1, step_2 = 1) Return ...
range(0_start = 0, 1_stop, 2_step = 1) Return ...

For Python code, I presume the prefix form would be rejected by the
lexer. A possibility would be 'mangled' dunder names in the signature,
such as __0_start__, which would be stripped to 'start' for use in the
code.

If this idea makes '/' look better, fine with me.

-- 
Terry Jan Reedy

From greg.ewing at canterbury.ac.nz  Sun Mar  4 22:23:34 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 05 Mar 2012 10:23:34 +1300
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <20120303214010.GA28795@cskk.homeip.net>
Message-ID: <4F53DD56.1040006@canterbury.ac.nz>

Another bikeshed idea on positional-only parameters:

    def foo([self], a, b, *args, **kwds):
        ...

The square brackets are meant to suggest that the name is something
only of interest to the implementation of the function, and not to be
taken as part of the API.

-- 
Greg

From storchaka at gmail.com  Sun Mar  4 22:47:28 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sun, 04 Mar 2012 23:47:28 +0200
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To: <4F53DD56.1040006@canterbury.ac.nz>
References: <20120303214010.GA28795@cskk.homeip.net>
	<4F53DD56.1040006@canterbury.ac.nz>
Message-ID:

04.03.12 23:23, Greg Ewing wrote:
> Another bikeshed idea on positional-only parameters:
>
> def foo([self], a, b, *args, **kwds):
> ...
>
> The square brackets are meant to suggest that the name is
> something only of interest to the implementation of the function,
> and not to be taken as part of the API.

Or _name, as for "private" class and module members.

From storchaka at gmail.com  Sun Mar  4 23:03:41 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 05 Mar 2012 00:03:41 +0200
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To:
References: <4F508819.7000809@pearwood.info>
	<1330788013.30181.27.camel@Gutsy>
Message-ID:

04.03.12 22:59, Terry Reedy wrote:
> The pos(n) idea does not work because position-only args may still have
> defaults. For instance, range() takes 1 to 3 args. That proposal did
> give me this idea: tag positional names with their index. In a sense,
> the index *is* the internal name while apparent alphabetic name is
> suggestive for human understanding.
>
> For doc purposes, the tag could be either a prefix or suffix. Either
> way, it would be a convention that does not conflict with any stdlib
> names that I know of.

Extend this for function arguments:

"""
However, there is a convention that is followed by most Python code: a
name prefixed with an underscore (e.g. _spam) should be treated as a
non-public part of the API (whether it is a function, a method or a
data member). It should be considered an implementation detail and
subject to change without notice.
""" From steve at pearwood.info Sun Mar 4 23:28:15 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 05 Mar 2012 09:28:15 +1100 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <20120303214010.GA28795@cskk.homeip.net> <4F53DD56.1040006@canterbury.ac.nz> Message-ID: <4F53EC7F.4050309@pearwood.info> Serhiy Storchaka wrote: > 04.03.12 23:23, Greg Ewing ???????(??): >> Another bikeshed idea on positional-only parameters: >> >> def foo([self], a, b, *args, **kwds): >> ... >> >> The square brackets are meant to suggest that the name is >> something only of interest to the implementation of the function, >> and not to be taken as part of the API. Please do not give syntactic meaning to [parameter], unless it matches the existing convention for optional parameters. Besides, positional-only arguments are not only of interest to the implementation, they are part of the API. > Or _name, as for "private" class and module members. In my own functions, I use _name for private implementation arguments, and usually explicitly document that callers should not rely on them. In the implementation, sometimes I need to use that private argument, and I always do so by name so that it stands out that I'm using a special argument, e.g. something like: def func(spam, ham, cheese, _name=eggs): if condition: x = func(spam, ham, cheese, _name=beans) ... If you also overload _name to also mean "positional only", I would have to write x = func(spam, ham, cheese, beans) which looks like a "normal" argument. And as already mentioned, the use of _name to mean positional-only and private would clash with functions which want public positional-only. -- Steven From steve at pearwood.info Sun Mar 4 23:32:10 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 05 Mar 2012 09:32:10 +1100 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> <4F53A447.1010209@stoneleaf.us> Message-ID: <4F53ED6A.2050301@pearwood.info> Guido van Rossum wrote: > On Sun, Mar 4, 2012 at 9:20 AM, Ethan Furman wrote: >> If I understand correctly, there is no way to have positional-only, >> position-or-keyword, and keyword-only in the same signature? > > Heh. If that's true, my '/' proposal wins: > > def foo(pos_only, /, pos_or_kw, *, kw_only): ... > > Defaults can be added to taste. Now that I understand that / will only appear in at most one place, like * (and not following each and every positional-only arg) this is the nicest syntax I've seen yet. If we have to have this feature, +1 on this syntax. I'm still only +0 on the feature itself. -- Steven From fuzzyman at gmail.com Sun Mar 4 23:44:15 2012 From: fuzzyman at gmail.com (Michael Foord) Date: Sun, 4 Mar 2012 22:44:15 +0000 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <4F53ED6A.2050301@pearwood.info> References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> <4F53A447.1010209@stoneleaf.us> <4F53ED6A.2050301@pearwood.info> Message-ID: On 4 March 2012 22:32, Steven D'Aprano wrote: > Guido van Rossum wrote: > >> On Sun, Mar 4, 2012 at 9:20 AM, Ethan Furman wrote: >> >>> If I understand correctly, there is no way to have positional-only, >>> position-or-keyword, and keyword-only in the same signature? >>> >> >> Heh. If that's true, my '/' proposal wins: >> >> def foo(pos_only, /, pos_or_kw, *, kw_only): ... >> >> Defaults can be added to taste. 
>> > > Now that I understand that / will only appear in at most one place, like * > (and not following each and every positional-only arg) this is the nicest > syntax I've seen yet. > > If we have to have this feature, +1 on this syntax. > Agreed. However I've *never* wanted to create a "positional args only" parameter to an api that wasn't covered by *args. I also think that in *general* allowing keyword args is an improvement to APIs. So I guess I'm -0 on the feature, but the "/" syntax seems the best of the ones suggested so far. Michael > I'm still only +0 on the feature itself. > > > > -- > Steven > > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Mon Mar 5 00:32:38 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Mon, 05 Mar 2012 10:32:38 +1100 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> <20120304015502.76ac34fc@pitrou.net> <4F52E4A6.30508@pearwood.info> Message-ID: <4F53FB96.9090209@pearwood.info> Guido van Rossum wrote: > On Sat, Mar 3, 2012 at 7:42 PM, Steven D'Aprano wrote: >> I'm still +1 on adding named arguments to built-ins where needed, e.g. >> >>>>> "spam".find('a', end=1) >> Traceback (most recent call last): >> File "", line 1, in >> TypeError: find() takes no keyword arguments >> >> but I hope that would be uncontroversial! > > No, because str may be subclassed, so this change would break > backwards compatibility for subclasses that chose a different name > instead of 'end'. I think that's a genuine problem in theory. But is it a problem in practice? Since find('a', end=1) doesn't currently work, there won't be any code using it in practice. Even if a subclass looks like this: class MyString(str): def find(self, substring, beginning=0, last=None): ... internally MyString.find must be using positional arguments if it calls str.find, because keyword arguments don't currently work. So this suggested change will not break existing code. I can see one other objection to the change: if str.find accepts keywords, and MyString.find accepts *different* keywords, that is a violation of the Liskov Substitution Principle. Those who care about this would feel obliged to fix their code to match the argument names used by str.find, so if you call that mismatch "breaking backwards compatibility", I accept that. [Aside: in my experience, most programmers are unaware of Liskov, and accidentally or deliberately violate it frequently.] But given that str.find has been documented as "S.find(sub[, start[, end]])" forever, I don't have a lot of sympathy for anyone choosing different argument names. (I'm one of them. I'm sure I've written string subclasses that used s instead of sub.) I think that the practical benefit in clarity and readability in being able to write s.find('a', end=42) instead of s.find('a', 0, 42) outweighs the theoretical harm, but I will accept that there is a downside. 
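To make the subclassing concern concrete, here is a sketch of the kind
of subclass that would be affected (the subclass and its parameter
names are invented for illustration):

    class MyString(str):
        def find(self, sub, beginning=0, last=None):
            # This subclass chose its own names for the optional arguments.
            if last is None:
                last = len(self)
            return str.find(self, sub, beginning, last)

    s = MyString('spam')
    s.find('a', 0, 2)           # positional calls work against both classes
    s.find('a', beginning=1)    # and so do the subclass's own keywords
    # s.find('a', end=1)        # but code written against a keyword-accepting
                                # str.find would raise TypeError here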
-- Steven From ncoghlan at gmail.com Mon Mar 5 01:31:14 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 5 Mar 2012 10:31:14 +1000 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> <4F53A447.1010209@stoneleaf.us> Message-ID: On Mon, Mar 5, 2012 at 4:15 AM, Guido van Rossum wrote: > On Sun, Mar 4, 2012 at 9:20 AM, Ethan Furman wrote: >> If I understand correctly, there is no way to have positional-only, >> position-or-keyword, and keyword-only in the same signature? > > Heh. If that's true, my '/' proposal wins: > > def foo(pos_only, /, pos_or_kw, *, kw_only): ... > > Defaults can be added to taste. Yes, I only realised after Ethan's reply that my approach puts the "positional only" parameters in the wrong place relative to normal parameters (I didn't notice because I'm mainly interested in the Mapping use case and that doesn't accept any normal parameters - just positional only and arbitrary keywords). So, *if* syntactic support for positional-only arguments were added, I think Guido's syntax would be the way to do it. However, now that I've realised the "arbitrary keyword arguments" problem can be solved fairly cleanly by a helper function that binds the positional arguments, I'm more inclined to just leave it alone and tell people to just accept *args and process it that way. OTOH, having a docs-friendly syntax, and better programmatic introspection for the cases where it does come up would be nice, too... > The restrictions on args-without-defaults being unable to follow > args-with-defaults may need to be revisited so we can combine optional > positional arguments with required keyword arguments, if we want to > support that. Already done: >>> def f(a, b=1, *, c): return a, b, c ... >>> f(2, c=3) (2, 1, 3) > Nevertheless all this is pretty esoteric and I wouldn't cry if it > wasn't added. There exist solutions for the Mapping-API problem, and a > @positional decorator would cover most other cases. Yep. While I do think it's a slight language wart that we can't cleanly express all the function and method signatures that are used by our own builtins and ABC definitions, it's a *very* minor concern overall. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From cmjohnson.mailinglist at gmail.com Mon Mar 5 06:34:26 2012 From: cmjohnson.mailinglist at gmail.com (Carl M. Johnson) Date: Sun, 4 Mar 2012 19:34:26 -1000 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> Message-ID: On Mar 2, 2012, at 11:21 PM, Serhiy Storchaka wrote: > range([start,] stop[, step]) > slice([start,] stop[, step]) > itertools.islice(iterable, [start,] stop [, step]) > random.randrange([start,] stop[, step]) > syslog.syslog([priority,] message) > curses.newwin([nlines, ncols,] begin_y, begin_x) > curses.window.addch([y, x,] ch[, attr]) > curses.window.addnstr([y, x,] str, n[, attr]) > curses.window.addstr([y, x,] str[, attr]) > curses.window.chgat([y, x, ] [num,] attr) > curses.window.derwin([nlines, ncols,] begin_y, begin_x) > curses.window.hline([y, x,] ch, n) > curses.window.insch([y, x,] ch[, attr]) > curses.window.insnstr([y, x,] str, n [, attr]) > curses.window.subpad([nlines, ncols,] begin_y, begin_x) > curses.window.subwin([nlines, ncols,] begin_y, begin_x) > curses.window.vline([y, x,] ch, n) I think this use of brackets is really elegant. 
Not sure if it would work as syntax or not, but it's great at conveying intent. From ron3200 at gmail.com Mon Mar 5 07:55:37 2012 From: ron3200 at gmail.com (Ron Adam) Date: Mon, 05 Mar 2012 00:55:37 -0600 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> <4F53A447.1010209@stoneleaf.us> Message-ID: <1330930537.2815.132.camel@Gutsy> On Mon, 2012-03-05 at 10:31 +1000, Nick Coghlan wrote: > So, *if* syntactic support for positional-only arguments were added, I > think Guido's syntax would be the way to do it. However, now that I've > realised the "arbitrary keyword arguments" problem can be solved > fairly cleanly by a helper function that binds the positional > arguments, I'm more inclined to just leave it alone and tell people to > just accept *args and process it that way. Yeah, I think I agree with that for now. I feel signatures are already pretty complex. If we found a solution that worked while simplifying or merging some of that complexity in a nice way, I'd be +1. Your suggested syntax was leaning in that direction I think. I liked that there was a possibly a clearer separation between positional arguments and keywords arguments. If you look at the C code that parses signatures, it's not simple or straight forward. It's not easy to write a python function that maps (*args, **kwds) to a signature in the same way. I tried it to test out some ideas a while back. What makes it difficult is some of the "*args" values can be keyword arguments assigned by position. Or some values in "**kwds" values may be positional arguments assigned by name. I think the order not being preserved in kwds was also a factor. While in the simple cases, it's fairly easy to mentally parse a signature, the mental slope gets steeper as you start to combine the different concepts into one signature. Maybe it's easy for those who do it every day, but not as easy for those doing it less often, or for those who are just beginning to learn python. > OTOH, having a docs-friendly syntax, and better programmatic > introspection for the cases where it does come up would be nice, > too... When I was looking at your syntax I was thinking of it like this... def foo(*(a, b=2)->args, **(c=3)->kwds): ... return args, kwds Which would map the positional only arguments to args, and the rest to kwds and include the default values as well. But that's a different thing. That wouldn't be friendly to duck typing because functions in the front of the chain should be not care what the signature of the function at the end of the chain will be. It would be limited to functions (ducks) who's signatures are compatible with each other. The (*args, **kwds) signature only corresponds to positional and keywords arguments in the simple cases where no positional (only) argument has a default value, and no keyword arguments are assigned by position. As far as better docs-friendly syntax, and introspection are concerned, I think we will need a signature object that can be introspected. It also might be helpful in evaluating ideas like these. 
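For what it's worth, the @positional(N) decorator discussed earlier in
the thread can be prototyped today. A minimal sketch (it ignores
keyword-only parameters and other edge cases, and the error message is
invented):

    import functools

    def positional(n):
        """Reject keyword use of a function's first n parameters."""
        def decorate(func):
            names = func.__code__.co_varnames[:n]
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                # Any of the first n names passed by keyword is an error.
                for name in names[len(args):]:
                    if name in kwargs:
                        raise TypeError(
                            '%s() takes parameter %r by position only'
                            % (func.__name__, name))
                return func(*args, **kwargs)
            return wrapper
        return decorate

    @positional(2)
    def replace(self, old, new='', count=None):
        return self, old, new, count

    replace('a', 'b', new='c')   # fine
    # replace('a', old='b')      # TypeError: old is positional only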
Cheers,
Ron

From storchaka at gmail.com  Mon Mar  5 09:27:28 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Mon, 05 Mar 2012 10:27:28 +0200
Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706
In-Reply-To: <4F53EC7F.4050309@pearwood.info>
References: <20120303214010.GA28795@cskk.homeip.net>
	<4F53DD56.1040006@canterbury.ac.nz>
	<4F53EC7F.4050309@pearwood.info>
Message-ID:

05.03.12 00:28, Steven D'Aprano wrote:
> Serhiy Storchaka wrote:
>> Or _name, as for "private" class and module members.
>
> In my own functions, I use _name for private implementation arguments,
> and usually explicitly document that callers should not rely on them. In
> the implementation, sometimes I need to use that private argument, and I
> always do so by name so that it stands out that I'm using a special
> argument, e.g. something like:
>
> def func(spam, ham, cheese, _name=eggs):
>     if condition:
>         x = func(spam, ham, cheese, _name=beans)
>     ...
>
> If you also overload _name to also mean "positional only", I would have
> to write
>
> x = func(spam, ham, cheese, beans)
>
> which looks like a "normal" argument.
>
> And as already mentioned, the use of _name to mean positional-only and
> private would clash with functions which want public positional-only.

There is no clash. Both uses mean the same thing -- you advise the
client not to use this name as a keyword argument. Of course, the
client still may, but need not, and won't if unaware.

'spam'.find('a', _end=1) looks terrible and no one will inadvertently
use it. Or use a double underscore as reinforcement, to prevent the use
of this name:

def find(self, sub, __start=0, __end=None)

From jimjjewett at gmail.com  Mon Mar  5 17:44:37 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Mon, 5 Mar 2012 11:44:37 -0500
Subject: [Python-ideas] revisit pep 377: good use case?
In-Reply-To:
References: <6264a70c-d8cf-49a1-85da-d7d3b40343c3@me.com>
	<4F4EBC03.8050609@pearwood.info>
	<08D38B07-A393-4895-A0A6-F47EDB82B812@me.com>
	<212AF954-846A-47A0-BE20-6432625C8817@me.com>
Message-ID:

On Thu, Mar 1, 2012 at 11:39 AM, Jim Jewett wrote:
> Is it just that you want the stuff inside both your function and your
> context manager/decorator to have access to the same locals, and don't
> want to use a closure and/or pass around a dict?

It turned out that one important piece was that what a user considered
a function was too large a chunk for appropriate caching.

That is an argument for a suite decorator, so as to avoid boilerplate
code around each call. I'm not sure how *strong* the argument is,
though, because at least in his particular case, the cachable parts
are sufficiently similar that they can be wrapped inside calls to a
"service provider", and the boilerplate can *probably* be moved there.

-jJ

From barry at python.org  Mon Mar  5 20:40:58 2012
From: barry at python.org (Barry Warsaw)
Date: Mon, 5 Mar 2012 14:40:58 -0500
Subject: [Python-ideas] doctest
References: <4F5042F3.8000507@pearwood.info>
	<4F505C79.4060501@pearwood.info>
	<4F50DFF0.8040400@pearwood.info>
	<20120302132754.5dfcad6e@bhuda.mired.org>
Message-ID: <20120305144058.5ca202d8@resist.wooz.org>

On Mar 02, 2012, at 01:27 PM, Mike Meyer wrote:

>So the question is - why isn't dealing with this the responsibility of
>the test writer? Yeah, it's not quite the spirit of documentation to
>turn a dictionary into a sorted list in the output, but neither is
>littering the documentation with +LITERAL_EVAL and the like.

Yeah, it basically is.
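For example, the kind of normalization being talked about (a sketch of
the sort/loop/print idiom mentioned below):

    >>> d = {'spam': 1, 'ham': 2, 'eggs': 3}
    >>> for key in sorted(d):
    ...     print(key, d[key])
    eggs 3
    ham 2
    spam 1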
My contention, based on years of experience (though of course YMMV) is that the best way to solve this is to write better documentation. I personally don't find bare dict prints to be very useful or readable, but a nice little sort/loop/print works fine. -Barry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: not available URL: From greg.ewing at canterbury.ac.nz Mon Mar 5 22:30:47 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 06 Mar 2012 10:30:47 +1300 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <4F53EC7F.4050309@pearwood.info> References: <20120303214010.GA28795@cskk.homeip.net> <4F53DD56.1040006@canterbury.ac.nz> <4F53EC7F.4050309@pearwood.info> Message-ID: <4F553087.50703@canterbury.ac.nz> Steven D'Aprano wrote: > Please do not give syntactic meaning to [parameter], unless it matches > the existing convention for optional parameters. Why should it have to do that? We already have a syntax for optional parameters, and there is no reason for a reader to think that a new piece of syntax is simply duplicating existing functionality. > Besides, positional-only arguments are not only of interest to the > implementation, they are part of the API. The fact that a parameter exists in that slot is part of the API, but the *name* of it is not. This is reflected in the fact that the comma is outside the brackets and the name is inside. -- Greg From greg.ewing at canterbury.ac.nz Mon Mar 5 22:35:13 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 06 Mar 2012 10:35:13 +1300 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <4F53ED6A.2050301@pearwood.info> References: <4F508819.7000809@pearwood.info> <1330788013.30181.27.camel@Gutsy> <4F53A447.1010209@stoneleaf.us> <4F53ED6A.2050301@pearwood.info> Message-ID: <4F553191.2000309@canterbury.ac.nz> Steven D'Aprano wrote: > Now that I understand that / will only appear in at most one place, like > * (and not following each and every positional-only arg) this is the > nicest syntax I've seen yet. It's reasonably nice, but I'm not sure about giving the '/' its own slot with commas either side. This works for * and ** because they (optionally now in the case of *) take a name after them, but the '/' never will. So how about treating '/' as a separator instead: def foo(pos1, pos2 / arg3, arg4, *args, **kwds): -- Greg From greg.ewing at canterbury.ac.nz Mon Mar 5 22:53:27 2012 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 06 Mar 2012 10:53:27 +1300 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <4F553087.50703@canterbury.ac.nz> References: <20120303214010.GA28795@cskk.homeip.net> <4F53DD56.1040006@canterbury.ac.nz> <4F53EC7F.4050309@pearwood.info> <4F553087.50703@canterbury.ac.nz> Message-ID: <4F5535D7.7060600@canterbury.ac.nz> I wrote: > Steven D'Aprano wrote: > >> Please do not give syntactic meaning to [parameter], unless it matches >> the existing convention for optional parameters. > > Why should it have to do that? Sorry, I just realised what you meant by that -- you weren't talking about Python syntax, but rather *metasyntax* used in documentation. You have a point there. 
-- Greg From ethan at stoneleaf.us Mon Mar 5 23:23:16 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 05 Mar 2012 14:23:16 -0800 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <4F553191.2000309@canterbury.ac.nz> References: <1330788013.30181.27.camel@Gutsy> <4F53A447.1010209@stoneleaf.us> <4F53ED6A.2050301@pearwood.info> <4F553191.2000309@canterbury.ac.nz> Message-ID: <4F553CD4.9060806@stoneleaf.us> Greg Ewing wrote: > Steven D'Aprano wrote: > >> Now that I understand that / will only appear in at most one place, >> like * (and not following each and every positional-only arg) this is >> the nicest syntax I've seen yet. > > It's reasonably nice, but I'm not sure about giving the '/' its > own slot with commas either side. This works for * and ** because > they (optionally now in the case of *) take a name after them, > but the '/' never will. > > So how about treating '/' as a separator instead: > > def foo(pos1, pos2 / arg3, arg4, *args, **kwds): Looks a lot like division to me. Plus you then have the signature different from the call (as it *would* be division if you tried to use it as a separator when calling it). Unless we have a good reason to treat it differently from a lone '*', I think we should be consistent and treat it the same. (I obviously don't think the lack of a name is a good enough reason to be inconsistent. ;) ~Ethan~ From anikom15 at gmail.com Tue Mar 6 00:01:46 2012 From: anikom15 at gmail.com (Westley =?iso-8859-1?Q?Mart=EDnez?=) Date: Mon, 5 Mar 2012 15:01:46 -0800 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: References: Message-ID: <20120305230146.GA811@kubrick> I'm -1 on this issue after some thought: I think we need to look at this from the function user's perspective. For example let's take this hypothetical declaration: def func(a, b, /, x, y, *, name, color): This function may be called like this: func(v1, v2) func(v1, v2, v3, v4) func(v1, v2, y=v4, x=v3) func(v1, v2, x=v3, y=v4) func(v1, v2, v3, v4, name='westley', color='0xffffff') func(v1, v2, name='westley', color='0xffffff', x=v3, y=v4) func(v1, name='westley', color='0xffffff', x=v3, y=v4, v2) # ERROR! To me, this just feels a little too ... mutable. In C we have one way to call functions that is equal to it's function declaration. I'd be +1 for functions that have ONLY non-keyword arguments which would be declared via decorator: @positional # This name is a bit ambiguous I guess.... def func(a, b) From steve at pearwood.info Tue Mar 6 01:00:31 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 06 Mar 2012 11:00:31 +1100 Subject: [Python-ideas] keyword arguments everywhere (stdlib) - issue8706 In-Reply-To: <4F553087.50703@canterbury.ac.nz> References: <20120303214010.GA28795@cskk.homeip.net> <4F53DD56.1040006@canterbury.ac.nz> <4F53EC7F.4050309@pearwood.info> <4F553087.50703@canterbury.ac.nz> Message-ID: <4F55539F.9090900@pearwood.info> Greg Ewing wrote: > Steven D'Aprano wrote: > >> Please do not give syntactic meaning to [parameter], unless it matches >> the existing convention for optional parameters. > > Why should it have to do that? We already have a syntax for > optional parameters, and there is no reason for a reader to > think that a new piece of syntax is simply duplicating existing > functionality. 
I see your later comment about metasyntax, but to clarify in case there is still any lingering doubt what I mean: When reading function signatures in *documentation*, we often see func([parameter]) used to indicate that parameter is an optional argument. If your proposal is enacted, when reading function signatures in *code*, we will see func([parameter]) used to indicate that you can't use parameter as a keyword argument. The two uses clash, which means that every time we see a [parameter] in a function signature, there will be a moment of disorientation where we have to decide whether it should be interpreted using the convention for code or the convention for documentation. Certainly there are ways of distinguishing the two from context, particularly if the default value is shown in the signature, or from subtleties of whether the comma is inside the brackets or not, or perhaps from the order ([] early in the signature probably means positional, [] at the end probably means optional). My point is not that it is impossible to distinguish optional from positional arguments, but that the similarity of syntax makes it difficult to distinguish the two *at a glance* and comprehensibility will be reduced. And heaven help us if we have a function like range with positional-only optional parameters: range([[start]=0,] [end] [, [step]=1]) --> iterable For the avoidance of doubt, I am *not* suggesting that we introduce [parameter] as syntax for optional arguments. I don't want to see [] used syntactically inside def func(...) at all, except for the VERY rare use of lists as mutable default arguments. -- Steven From techtonik at gmail.com Thu Mar 8 09:04:38 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 8 Mar 2012 11:04:38 +0300 Subject: [Python-ideas] "Submit your hash" - stats for stdlib Message-ID: While writing the answer for the stackoverflow question about cross-language standard libraries, I realized that there is not enough statistics about module/function usage in Python. If there could be an utility that just walks through your project and gathers stats about calls from stdlib, then this utility can as well submit those stats to some stats.python.org endpoint for summarizing. The stats about stdlib module usage/function calls can be combined into reversible hashes for easy posting through at URL form at the site. Hashes are also useful to see if some big subtree is actually popular library, such as Django, to avoid duplicating the stats. If every installed package is run through such tool, then it might be even possible to detect the versions of libraries used regardless of metadata location (although MD5 seems a better solution in this case). 1. http://stackoverflow.com/questions/7654826/cross-language-standard-libraries -- anatoly t. From chris at simplistix.co.uk Thu Mar 8 18:55:35 2012 From: chris at simplistix.co.uk (Chris Withers) Date: Thu, 08 Mar 2012 09:55:35 -0800 Subject: [Python-ideas] import fallback syntax Message-ID: <4F58F297.9050507@simplistix.co.uk> Hi All, I see a lot of Python like this: try: from cDecimal import Decimal except ImportError: from decimal import Decimal ...and nest deeper if you're talking about ElementTree. How about some syntactical sugar to make this less obnoxious: from cDecimal or decimal import Decimal from x.y.z import X as X or a.b.z import Y as X or what do people think? 
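For comparison, the proposed sugar can be spelled as a small helper
today (a sketch; import_first is a made-up name, not an existing API):

    import importlib

    def import_first(*names):
        # Return the first of the named modules that imports successfully.
        for name in names:
            try:
                return importlib.import_module(name)
            except ImportError:
                pass
        raise ImportError('none of %r could be imported' % (names,))

    etree = import_first('lxml.etree',
                         'xml.etree.cElementTree',
                         'xml.etree.ElementTree')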
Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From guido at python.org Thu Mar 8 19:03:34 2012 From: guido at python.org (Guido van Rossum) Date: Thu, 8 Mar 2012 10:03:34 -0800 Subject: [Python-ideas] import fallback syntax In-Reply-To: <4F58F297.9050507@simplistix.co.uk> References: <4F58F297.9050507@simplistix.co.uk> Message-ID: We used to have a lot more of these. The agreed-upon solution is to have this hidden inside (for instance) the decimal module. E.g. if you import heapq, you get heapq.py which attempts to import _heapq but gives you the Python implementation if that fails. --Guido On Thu, Mar 8, 2012 at 9:55 AM, Chris Withers wrote: > Hi All, > > I see a lot of Python like this: > > try: > ?from cDecimal import Decimal > except ImportError: > ?from decimal import Decimal > > ...and nest deeper if you're talking about ElementTree. > > How about some syntactical sugar to make this less obnoxious: > > from cDecimal or decimal import Decimal > > from x.y.z import X as X or > ? ? a.b.z import Y as X or > > what do people think? > > Chris > > -- > Simplistix - Content Management, Batch Processing & Python Consulting > ? ? ? ? ? ?- http://www.simplistix.co.uk > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- --Guido van Rossum (python.org/~guido) From mikegraham at gmail.com Thu Mar 8 19:08:01 2012 From: mikegraham at gmail.com (Mike Graham) Date: Thu, 8 Mar 2012 13:08:01 -0500 Subject: [Python-ideas] import fallback syntax In-Reply-To: <4F58F297.9050507@simplistix.co.uk> References: <4F58F297.9050507@simplistix.co.uk> Message-ID: On Thu, Mar 8, 2012 at 12:55 PM, Chris Withers wrote: > > I see a lot of Python like this: > > try: > ?from cDecimal import Decimal > except ImportError: > ?from decimal import Decimal > > ...and nest deeper if you're talking about ElementTree. > > How about some syntactical sugar to make this less obnoxious: > > from cDecimal or decimal import Decimal > > from x.y.z import X as X or > ? ? a.b.z import Y as X or > > what do people think? > > Chris It's worthy of note that the `import cFoo as foo` pattern has given way to implementing foo to try to defer to cFoo on import. The other situation--a dropin replacement potentially-third-party module--like the various etree implementations is somewhat less that compelling since a) these are rare and a few lines to do it doesn't hurt and b) they're usually lies--a lot of use-lxml-but-fallback-on-xml.etree import programs out there would fail if lxml wasn't present anyhow. If something like this is introduced, we should consider an "import foo or None" kind of case (maybe not written exactly like that) to replace the conditional-dependency pattern of try: import foo except ImportError: foo = None Mike From chris at simplistix.co.uk Thu Mar 8 19:14:36 2012 From: chris at simplistix.co.uk (Chris Withers) Date: Thu, 08 Mar 2012 10:14:36 -0800 Subject: [Python-ideas] import fallback syntax In-Reply-To: References: <4F58F297.9050507@simplistix.co.uk> Message-ID: <4F58F70C.70600@simplistix.co.uk> On 08/03/2012 10:03, Guido van Rossum wrote: > We used to have a lot more of these. The agreed-upon solution is to > have this hidden inside (for instance) the decimal module. E.g. if you > import heapq, you get heapq.py which attempts to import _heapq but > gives you the Python implementation if that fails. 
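The stdlib pattern being described looks roughly like this at the
bottom of the pure-Python module (a simplified sketch, not the actual
heapq.py source):

    def heappush(heap, item):
        # Pure Python fallback (simplified here; the real code sifts
        # rather than sorting).
        heap.append(item)
        heap.sort()

    # At module bottom, silently prefer the C accelerator if present:
    try:
        from _heapq import *
    except ImportError:
        pass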
Okay, but I'm thinking about the situation where there are genuinely
several possible import sources, with no clear order of fallbacks and
where not all of the packages know about each other. The ElementTree
example is currently a good one...

This seems particularly relevant with things like argparse, unittest2,
distutils2, packaging...

I suspect it'll also be relevant where code uses libraries whose
authors have chosen differently named packages for Python 2 and 3.

It would also be extremely relevant if any of the misguided
"provisional packages in stdlib" PEPs make it through...

Chris

-- 
Simplistix - Content Management, Batch Processing & Python Consulting
            - http://www.simplistix.co.uk

From anacrolix at gmail.com  Thu Mar  8 19:15:32 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Fri, 9 Mar 2012 02:15:32 +0800
Subject: [Python-ideas] import fallback syntax
In-Reply-To:
References: <4F58F297.9050507@simplistix.co.uk>
Message-ID:

from foo or bar import quux else None

:P

From anacrolix at gmail.com  Thu Mar  8 19:52:08 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Fri, 9 Mar 2012 02:52:08 +0800
Subject: [Python-ideas] Loop labels
Message-ID:

Several languages allow break and continue statements to take a count
or loop label for greater flexibility.

Dealing with nested loops and the control flow around them is something
I, and probably most programmers, deal with every day. Generally, for
very complex control flow, one might employ functions and use return
statements to work around any shortcomings. This is not always ideal in
CPython because of the performance cost of function calls and the lack
of anonymous functions. The other workaround is usually goto
statements, which clearly aren't available or appropriate in Python.

So what about the following extensions? Allow for and while statements
to be labelled using "as". Allow break and continue to take the name of
a containing loop, or an integer. The corresponding named loop, or the
nth containing loop, is treated as though it were the nearest enclosing
loop.

loop_label ::= identifier

break_stmt ::= "break" [decimalinteger | loop_label]

continue_stmt ::= "continue" [decimalinteger | loop_label]

while_stmt ::= "while" expression ["as" loop_label] ":" suite
               ["else" ":" suite]

for_stmt ::= "for" target_list "in" expression_list ["as" loop_label]
             ":" suite ["else" ":" suite]

Here's a really naive example:

    for a in b as c:
        for d in e:
            break c  # or break 2
        for f in g:
            continue c  # or continue 2

From simon.sapin at kozea.fr  Thu Mar  8 19:57:54 2012
From: simon.sapin at kozea.fr (Simon Sapin)
Date: Thu, 08 Mar 2012 19:57:54 +0100
Subject: [Python-ideas] Loop labels
In-Reply-To:
References:
Message-ID: <4F590132.3090409@kozea.fr>

Le 08/03/2012 19:52, Matt Joiner wrote:
> Allow for and while statements to be labelled using "as".

I like the idea. But regardless of whether it should be added at all, I
see one issue with the proposition in its current form.

The with statement has a variable name after its own "as" keyword. Do
loop labels live in the same namespace as variables? Or can I have a
variable with the same name as a loop label without the two
interfering?
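For reference, the workarounds that labelled break/continue would
replace are usually spelled with an exception or a flag today; a sketch
of the exception form:

    class Found(Exception):
        pass

    grid = [[1, 2], [3, 4]]
    try:
        for row in grid:            # the loop a label would name
            for value in row:
                if value == 3:
                    raise Found     # stands in for "break 2"
    except Found:
        print('found 3')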
Regards,
-- 
Simon Sapin

From pjenvey at underboss.org  Thu Mar  8 20:06:45 2012
From: pjenvey at underboss.org (Philip Jenvey)
Date: Thu, 8 Mar 2012 11:06:45 -0800
Subject: [Python-ideas] import fallback syntax
In-Reply-To: <4F58F297.9050507@simplistix.co.uk>
References: <4F58F297.9050507@simplistix.co.uk>
Message-ID:

On Mar 8, 2012, at 9:55 AM, Chris Withers wrote:

> Hi All,
>
> I see a lot of Python like this:
>
> try:
>     from cDecimal import Decimal
> except ImportError:
>     from decimal import Decimal
>
> ...and nest deeper if you're talking about ElementTree.
>
> How about some syntactical sugar to make this less obnoxious:
>
> from cDecimal or decimal import Decimal
>
> from x.y.z import X as X or
>      a.b.z import Y as X or
>
> what do people think?

It's been proposed before, you might want to read this old thread:

http://mail.python.org/pipermail/python-dev/2008-January/075788.html

-- 
Philip Jenvey

From anacrolix at gmail.com  Thu Mar  8 20:14:12 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Fri, 9 Mar 2012 03:14:12 +0800
Subject: [Python-ideas] Loop labels
In-Reply-To: <4F590132.3090409@kozea.fr>
References: <4F590132.3090409@kozea.fr>
Message-ID:

On Mar 9, 2012 3:04 AM, "Simon Sapin" wrote:
>
> Le 08/03/2012 19:52, Matt Joiner wrote:
>
>> Allow for and while statements to be labelled using "as".
>
> I like the idea. But regardless of whether it should be added at all,
> I see one issue with the proposition in its current form.
>
> The with statement has a variable name after its own "as" keyword. Do
> loop labels live in the same namespace as variables? Or can I have a
> variable with the same name as a loop label without the two
> interfering?

The with statement is not a loop statement and would not be extended,
so this is not an issue.

It would be nonsensical to assign to loop labels, so I figure there
would be some non-trivial checks done when a function is compiled. You
raise an interesting point.

I neglected to mention that nested looping often requires a lot of
undesirable temporaries and additional state checks. Avoiding these is
a very common goal, and one the proposed constructs would serve.

From chris at simplistix.co.uk  Thu Mar  8 20:16:32 2012
From: chris at simplistix.co.uk (Chris Withers)
Date: Thu, 08 Mar 2012 11:16:32 -0800
Subject: [Python-ideas] import fallback syntax
In-Reply-To:
References: <4F58F297.9050507@simplistix.co.uk>
Message-ID: <4F590590.5070800@simplistix.co.uk>

On 08/03/2012 11:06, Philip Jenvey wrote:
>
> It's been proposed before, you might want to read this old thread:
>
> http://mail.python.org/pipermail/python-dev/2008-January/075788.html

Heh, interesting that the same thing came up without me ever reading
that thread.

I think it's got legs; why do other people dislike the idea?

Chris

PS For clarification, I'm only thinking about this for 3.x, not 2.x...
--
Simplistix - Content Management, Batch Processing & Python Consulting
            - http://www.simplistix.co.uk

From ubershmekel at gmail.com  Thu Mar  8 20:36:58 2012
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Thu, 8 Mar 2012 21:36:58 +0200
Subject: [Python-ideas] Loop labels
In-Reply-To: References: <4F590132.3090409@kozea.fr>
Message-ID:

On Thu, Mar 8, 2012 at 9:14 PM, Matt Joiner wrote:
>
> On Mar 9, 2012 3:04 AM, "Simon Sapin" wrote:
> >
> > Le 08/03/2012 19:52, Matt Joiner a écrit :
> ...

Guido has already pronounced "Labeled break and continue" rejected in PEP 3136.

http://www.python.org/dev/peps/pep-3136/

tl;dr - Most use cases are better off refactored into functions and the
other use cases aren't worth how much this complicates the language.

Yuval Greenfield

From simon.sapin at kozea.fr  Thu Mar  8 20:39:00 2012
From: simon.sapin at kozea.fr (Simon Sapin)
Date: Thu, 08 Mar 2012 20:39:00 +0100
Subject: [Python-ideas] Loop labels
In-Reply-To: References: <4F590132.3090409@kozea.fr>
Message-ID: <4F590AD4.4040108@kozea.fr>

Le 08/03/2012 20:14, Matt Joiner a écrit :
> The with statement is not a loop statement and would not be extended, so
> this is not an issue.

Yes, no conflict here. Only the precedent of the with statement sets the
expectation that "as" is followed by a variable name.

> It would be nonsensical to assign to loop labels, so I figure there
> would be some non trivial checks done when a function is compiled. You
> raise an interesting point.

So: same namespace, but incompatible usage? Ie, if a name is a loop label,
any operation on it other than break/continue is forbidden?

--
Simon Sapin

From anacrolix at gmail.com  Thu Mar  8 20:40:30 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Fri, 9 Mar 2012 03:40:30 +0800
Subject: [Python-ideas] Loop labels
In-Reply-To: <4F590AD4.4040108@kozea.fr>
References: <4F590132.3090409@kozea.fr> <4F590AD4.4040108@kozea.fr>
Message-ID:

That's definitely how I'd do it.

From merwok at netwok.org  Thu Mar  8 20:59:17 2012
From: merwok at netwok.org (Éric Araujo)
Date: Thu, 08 Mar 2012 20:59:17 +0100
Subject: [Python-ideas] import fallback syntax
In-Reply-To: <4F58F70C.70600@simplistix.co.uk>
References: <4F58F297.9050507@simplistix.co.uk> <4F58F70C.70600@simplistix.co.uk>
Message-ID: <4F590F95.3010405@netwok.org>

Hi,

[Chris Withers]
> This seems particularly relevant with things like argparse, unittest2,
> distutils2, packaging...

argparse has the same name regardless of its location in site-packages or
the standard library. For unittest/unittest2, what we do in distutils2 is
to do the import dance in one place and all other modules import unittest
from distutils2.tests. You can do the same with
lxml.etree/xml.etree.cElementTree/xml.etree.ElementTree.

> I suspect it'll also be relevant where code uses libraries where the
> author has chosen to have differently named packages for Python 2 and 3.

Can you give examples of projects doing that and their reasons?

> It would also be extremely relevant if any of the misguided "provisional
> packages in stdlib" PEPs make it through...

One of them is withdrawn; the other one is still open for feedback, please
share your comments on python-dev.
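Concretely, the "import dance in one place" pattern might look like this; a
sketch of a hypothetical compat module, not the actual distutils2 code:

# compat.py -- do the conditional imports once; everything else imports from here
try:
    import unittest2 as unittest   # backport, if installed
except ImportError:
    import unittest

try:
    from lxml import etree         # fastest, if installed
except ImportError:
    try:
        from xml.etree import cElementTree as etree   # stdlib C accelerator
    except ImportError:
        from xml.etree import ElementTree as etree    # pure-Python fallback

Client modules then just write "from mypackage.compat import unittest, etree"
and never repeat the try/except.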
Regards From dreamingforward at gmail.com Thu Mar 8 22:08:23 2012 From: dreamingforward at gmail.com (Mark Janssen) Date: Thu, 8 Mar 2012 14:08:23 -0700 Subject: [Python-ideas] PEP Message-ID: On Thu, Feb 9, 2012 at 5:18 PM, Guido van Rossum wrote: > A dictionary would (then) be a SET of these. (Voila! things have already >> gotten simplified.) >> > > Really? So {a:1, a:2} would be a dict of length 2? > Eventually, I also think this will seque and integrate nicely into Mark >> Shannon's "shared-key dict" proposal (PEP 412). >> >> I just noticed something in Guido's example. Something gives me a strange feeling that using a variable as a key doesn't smell right. Presumably Python just hashes the variable's id, or uses the id itself as the key, but I wonder if anyone's noticed any problems with this, and whether the hash collision problems could be solved by removing this?? Does anyone even use this functionality -- of a *variable* (not a string) as a dict key? mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Thu Mar 8 22:36:31 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 08 Mar 2012 13:36:31 -0800 Subject: [Python-ideas] PEP In-Reply-To: References: Message-ID: <4F59265F.8020500@stoneleaf.us> Mark Janssen wrote: > On Thu, Feb 9, 2012 at 5:18 PM, Guido van Rossum wrote: > >> A dictionary would (then) be a SET of these. (Voila! things have >> already gotten simplified.) > > > Really? So {a:1, a:2} would be a dict of length 2? > > >> Eventually, I also think this will seque and integrate nicely >> into Mark Shannon's "shared-key dict" proposal (PEP 412). > > > I just noticed something in Guido's example. Something gives me a > strange feeling that using a variable as a key doesn't smell right. > Presumably Python just hashes the variable's id, or uses the id itself > as the key, but I wonder if anyone's noticed any problems with this, and > whether the hash collision problems could be solved by removing this?? > Does anyone even use this functionality -- of a *variable* (not a > string) as a dict key? Um, yes? As in, most of the time. Don't you? And Python uses whatever object the name is bound to: --> huh = 5 --> d = {} --> d[huh] = 'hrm' --> d {5: 'hrm'} ~Ethan~ From ethan at stoneleaf.us Thu Mar 8 23:02:48 2012 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 08 Mar 2012 14:02:48 -0800 Subject: [Python-ideas] PEP In-Reply-To: References: <4F59265F.8020500@stoneleaf.us> Message-ID: <4F592C88.6070801@stoneleaf.us> Mark Janssen wrote: > On Thu, Mar 8, 2012 at 2:36 PM, Ethan Furman > wrote: > > Um, yes? As in, most of the time. Don't you? > > And Python uses whatever object the name is bound to: > > --> huh = 5 > --> d = {} > --> d[huh] = 'hrm' > --> d > {5: 'hrm'} > > > Hmm, wow, like never. But then didn't I decide that you're in a very > different Python group than I? Not worse, just different.... > > :) Apparently so. :) Yay, Python! ~Ethan~ From masklinn at masklinn.net Thu Mar 8 22:59:31 2012 From: masklinn at masklinn.net (Masklinn) Date: Thu, 8 Mar 2012 22:59:31 +0100 Subject: [Python-ideas] PEP In-Reply-To: References: Message-ID: On 2012-03-08, at 22:08 , Mark Janssen wrote: > I just noticed something in Guido's example. Something gives me a strange > feeling that using a variable as a key doesn't smell right. Presumably > Python just hashes the variable's id, or uses the id itself as the key Python calls ``hash`` on the object and uses the result. 
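For instance, with a made-up class (any hashable object behaves the same way):

class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
    def __hash__(self):
        return hash((self.x, self.y))   # hash of the *value*, not the name
    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)

d = {Point(1, 2): 'found'}
print(d[Point(1, 2)])   # 'found': a distinct but equal object locates the entry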
> , but
> I wonder if anyone's noticed any problems with this, and whether the hash
> collision problems could be solved by removing this??

No. Nor would it make sense: people could ask for object hashes on their own
and end up with the same result.

> Does anyone even
> use this functionality -- of a *variable* (not a string) as a dict key?

What you're asking does not make sense: the dict key is not the name but
whatever object is bound to the name.

And yes I've used non-string objects as keys before: tuples, frozensets,
integers, my own objects, ...

From steve at pearwood.info  Thu Mar  8 23:58:33 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 09 Mar 2012 09:58:33 +1100
Subject: [Python-ideas] import fallback syntax
In-Reply-To: <4F590590.5070800@simplistix.co.uk>
References: <4F58F297.9050507@simplistix.co.uk> <4F590590.5070800@simplistix.co.uk>
Message-ID: <4F593999.3050003@pearwood.info>

Chris Withers wrote:
> On 08/03/2012 11:06, Philip Jenvey wrote:
>>
>> It's been proposed before, you might want to read this old thread:
>>
>> http://mail.python.org/pipermail/python-dev/2008-January/075788.html
>
> Heh, interesting that the same thing came up without me ever reading
> that thread.
>
> I think it's got legs, why do other people dislike the idea?

Because it adds more complexity to the language and parser for minimal
benefit.

In my experience, the construct

try:
    import ham
except ImportError:
    import spam as ham

is not common enough or difficult enough to need special syntax for it.
Not every special case needs special syntax, and if you don't like that it
is a four-liner you can cut it down to a two-liner:

try: import ham
except ImportError: import spam as ham

Every new piece of syntax adds complexity to the language, increasing the
overall burden of writing a piece of code. The try...except idiom is a
general idiom that applies *everywhere* -- you try something, and if that
fails, you do something else, regardless of the nature of the things you
are trying. You're not limited to only catching ImportError and retrying
the import, so it is easy to extend the idiom to variations such as:

try:
    from spam import x
except ImportError:
    # Must be an old version. Log it and fall back.
    log('using old version of spam')
    from spam import y

and here's a real piece of code from one of my libraries:

try:
    from math import isfinite
except ImportError:
    # Python 2.6 or older.
    try:
        from math import isnan, isinf
    except ImportError:
        # Python 2.5. Quick and dirty substitutes.
        def isnan(x):
            return x != x
        def isinf(x):
            return x - x != 0
    def isfinite(x):
        return not (isnan(x) or isinf(x))

"import x or y" doesn't have anywhere near the flexibility or power of a
generalised try...except block because it is limited to a tiny subset of
the actions you might want to take after catching ImportError. Most
special syntax for special cases doesn't add any new idioms for solving
general problems; it just solves a single, narrowly defined problem. Your
suggestion is one of them: it can be used to solve two specific import
problems:

import ham or spam as ham
from ham or spam import x

and I suppose it could even be extended, at the cost of even more
complexity, to solve a third:

from ham or spam import x or y as z

but even so, it is still too narrowly focused on a single idiom:

try:
    import some thing
except ImportError:
    import some other thing as a fallback

with no ability to do anything else. That makes it fall foul of the
"Special cases" line from the Zen.
If the idiom being replaced was difficult enough, then maybe your suggestion would fly, but I don't believe it does. It's a pretty trivial transformation. Compare your suggestion with the "yield from" PEP, and you will see that while yield from initially seems trivial, there are in fact a whole lot of complicated special cases with coroutines that make it compelling. I don't believe that yours has anything like that. Consequently, although I think your syntax is kinda cute, I don't think it adds enough benefit to be worth the change. My vote is +0. -- Steven From anikom15 at gmail.com Fri Mar 9 00:00:06 2012 From: anikom15 at gmail.com (Westley =?iso-8859-1?Q?Mart=EDnez?=) Date: Thu, 8 Mar 2012 15:00:06 -0800 Subject: [Python-ideas] PEP In-Reply-To: References: Message-ID: <20120308230006.GA15865@kubrick> On Thu, Mar 08, 2012 at 02:08:23PM -0700, Mark Janssen wrote: > On Thu, Feb 9, 2012 at 5:18 PM, Guido van Rossum wrote: > > > A dictionary would (then) be a SET of these. (Voila! things have already > >> gotten simplified.) > >> > > > > Really? So {a:1, a:2} would be a dict of length 2? > > > > Eventually, I also think this will seque and integrate nicely into Mark > >> Shannon's "shared-key dict" proposal (PEP 412). > >> > >> > I just noticed something in Guido's example. Something gives me a strange > feeling that using a variable as a key doesn't smell right. Presumably > Python just hashes the variable's id, or uses the id itself as the key, but > I wonder if anyone's noticed any problems with this, and whether the hash > collision problems could be solved by removing this?? Does anyone even > use this functionality -- of a *variable* (not a string) as a dict key? > > mark Using a variable as a key is valid since the object itself is used for a key. There've been times when I've used ints as keys but beyond that nothing else (besides strings). From anikom15 at gmail.com Fri Mar 9 00:05:32 2012 From: anikom15 at gmail.com (Westley =?iso-8859-1?Q?Mart=EDnez?=) Date: Thu, 8 Mar 2012 15:05:32 -0800 Subject: [Python-ideas] import fallback syntax In-Reply-To: <4F58F297.9050507@simplistix.co.uk> References: <4F58F297.9050507@simplistix.co.uk> Message-ID: <20120308230532.GB15865@kubrick> On Thu, Mar 08, 2012 at 09:55:35AM -0800, Chris Withers wrote: > Hi All, > > I see a lot of Python like this: > > try: > from cDecimal import Decimal > except ImportError: > from decimal import Decimal > > ...and nest deeper if you're talking about ElementTree. > > How about some syntactical sugar to make this less obnoxious: > > from cDecimal or decimal import Decimal > > from x.y.z import X as X or > a.b.z import Y as X or > > what do people think? > > Chris I seem to recall something like this proposed some time ago. I believe one argument against it was that or would have to change its meaning based on context. Changes to the language need to make expressing a particular idiom possible, simpler, and more readable. To me the proposed expression is less clear, and the current method is very clear and works quite well. -1 From anikom15 at gmail.com Fri Mar 9 00:06:47 2012 From: anikom15 at gmail.com (Westley =?iso-8859-1?Q?Mart=EDnez?=) Date: Thu, 8 Mar 2012 15:06:47 -0800 Subject: [Python-ideas] Loop labels In-Reply-To: References: Message-ID: <20120308230647.GC15865@kubrick> On Fri, Mar 09, 2012 at 02:52:08AM +0800, Matt Joiner wrote: > Several languages allow break and continue statements to take a count > or loop label for greater flexibility. 
>
> Dealing with nested loops and control flow around them is definitely
> something I and probably most programmers deal with every day.
> Generally for very complex control flow one might employ functions,
> and use return statements to work around any shortcomings. This is not
> always ideal in CPython because of the performance cost of function
> calls, and the lack of anonymous functions. The other workaround is
> usually goto statements, which clearly aren't available or appropriate
> in Python.
>
> So what about the following extensions?
>
> Allow for and while statements to be labelled using "as".
>
> Allow break and continue to take the name of a containing loop, or an
> integer. The corresponding named loop, or the nth containing loop, is
> treated as though it were the nearest enclosing loop.
>
> loop_label ::= identifier
>
> break_stmt ::= "break" [decimalinteger | loop_label]
>
> continue_stmt ::= "continue" [decimalinteger | loop_label]
>
> while_stmt ::= "while" expression ["as" loop_label] ":" suite
>                ["else" ":" suite]
>
> for_stmt ::= "for" target_list "in" expression_list ["as" loop_label] ":" suite
>              ["else" ":" suite]
>
> Here's a really naive example:
> for a in b as c:
>     for d in e:
>         break c # or break 2
>     for f in g:
>         continue c # or continue 2

-1; if we do this we might as well add goto labels.
_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
http://mail.python.org/mailman/listinfo/python-ideas

From aquavitae69 at gmail.com  Fri Mar  9 06:06:47 2012
From: aquavitae69 at gmail.com (David Townshend)
Date: Fri, 9 Mar 2012 07:06:47 +0200
Subject: [Python-ideas] Loop labels
In-Reply-To: <20120308230647.GC15865@kubrick>
References: <20120308230647.GC15865@kubrick>
Message-ID:

The biggest problem is not the new syntax. It's the new type of object
that would be needed, Label, which lies in some magical place half way
between a name and a keyword. What would be the result of the following
code?

loops = []
for i in range(4) as label:
    print(type(label), dir(label))
    loops.append(label)
for label in loops as newlabel:
    break label

David
On Mar 9, 2012 1:08 AM, "Westley Martínez" wrote:

> On Fri, Mar 09, 2012 at 02:52:08AM +0800, Matt Joiner wrote:
> > Several languages allow break and continue statements to take a count
> > or loop label for greater flexibility.
> >
> > Dealing with nested loops and control flow around them is definitely
> > something I and probably most programmers deal with every day.
> > Generally for very complex control flow one might employ functions,
> > and use return statements to work around any shortcomings. This is not
> > always ideal in CPython because of the performance cost of function
> > calls, and the lack of anonymous functions. The other workaround is
> > usually goto statements, which clearly aren't available or appropriate
> > in Python.
> >
> > So what about the following extensions?
> >
> > Allow for and while statements to be labelled using "as".
> >
> > Allow break and continue to take the name of a containing loop, or an
> > integer. The corresponding named loop, or the nth containing loop are
> > treated as though they are the nearest enclosing loop.
> > > > loop_label ::= identifier > > > > break_stmt ::= "break" [decimalinteger | loop_label] > > > > continue_stmt ::= "continue" [decimalinteger | loop_label] > > > > while_stmt ::= "while" expression ["as" loop_label] ":" suite > > ["else" ":" suite] > > > > for_stmt ::= "for" target_list "in" expression_list ["as" loop_label] > ":" suite > > ["else" ":" suite] > > > > Here's a really naive example: > > for a in b as c: > > for d in e: > > break c # or break 2 > > for f in g: > > continue c # or continue 2 > > -1; if we do this we might as well add goto labels. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From anacrolix at gmail.com Fri Mar 9 09:22:26 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Fri, 9 Mar 2012 16:22:26 +0800 Subject: [Python-ideas] Loop labels In-Reply-To: References: <20120308230647.GC15865@kubrick> Message-ID: Yeah it's definitely non trivial. Reading this is what got me thinking about it in Python: mortoray.com/2011/10/23/the-ideal-language-has-goto/ On Mar 9, 2012 1:07 PM, "David Townshend" wrote: > The biggest problem is not the new syntax. It's the new type of object > that would be needed, Label, which lies in some magical place half way > between a name and a keyword. What would be the result of the following > code? > > loops = [] > for i in range(4) as label: > print(type(label), dir(label)) > loops.append(label) > for label in loops as newlabel: > break label > > David > On Mar 9, 2012 1:08 AM, "Westley Mart?nez" wrote: > >> On Fri, Mar 09, 2012 at 02:52:08AM +0800, Matt Joiner wrote: >> > Several languages allow break and continue statements to take a count >> > or loop label for greater flexibility. >> > >> > Dealing with nested loops and control flow around them is definitely >> > something I and probably most programmers deal with everyday. >> > Generally for very complex control flow one might employ functions, >> > and use return statements to work around any shortcomings. This is not >> > always ideal in CPython because of the performance cost of function >> > calls, and lack of anonymous functions. The other work around is >> > usually goto statements, which clearly aren't available or appropriate >> > in Python. >> > >> > So what about the following extensions? >> > >> > Allow for and while statements to be labelled using "as". >> > >> > Allow break and continue to take the name of a containing loop, or an >> > integer. The corresponding named loop, or the nth containing loop are >> > treated as though they are the nearest enclosing loop. >> > >> > loop_label ::= identifier >> > >> > break_stmt ::= "break" [decimalinteger | loop_label] >> > >> > continue_stmt ::= "continue" [decimalinteger | loop_label] >> > >> > while_stmt ::= "while" expression ["as" loop_label] ":" suite >> > ["else" ":" suite] >> > >> > for_stmt ::= "for" target_list "in" expression_list ["as" loop_label] >> ":" suite >> > ["else" ":" suite] >> > >> > Here's a really naive example: >> > for a in b as c: >> > for d in e: >> > break c # or break 2 >> > for f in g: >> > continue c # or continue 2 >> >> -1; if we do this we might as well add goto labels. 
>> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From steve at pearwood.info Fri Mar 9 10:18:42 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 9 Mar 2012 20:18:42 +1100 Subject: [Python-ideas] Loop labels In-Reply-To: References: <20120308230647.GC15865@kubrick> Message-ID: <20120309091841.GB30395@ando> On Fri, Mar 09, 2012 at 07:06:47AM +0200, David Townshend wrote: > The biggest problem is not the new syntax. It's the new type of object > that would be needed, Label, which lies in some magical place half way > between a name and a keyword. Labels are neither names nor keywords nor objects. They would be instructions to the compiler, nothing more. The idea of labelled break/continue is not a new invention. Before criticising it (or for that matter, praising it), we should see how it works in other languages. Java has labelled loops: http://www.cs.umd.edu/~clin/MoreJava/ControlFlow/break.html So does Javascript: http://www.tutorialspoint.com/javascript/javascript_loop_control.htm And Groovy: http://docs.codehaus.org/display/GROOVY/JN2535-Control > What would be the result of the following code? > > loops = [] > for i in range(4) as label: > print(type(label), dir(label)) NameError, because there is no name "label". > loops.append(label) Again, NameError. > for label in loops as newlabel: > break label SyntaxError, because the "label" loop is not enclosing the break. For what it's worth, I used to be against this idea as adding unnecessary complexity, until a year or so ago when I came across a use-case which was naturally written as nested loops with labelled breaks. Now I'm sold on the idea. I ended up re-writing the code to use functions, but it really wasn't natural. It felt forced. I just wish I could remember what the algorithm was... :( -- Steven From anacrolix at gmail.com Fri Mar 9 12:08:05 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Fri, 9 Mar 2012 19:08:05 +0800 Subject: [Python-ideas] Loop labels In-Reply-To: <20120309091841.GB30395@ando> References: <20120308230647.GC15865@kubrick> <20120309091841.GB30395@ando> Message-ID: > Labels are neither names nor keywords nor objects. They would be > instructions to the compiler, nothing more. Yep > The idea of labelled break/continue is not a new invention. Before > criticising it (or for that matter, praising it), we should see how it > works in other languages. It's also in C (in the form of goto), some shells. > For what it's worth, I used to be against this idea as adding > unnecessary complexity, until a year or so ago when I came across a > use-case which was naturally written as nested loops with labelled > breaks. Now I'm sold on the idea. I ended up re-writing the code to use > functions, but it really wasn't natural. It felt forced. It's something that comes up enough to be of use. I'd compare it to the with statement, which isn't always useful, but handles the majority of such cases. A goto is the full fledged try/finally equivalent. 
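For illustration, the state-variable version of a nested search looks
something like this (a sketch; the "done" flag is exactly what a labelled
break would remove):

matrix = [[1, 4], [9, 16], [25, 36]]
found = None
done = False
for row in matrix:
    for item in row:
        if item > 10:        # stand-in for the real condition
            found = item
            done = True
            break            # exits the inner loop only...
    if done:
        break                # ...so the flag must be re-tested to exit the outer one
print(found)                 # 16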
Just as the with statement prevents people from incorrectly implementing their try/finally's for common cases, Labelled break/continues would prevent people from using state variables erroneously, and taking the corresponding performance hit either there, or by creating additional functions. From ncoghlan at gmail.com Fri Mar 9 12:27:46 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 9 Mar 2012 21:27:46 +1000 Subject: [Python-ideas] Loop labels In-Reply-To: References: <20120308230647.GC15865@kubrick> <20120309091841.GB30395@ando> Message-ID: On Fri, Mar 9, 2012 at 9:08 PM, Matt Joiner wrote: > It's something that comes up enough to be of use. I'd compare it to > the with statement, which isn't always useful, but handles the > majority of such cases. A goto is the full fledged try/finally > equivalent. You are setting your bar for new syntax proposals *way* too low. In a language with an iterator protocol, many nested loops are better written as generators with an embedded return. Labelled loops then become an attractive nuisance that distracts people from other ways to structure their algorithm such that it is easier to maintain. So no, you can't just wave your hands in the air and say "It's something that comes up enough to be of use" and get away with it. PEP 380 only got away with being a bit sketchy on concrete use cases because generators that were used with send() and throw() so clearly couldn't be refactored properly in a post-PEP 342 world. PEP 343 met the much higher bar of making it easy to deal with a wide range of *extraordinarily* common tasks like closing files, using thread synchronisation primitives, standardising exception handling, etc, etc. Loop labels have never even come *close* to attaining that standard. Regards, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From jkbbwr at gmail.com Fri Mar 9 18:05:51 2012 From: jkbbwr at gmail.com (Jakob Bowyer) Date: Fri, 9 Mar 2012 17:05:51 +0000 Subject: [Python-ideas] Serializable method Message-ID: So originally I presented this idea for serialize methods something like to_json() but the idea was rightly shot down where the language would have to support most formats and this could lead to confusion or complication. So this is a newer version of that idea. I think that object should provide an __serializable__ method which in-turn allows the user to define in it how the object is to be serialized, the default operation should be something along the lines of return self.__dict__ but that is just semantics. The idea with an object that offers a serializable method means that the object can be passed directly to any formater in python that offers a .dump method and the object is immediately formatted how the end user wants the data to be, without needing to write a middle layer formatter for this object. Is this another terrible idea from me? Or is there some ground in this? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From phd at phdru.name Fri Mar 9 18:22:40 2012 From: phd at phdru.name (Oleg Broytman) Date: Fri, 9 Mar 2012 21:22:40 +0400 Subject: [Python-ideas] Serializable method In-Reply-To: References: Message-ID: <20120309172240.GA10480@iskra.aviel.ru> On Fri, Mar 09, 2012 at 05:05:51PM +0000, Jakob Bowyer wrote: > I think that object should provide an __serializable__ method which in-turn > allows the user to define in it how the object is to be serialized, the > default operation should be something along the lines of return > self.__dict__ Do you mean __getstate__? http://docs.python.org/library/pickle.html#the-pickle-protocol Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From jkbbwr at gmail.com Fri Mar 9 18:27:14 2012 From: jkbbwr at gmail.com (Jakob Bowyer) Date: Fri, 9 Mar 2012 17:27:14 +0000 Subject: [Python-ideas] Serializable method In-Reply-To: <20120309172240.GA10480@iskra.aviel.ru> References: <20120309172240.GA10480@iskra.aviel.ru> Message-ID: Another occasion I should have read the docs :L On Fri, Mar 9, 2012 at 5:22 PM, Oleg Broytman wrote: > On Fri, Mar 09, 2012 at 05:05:51PM +0000, Jakob Bowyer wrote: > > I think that object should provide an __serializable__ method which > in-turn > > allows the user to define in it how the object is to be serialized, the > > default operation should be something along the lines of return > > self.__dict__ > > Do you mean __getstate__? > http://docs.python.org/library/pickle.html#the-pickle-protocol > > Oleg. > -- > Oleg Broytman http://phdru.name/ phd at phdru.name > Programmers don't die, they just GOSUB without RETURN. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Fri Mar 9 19:16:24 2012 From: tjreedy at udel.edu (Terry Reedy) Date: Fri, 09 Mar 2012 13:16:24 -0500 Subject: [Python-ideas] Loop labels In-Reply-To: References: <20120308230647.GC15865@kubrick> Message-ID: On 3/9/2012 3:22 AM, Matt Joiner wrote: > Yeah it's definitely non trivial. Reading this is what got me thinking > about it in Python: mortoray.com/2011/10/23/the-ideal-language-has-goto/ > I disagree as most would understand the claim. Let us consider his first example: location_t where; for( int i=0; i < num_rows; ++i ) { for( int j=0; j < num_cols; ++j ) { if( matrix(i,j).is_what_we_want() ) { where.set(i,j); goto found; } } } throw error( "not-found" ); found: //do something with it Let us leave aside the fact that searching the matrix should normally be factored out as a function or method unto itself, separate from code that uses the found object. A Python solution is to use the implied goto of try/except/else and make 'where' an exception: class where(Exception): def __init__(self, i, j): self.i = i self.j = j try: for i in range(num_rows): for j in range(num_cols): if matrix(i,j).is_what_we_want(): raise where(i,j) raise ValueError('matrix does not have what we want') except where as w: do_something(w) If one wants to reuse the app specific exception, replace 'raise ValueError' with the following at the end. else: raise where(None,None) Either way, this is as least as clean and clear as the goto example. One advantage is that the jump information object is created and initialized only when and where needed. 
A virtue of try: is that it tells the reader that there is going to be some jumpy control-flow. People too often think that exceptions are errors or only for errors. There are not (which is why those that *are* are named SomethingError). They are jump objects that carry information with the jump. Also exception gotos obey the restriction of only going up the stack. So Python already has a version of what the Mortoray advocates. (And hence, in a way, I agree with the claim ;-). The redo example can be done similarly. There are two types of jumps. class Redo(Exception): "Restart the loop process from the top" class Break(Exception): "We are completely done with the loop process" while True: try: if must_redo_from_top: raise Redo() if completely_done: raise Break except Redo: pass except Break: break The error-handling example is easily done with a Broken exception. If performance is an issue for a state machine, it is best written in C. If the machine is very much more complex than the example, it is best generated programmatically from higher-level specs. Code in the example is the sort of spaghetti code that become unreadable and unmaintainable with very many more states and input conditions. (I am old enough to have tried -- and sometimes given up -- working with such code.) Even in C, it is often better, when possible, to use tables to map state and input to action, output, and next state. That, of course, can be done just as well (though slower) in Python. In fact, it may be easier in Python if the table is sparse. The result can be structurally simple code like current_state = initial_state for item in input_stream: action, output, current_state = table[current_state][item] # or table[current_state].get(item,default) for sparse rows # replace 'item' with 'group(item)' to classify inputs perform(action) print(output) if current_state == exit_state: return -- Terry Jan Reedy From techtonik at gmail.com Fri Mar 9 23:36:53 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Sat, 10 Mar 2012 01:36:53 +0300 Subject: [Python-ideas] Serializable method In-Reply-To: References: <20120309172240.GA10480@iskra.aviel.ru> Message-ID: Pickle is insecure, unfortunately, so a generic module to serialize and unserialize Python objects (or data containers) securely, without the need of constructor, would be awesome. However, magical methods are evil. It will be hard to find the source of error if the logic in your magic level fail. -- anatoly t. On Fri, Mar 9, 2012 at 8:27 PM, Jakob Bowyer wrote: > Another?occasion?I should have read the docs :L > > > On Fri, Mar 9, 2012 at 5:22 PM, Oleg Broytman wrote: >> >> On Fri, Mar 09, 2012 at 05:05:51PM +0000, Jakob Bowyer wrote: >> > I think that object should provide an __serializable__ method which >> > in-turn >> > allows the user to define in it how the object is to be serialized, the >> > default operation should be something along the lines of ?return >> > self.__dict__ >> >> ? Do you mean __getstate__? >> http://docs.python.org/library/pickle.html#the-pickle-protocol >> >> Oleg. >> -- >> ? ? Oleg Broytman ? ? ? ? ? ?http://phdru.name/ ? ? ? ? ? ?phd at phdru.name >> ? ? ? ? ? Programmers don't die, they just GOSUB without RETURN. 
>> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From guido at python.org Fri Mar 9 23:40:48 2012 From: guido at python.org (Guido van Rossum) Date: Fri, 9 Mar 2012 14:40:48 -0800 Subject: [Python-ideas] Serializable method In-Reply-To: References: <20120309172240.GA10480@iskra.aviel.ru> Message-ID: So what's your proposal) --Guido van Rossum (sent from Android phone) On Mar 9, 2012 2:38 PM, "anatoly techtonik" wrote: > Pickle is insecure, unfortunately, so a generic module to serialize > and unserialize Python objects (or data containers) securely, without > the need of constructor, would be awesome. However, magical methods > are evil. It will be hard to find the source of error if the logic in > your magic level fail. > -- > anatoly t. > > > > On Fri, Mar 9, 2012 at 8:27 PM, Jakob Bowyer wrote: > > Another occasion I should have read the docs :L > > > > > > On Fri, Mar 9, 2012 at 5:22 PM, Oleg Broytman wrote: > >> > >> On Fri, Mar 09, 2012 at 05:05:51PM +0000, Jakob Bowyer wrote: > >> > I think that object should provide an __serializable__ method which > >> > in-turn > >> > allows the user to define in it how the object is to be serialized, > the > >> > default operation should be something along the lines of return > >> > self.__dict__ > >> > >> Do you mean __getstate__? > >> http://docs.python.org/library/pickle.html#the-pickle-protocol > >> > >> Oleg. > >> -- > >> Oleg Broytman http://phdru.name/ > phd at phdru.name > >> Programmers don't die, they just GOSUB without RETURN. > >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas at python.org > >> http://mail.python.org/mailman/listinfo/python-ideas > > > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From solipsis at pitrou.net Fri Mar 9 23:42:01 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 9 Mar 2012 23:42:01 +0100 Subject: [Python-ideas] Serializable method References: <20120309172240.GA10480@iskra.aviel.ru> Message-ID: <20120309234201.664c3a40@pitrou.net> On Sat, 10 Mar 2012 01:36:53 +0300 anatoly techtonik wrote: > Pickle is insecure, http://docs.python.org/dev/library/pickle.html#restricting-globals From masklinn at masklinn.net Fri Mar 9 23:51:40 2012 From: masklinn at masklinn.net (Masklinn) Date: Fri, 9 Mar 2012 23:51:40 +0100 Subject: [Python-ideas] Serializable method In-Reply-To: <20120309234201.664c3a40@pitrou.net> References: <20120309172240.GA10480@iskra.aviel.ru> <20120309234201.664c3a40@pitrou.net> Message-ID: <38E4FB48-24F5-40D8-8F91-DD1445C7DE5D@masklinn.net> On 2012-03-09, at 23:42 , Antoine Pitrou wrote: > On Sat, 10 Mar 2012 01:36:53 +0300 > anatoly techtonik > wrote: >> Pickle is insecure, > > http://docs.python.org/dev/library/pickle.html#restricting-globals Even with that, isn't Pickle open to the same issues `eval` (with restricted locals and globals) is, of innocuous code giving indirect access to "unsafe" structures and functions? From techtonik at gmail.com Fri Mar 9 23:56:02 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Sat, 10 Mar 2012 01:56:02 +0300 Subject: [Python-ideas] Serializable method In-Reply-To: References: <20120309172240.GA10480@iskra.aviel.ru> Message-ID: Improve documentation to point users to JSON module? http://docs.python.org/library/json.html I didn't make any analysis if it is secure, but it seems a good starting point. The API seems a little hackish - perhaps there should be a recipe book. There is also http://home.gna.org/oomadness/en/cerealizer/index.html linked from comments on this pickle insecurity research that can be handy http://nadiana.com/python-pickle-insecure -- anatoly t. On Sat, Mar 10, 2012 at 1:40 AM, Guido van Rossum wrote: > So what's your proposal) > > --Guido van Rossum (sent from Android phone) > > On Mar 9, 2012 2:38 PM, "anatoly techtonik" wrote: >> >> Pickle is insecure, unfortunately, so a generic module to serialize >> and unserialize Python objects (or data containers) securely, without >> the need of constructor, would be awesome. However, magical methods >> are evil. It will be hard to find the source of error if the logic in >> your magic level fail. >> -- >> anatoly t. >> >> >> >> On Fri, Mar 9, 2012 at 8:27 PM, Jakob Bowyer wrote: >> > Another?occasion?I should have read the docs :L >> > >> > >> > On Fri, Mar 9, 2012 at 5:22 PM, Oleg Broytman wrote: >> >> >> >> On Fri, Mar 09, 2012 at 05:05:51PM +0000, Jakob Bowyer wrote: >> >> > I think that object should provide an __serializable__ method which >> >> > in-turn >> >> > allows the user to define in it how the object is to be serialized, >> >> > the >> >> > default operation should be something along the lines of ?return >> >> > self.__dict__ >> >> >> >> ? Do you mean __getstate__? >> >> http://docs.python.org/library/pickle.html#the-pickle-protocol >> >> >> >> Oleg. >> >> -- >> >> ? ? Oleg Broytman ? ? ? ? ? ?http://phdru.name/ >> >> ?phd at phdru.name >> >> ? ? ? ? ? Programmers don't die, they just GOSUB without RETURN. 
>> >> _______________________________________________ >> >> Python-ideas mailing list >> >> Python-ideas at python.org >> >> http://mail.python.org/mailman/listinfo/python-ideas >> > >> > >> > >> > _______________________________________________ >> > Python-ideas mailing list >> > Python-ideas at python.org >> > http://mail.python.org/mailman/listinfo/python-ideas >> > >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas From jim at jimrollenhagen.com Fri Mar 9 23:58:50 2012 From: jim at jimrollenhagen.com (Jim Rollenhagen) Date: Fri, 9 Mar 2012 14:58:50 -0800 Subject: [Python-ideas] Serializable method In-Reply-To: References: <20120309172240.GA10480@iskra.aviel.ru> Message-ID: This idea was actually started because we were talking about how not all objects/types are JSON-serializable. The example at hand was the bytes type. // Jim On Friday, March 9, 2012 at 2:56 PM, anatoly techtonik wrote: > Improve documentation to point users to JSON module? > http://docs.python.org/library/json.html > > I didn't make any analysis if it is secure, but it seems a good starting point. > The API seems a little hackish - perhaps there should be a recipe book. > > There is also http://home.gna.org/oomadness/en/cerealizer/index.html linked > from comments on this pickle insecurity research that can be handy > http://nadiana.com/python-pickle-insecure > -- > anatoly t. > > > > On Sat, Mar 10, 2012 at 1:40 AM, Guido van Rossum wrote: > > So what's your proposal) > > > > --Guido van Rossum (sent from Android phone) > > > > On Mar 9, 2012 2:38 PM, "anatoly techtonik" wrote: > > > > > > Pickle is insecure, unfortunately, so a generic module to serialize > > > and unserialize Python objects (or data containers) securely, without > > > the need of constructor, would be awesome. However, magical methods > > > are evil. It will be hard to find the source of error if the logic in > > > your magic level fail. > > > -- > > > anatoly t. > > > > > > > > > > > > On Fri, Mar 9, 2012 at 8:27 PM, Jakob Bowyer wrote: > > > > Another occasion I should have read the docs :L > > > > > > > > > > > > On Fri, Mar 9, 2012 at 5:22 PM, Oleg Broytman wrote: > > > > > > > > > > On Fri, Mar 09, 2012 at 05:05:51PM +0000, Jakob Bowyer wrote: > > > > > > I think that object should provide an __serializable__ method which > > > > > > in-turn > > > > > > allows the user to define in it how the object is to be serialized, > > > > > > the > > > > > > default operation should be something along the lines of return > > > > > > self.__dict__ > > > > > > > > > > > > > > > > > > > > > Do you mean __getstate__? > > > > > http://docs.python.org/library/pickle.html#the-pickle-protocol > > > > > > > > > > Oleg. > > > > > -- > > > > > Oleg Broytman http://phdru.name/ > > > > > phd at phdru.name (mailto:phd at phdru.name) > > > > > Programmers don't die, they just GOSUB without RETURN. 
> > > > > _______________________________________________ > > > > > Python-ideas mailing list > > > > > Python-ideas at python.org (mailto:Python-ideas at python.org) > > > > > http://mail.python.org/mailman/listinfo/python-ideas > > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > Python-ideas mailing list > > > > Python-ideas at python.org (mailto:Python-ideas at python.org) > > > > http://mail.python.org/mailman/listinfo/python-ideas > > > > > > > > > > _______________________________________________ > > > Python-ideas mailing list > > > Python-ideas at python.org (mailto:Python-ideas at python.org) > > > http://mail.python.org/mailman/listinfo/python-ideas > > > > > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org (mailto:Python-ideas at python.org) > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solipsis at pitrou.net Sat Mar 10 00:00:54 2012 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 10 Mar 2012 00:00:54 +0100 Subject: [Python-ideas] Serializable method References: <20120309172240.GA10480@iskra.aviel.ru> <20120309234201.664c3a40@pitrou.net> <38E4FB48-24F5-40D8-8F91-DD1445C7DE5D@masklinn.net> Message-ID: <20120310000054.04d9f7a3@pitrou.net> On Fri, 9 Mar 2012 23:51:40 +0100 Masklinn wrote: > On 2012-03-09, at 23:42 , Antoine Pitrou wrote: > > On Sat, 10 Mar 2012 01:36:53 +0300 > > anatoly techtonik > > wrote: > >> Pickle is insecure, > > > > http://docs.python.org/dev/library/pickle.html#restricting-globals > > Even with that, isn't Pickle open to the same issues `eval` > (with restricted locals and globals) is, of innocuous code giving > indirect access to "unsafe" structures and functions? I don't know, does anyone have a proof-of-concept exploit for that? Regards Antoine. From masklinn at masklinn.net Sat Mar 10 00:07:36 2012 From: masklinn at masklinn.net (Masklinn) Date: Sat, 10 Mar 2012 00:07:36 +0100 Subject: [Python-ideas] Serializable method In-Reply-To: References: <20120309172240.GA10480@iskra.aviel.ru> Message-ID: <496248BE-83CF-4251-ACDF-4D4F274AD968@masklinn.net> On 2012-03-09, at 23:58 , Jim Rollenhagen wrote: > This idea was actually started because we were talking about how not all objects/types are JSON-serializable. The example at hand was the bytes type. Technically, all object types are JSON-serializable since you can plug custom encoding schemes in. Of course, practically few types/libraries provide a `JSONEncoder` and an object hook so you'll have to build your own if you want to serialize and deserialize non-core types. On the other hand, since I'm not sure there's any community standard for the JSON serialization of e.g. a datetime, it's probably for the best that providing that is your job, because the library would very likely provide something you don't want or can't work with. From steve at pearwood.info Sat Mar 10 00:22:57 2012 From: steve at pearwood.info (Steven D'Aprano) Date: Sat, 10 Mar 2012 10:22:57 +1100 Subject: [Python-ideas] Serializable method In-Reply-To: References: Message-ID: <20120309232257.GB2442@ando> On Fri, Mar 09, 2012 at 05:05:51PM +0000, Jakob Bowyer wrote: > I think that object should provide an __serializable__ method which in-turn > allows the user to define in it how the object is to be serialized, I don't think this is a sensible approach. 
In general, you don't serialise a single object, you serialise a bunch of them. Suppose you serialise three objects spam, ham, and eggs to a single file. Unfortunately, spam uses pickle, ham uses JSON, and eggs uses plist. How would you read the data back later? How do you know which de-serialiser you should use for each object? What happens if you end up with ambiguous content? You would need some sort of meta-serialiser, that not just recorded each serialised string, but also the format of that string. I don't think that it is helpful to ask objects to serialise themselves, giving them the choice of what serialisation scheme to use. While freedom of choice is good, it should be the *caller* who chooses the scheme, not the individual objects. So at the very least, for this idea to have legs, you would have to mandate a serialisation scheme which well-behaved objects ought to support. But Python already has that: pickle. If you want to mandate a second scheme, to overcome the known deficiencies of pickle, that's a separate issue. > the > default operation should be something along the lines of return > self.__dict__ but that is just semantics. Returning __dict__ can't work, because not all objects have a self.__dict__, and for those that do, it is not a string but a dict. [Aside: I don't understand why people say "that is just semantics" to dismiss something as trivial. Semantics is the *meaning* of something. It is the most fundamental, critical property of language. Without semantics, if I say "I got in the car and drove to work" you would not know if I actually got in the car and drove to work, or stayed home to watch television.] -- Steven From ncoghlan at gmail.com Sat Mar 10 03:47:19 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sat, 10 Mar 2012 12:47:19 +1000 Subject: [Python-ideas] Loop labels In-Reply-To: References: <20120308230647.GC15865@kubrick> Message-ID: On Sat, Mar 10, 2012 at 4:16 AM, Terry Reedy wrote: > Let us leave aside the fact that searching the matrix should normally be > factored out as a function or method unto itself, separate from code that > uses the found object. No, let's *not* leave that out, because it gets to the heart of the "attractive nuisance" aspect of labelled loop proposals. The way this code should be written in Python: # You should have a class Matrix. Python is not C. # That class should offer a method to iterate over all the points in the matrix # Or, if it doesn't, you just write your own helper iterator def iter_points(matrix): for i in range(matrix.num_rows): for j in range(matrix.num_columns): yield i, j # And now the problem devolves to a conventional Pythonic search loop for i, j in iter_points(matrix): item = matrix[i, j] if item.is_interesting(): break else: raise RuntimeError("Nothing interesting found in {!r}".format(matrix)) # We use our interesting item here Note that continue in the loop body also always applies to the outermost loop because, from the compiler's point of view, there is only *one* loop. The fact that there is a second loop internally is hidden by the custom iterator. So, whenever I hear "we should have labelled loops" I hear "I forgot I could simply write a generator that produces the nested loop variables as a tuple and hides any extra loops needed to create them from the compiler". As Terry said: """Code in the example is the sort of spaghetti code that become unreadable and unmaintainable with very many more states and input conditions.""" Python offers higher level constructs for a reason. 
Changing the language definition because people choose not to use them would be foolish. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From masklinn at masklinn.net Sat Mar 10 12:59:35 2012 From: masklinn at masklinn.net (Masklinn) Date: Sat, 10 Mar 2012 12:59:35 +0100 Subject: [Python-ideas] Serializable method In-Reply-To: <20120309232257.GB2442@ando> References: <20120309232257.GB2442@ando> Message-ID: On 2012-03-10, at 00:22 , Steven D'Aprano wrote: > On Fri, Mar 09, 2012 at 05:05:51PM +0000, Jakob Bowyer wrote: > >> I think that object should provide an __serializable__ method which in-turn >> allows the user to define in it how the object is to be serialized, > > I don't think this is a sensible approach. In general, you don't > serialise a single object, you serialise a bunch of them. Suppose you > serialise three objects spam, ham, and eggs to a single file. > Unfortunately, spam uses pickle, ham uses JSON, and eggs uses plist. How > would you read the data back later? How do you know which de-serialiser > you should use for each object? What happens if you end up with > ambiguous content? If we consider the example of __getstate__ or custom JSON encoders, the output is not a string but a serializable structure (usually some sort of tagged dictionary, for pickle the protocol itself does the tagging via the typename): the object tree is first converted to fully serializable structures, then serialized at once (using a single format) avoiding this issue. But of course there's then the trouble of what a "serializable structure" is for a given serialization format (JSON will accept arbitrary dicts, but I think Protocol Buffer only accepts predefined structures, and XML or S-Expressions will require some sort of schema to encode how they represent key:value maps for this document) meaning a single "serializable" protocol likely won't work at the end of the day, as it can have but one output which may or may not match the capabilities of the format the user wants to serialize from and to. > I don't think that it is helpful to ask objects to serialise themselves, > giving them the choice of what serialisation scheme to use. While > freedom of choice is good, it should be the *caller* who chooses the > scheme, not the individual objects. See above, other serializers could hook themselves onto __getstate__ (I originally thought this was Oleg's suggestion, I must have been mistaken since nobody else interpreted it that way) but it still ends up with the format's semantics not necessarily mapping 1:1 to Python semantics. From techtonik at gmail.com Sat Mar 10 15:02:42 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Sat, 10 Mar 2012 17:02:42 +0300 Subject: [Python-ideas] Saving state or data from Python objects (Was: Serializable method) Message-ID: On Sat, Mar 10, 2012 at 2:59 PM, Masklinn wrote: > but it still ends > up with the format's semantics not necessarily mapping 1:1 to Python > semantics This is the right problem. If we take user approach to see what is the hard time users have - there are two different user stories: 1. Is about saving current state of a program by magically taking all object space (or a portion of) and save it in into a file Goals: Don't make me think Security: Not important (well, you're saving your program - what did you expect if everybody can alter it?) Concerns: Independence mismatch - what if source code for the saved state changes? 
- this needs further experiments
Status: pickle or marshal seems to be designed just for this purpose,
but do they provide an easy (one shot) solution for this scenario?

2. Is about saving data stored in Python objects into some format
Goals: Transform data into the format suitable for an external requirement
Goals: Exchange/alter data outside of the Python process
Goals: Make it easy, transparent and automatic (note no 'g' here)
(May I call this Data Transformation Theory?)

Basically, what we need is to transform data. This requires some
assumptions:
1. Transformation can be symmetrical (lossless) and non-symmetrical (lossy)
2. Non-symmetrical transformation can be made symmetrical by providing
additional data about what is missed

Now about Python serialization:
1. The problem is complex.
2. Up to the point that it becomes complicated
3. So there should be a systematic approach

Systematic approach:
1. Define the scope of serialization (the data in Python objects are
fields and their values)
2. Define the output format (JSON, for example)
3. Create a mapping
3.1 Make sure lossy transformations are marked
3.2 Make sure the 'additional data' needed to make them symmetrical is analysed
4. Create transformation rules
5. Alter rules to warn users (or raise) when a transformation fails, with
an explanation why
6. Summarize in documentation the fundamental problems where transformations
are not possible

So, with such a Data Transformation Framework it doesn't matter what the
target format is. You can choose any. You can create your rules and
test your particular chain of objects against the default rules to see if
a transformation is possible, and if not - know why exactly. Full
control -> confidence -> better tools and problem descriptions.

Of course, such a framework doesn't exist yet. ;)
--
anatoly t.

From storchaka at gmail.com  Sat Mar 10 18:33:30 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 10 Mar 2012 19:33:30 +0200
Subject: [Python-ideas] Combining stat/lstat/fstatat etc.
Message-ID:

Python 3 added a large number of "at" functions from the later POSIX
standard. Such a large number of names only litters the namespace. In C it
is necessary for historical reasons (note that fstatat combines stat and
lstat and includes the possibility of extension) and because of static
typing. But in Python, because of dynamic typing, we can extend the
existing functions instead.

So instead of stat, lstat, and fstatat we could use a single function
`stat(path, *, dirfd=None, followlinks=True)`. Then `lstat(path)` ==
`stat(path, followlinks=False)`, and `fstatat(dirfd, path, flags)` ==
`stat(path, dirfd=(None if dirfd == AT_FDCWD else dirfd),
followlinks=not (flags & AT_SYMLINK_NOFOLLOW))`. As the special value for
dirfd I suggest using None, not AT_FDCWD (it's more Pythonic). fstat could
be included too, by specially treating the case where the path is an
integer. And the same for other functions.

The old lstat and lchown would remain for compatibility as light wrappers
around stat; with time they may become deprecated.

From ncoghlan at gmail.com  Tue Mar 13 01:03:08 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 13 Mar 2012 10:03:08 +1000
Subject: [Python-ideas] My objections to implicit package directories
Message-ID:

It seems the consensus at the PyCon US sprints is that implicit package
directories are a wonderful idea and we should have more of those.
I still disagree (emphatically), but am prepared to go along with it so long as my documented objections are clearly and explicitly addressed in the new combined PEP, and the benefits ascribed to implicit package directories in the new PEP are more compelling than "other languages do it that way, so we should too". To save people having to trawl around various mailing list threads and reading through PEP 395, I'm providing those objections in a consolidated form here. If reading these objections in one place causes people to have second thoughts about the wisdom of implicit package directories, even better. 1. Implicit package directories go against the Zen of Python Getting this one out of the way first. As I see it, implicit package directories violate at least 4 of the design principles in the Zen: - Explicit is better than implicit (my calling them implicit package directories is a deliberate rhetorical ploy to harp on this point, although it's also an accurate name) - If the implementation is hard to explain, it's a bad idea (see the section about backwards compatibility challenges) - Readability counts (see the section about introducing ambiguity into filesystem layouts) - Errors should never pass silently (see the section about implicit relative imports from main) 2. Implicit package directories pose awkward backwards compatibility challenges It concerns me gravely that the consensus proposal MvL posted is *backwards incompatible with Python 3.2*, as it deliberately omits one of the PEP 402 features that provided that backwards compatibility. Specifically, under the consensus, a subdirectory "foo" of a directory on sys.path will shadow a "foo.py" or "foo/__init__.py" that appears later on sys.path. As Python 3.2 would have found that latter module/package correctly, this is an unacceptable breach of the backwards compatibility requirements. PEP 402 at least got this right by always executing the first "foo.py" or "foo/__init__.py" it found, even if another "foo" directory was found earlier in sys.path. We can't just wave that additional complexity away if an implicit package directory proposal is going to remain backwards compatible with current layouts (e.g. if an application's starting directory included a "json" subfolder containing json files rather than Python code, the consensus approach as posted by MvL would render the standard library's json module inaccessible) 3. Implicit package directories introduce ambiguity into filesystem layouts With the current Python package design, there is a clear 1:1 mapping between the filesystem layout and the module hierarchy. For example: parent/ # This directory goes on sys.path project/ # The "project" package __init__.py # Explicit package marker code.py # The "project.code" module tests/ # The "project.tests" package __init__.py # Explicit package marker test_code.py # The "projects.tests.test_code" module Any explicit package directory approach will preserve this 1:1 mapping. 
For example, under PEP 382:

    parent/  # This directory goes on sys.path
        project.pyp/  # The "project" package
            code.py  # The "project.code" module
            tests.pyp/  # The "project.tests" package
                test_code.py  # The "project.tests.test_code" module

With implicit package directories, you can no longer tell purely from the code structure which directory is meant to be added to sys.path, as there are at least two valid mappings to the Python module hierarchy:

    parent/  # This directory goes on sys.path
        project/  # The "project" package
            code.py  # The "project.code" module
            tests/  # The "project.tests" package
                test_code.py  # The "project.tests.test_code" module

    parent/
        project/  # This directory goes on sys.path
            code.py  # The "code" module
            tests/  # The "tests" package
                test_code.py  # The "tests.test_code" module

What are implicit package directories buying us in exchange for this inevitable ambiguity? What can we do with them that can't be done with explicit package directories? And no, "Java does it that way" is not a valid argument.

4. Implicit package directories will permanently entrench current newbie-hostile behaviour in __main__

It's a fact of life that Python beginners learn that they can do a quick sanity check on modules they're writing by including an "if __name__ == '__main__':" section at the end and doing one of four things:
- run "python mymodule.py"
- hit F5 (or the relevant hot key) in their IDE
- double click the module in their filesystem browser
- start the Python REPL and do "import mymodule"

However, there are some serious caveats to that as soon as you move the module inside a package:
- if you use explicit relative imports, you can import it, but not run it directly using any of the above methods
- if you rely on implicit relative imports, the above direct execution methods should work most of the time, but you won't be able to import it
- if you use absolute imports for your own package, nothing will work (unless the parent directory for your package is already on sys.path)
- if you only use absolute imports for *other* packages, everything should be fine

The errors you get in these cases are *horrible*. The interpreter doesn't really know what is going on, so it gives the user bad error messages.

In large part, the "Why are my imports broken?" section in PEP 395 exists because I sat down to try to document what does and doesn't work when you attempt to directly execute a module from inside a package directory. In building the list of what would work properly ("python -m" from the parent directory of the package) and what would sometimes break (everything else), I realised that instead of documenting the entire hairy mess, the 1:1 mapping from the filesystem layout to the Python module hierarchy meant we could *just fix it* to not do the wrong thing by default. If implicit package directories are blessed for inclusion in Python 3.3, that opportunity is lost forever - with the loss of the unambiguous 1:1 mapping from the filesystem layout to the module hierarchy, it's no longer possible for the interpreter to figure out the right thing to do without guessing.

PJE proposed that newbies be instructed to add the following boilerplate to their modules if they want to use "if __name__ == '__main__':" for sanity checking:

    import pkgutil
    pkgutil.script_module(__name__, 'project.code.test_code')

This completely defeats the purpose of having explicit relative imports in the language, as it embeds the absolute name of the module inside the module itself.
If a package subtree is ever moved or renamed, you will have to manually fix every script_module() invocation in that subtree. Double-keying data like this is just plain bad design. The package structure should be recorded explicitly in exactly one place: the filesystem.

PJE has other objections to the PEP 395 proposal, specifically relating to its behaviour on package layouts where the directories added to sys.path contain __init__.py files, such that the developer's intent is not accurately reflected in their filesystem layout. Such layouts are *broken*, and the misbehaviour under PEP 395 won't be any worse than the misbehaviour with the status quo (sys.path[0] is set incorrectly in either case; it will just be fixable under PEP 395 by removing the extraneous __init__.py files). A similar argument applies to cases where a parent package __init__ plays games with sys.path (although the PEP 395 algorithm could likely be refined to better handle that situation). Regardless, if implicit package directories are accepted into Python 3.3 in any form, I *will* be immediately marking PEP 395 as Rejected due to incompatibility with an accepted PEP. I'll then (eventually, once I'm less annoyed about the need to do so) write a new PEP to address a subset of the issues previously covered by PEP 395, one that omits any proposals that rely on explicit package directories.

Also, I consider it a requirement that any implicit packages PEP include an update to the tutorial to explain to beginners what will and won't work when they attempt to directly execute a module from inside a Python package. After all, such a PEP is closing off any possibility of ever fixing the problem: it should have to deal with the consequences.

Regards,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From larry at hastings.org Tue Mar 13 01:10:28 2012
From: larry at hastings.org (Larry Hastings)
Date: Mon, 12 Mar 2012 17:10:28 -0700
Subject: [Python-ideas] Save memory when forking with *really* immutable objects
Message-ID: <4F5E9074.70701@hastings.org>

I'll admit in advance that this is in all likelihood a terrible idea. What I'm curious about is why it wouldn't work, or if it wouldn't even help ;-)

One problem for CPython is that it can't share data across processes very often. If you have an application server, and you fork a hundred processes to handle requests, your memory use will be "C * n * p" where C is a constant, n is the number of processes, and p is the average memory consumption of your app. I fear C is very nearly 1.0. Most of Python's memory usage is on the heap, and Python uses its memory to store objects, and objects are reference counted, and reference counts change. So all the COW data pages get written to sooner or later.

The obvious first step: add a magical reference count number that never changes, called Py_REF_ETERNAL. I added this to CPython trunk with a quick hack. It seemed to work; I threw in some asserts to test it, which passed, and it was passing the unit test suite.

I discussed this with Martin who (as usual) made some excellent points. Martin suggests that this wouldn't help unless we could concentrate the Py_REF_ETERNAL objects in their own memory pools in the small block allocator. Otherwise we'd never get a page that didn't get written to sooner or later.

Obviously all interned strings could get Py_REF_ETERNAL. A slightly more controversial idea: mark code objects (but only those that get unmarshaled, says Martin!) as Py_REF_ETERNAL too.
Yes, you can unload code from sys.modules, but in practice if you ever import something you never throw it away for the life of the process. If we went this route we could probably mark most (all?) immutable objects that get unmarshaled with Py_REF_ETERNAL.

Martin's statistics from writing the flexible string representation say that for a toy Django app, memory consumption is mostly strings, and most strings are short (< 16 or even < 8 bytes)... in other words, identifiers. So if you ran 100 toy Django instances it seems likely this would help!

And no I haven't benchmarked it,

/arry

From ncoghlan at gmail.com Tue Mar 13 01:16:32 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 13 Mar 2012 10:16:32 +1000
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To:
References:
Message-ID:

Gah, wrong list. Please don't reply here - that message will be showing up on import-sig shortly.
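A quick way to see the copy-on-write effect Larry describes is to fork a child that merely *reads* a large collection of objects and watch its private memory grow as refcount updates dirty the shared pages. The sketch below is an illustrative assumption, not anyone's posted code: it is Linux-oriented and uses the third-party psutil package for the USS measurement.

    # Rough sketch: "reading" objects after a fork still writes to their
    # pages, because iteration bumps each object's ob_refcnt in place.
    import os
    import time
    import psutil  # third-party; used only to read the child's unique memory

    data = tuple(str(i) for i in range(10**6))  # lots of small string objects

    pid = os.fork()
    if pid == 0:
        for item in data:  # INCREF/DECREF on every item dirties COW pages
            pass
        time.sleep(30)     # keep the child alive so the parent can measure it
        os._exit(0)

    time.sleep(5)  # crude synchronisation; plenty of time for the loop to run
    print(psutil.Process(pid).memory_full_info().uss)  # unique, unshared bytes
    os.waitpid(pid, 0)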
From ericsnowcurrently at gmail.com Tue Mar 13 03:43:26 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 12 Mar 2012 19:43:26 -0700
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To:
References:
Message-ID:

On Mon, Mar 12, 2012 at 5:03 PM, Nick Coghlan wrote:
> It seems the consensus at the PyCon US sprints is that implicit
> package directories are a wonderful idea and we should have more of
> those. I still disagree (emphatically), but am prepared to go along
> with it so long as my documented objections are clearly and
> explicitly addressed in the new combined PEP [...]
>
> [Nick's four objections, quoted verbatim from his message above, snipped]

Hi Nick,

The write-up was a little unclear on a main point, and I think that's contributed to some confusion here. The path search will continue to work in exactly the same way as it does now, with one difference: instead of the current ImportError when nothing matches, the mechanism for namespace packages would be used. That mechanism would create a namespace package with a __path__ matching the paths corresponding to all the namespace package "portions". The likely implementation will simply track the namespace package __path__ during the initial (normal) path search and use it only when there are no matching modules or regular packages. Packages without __init__.py would only be allowed for namespace packages. So effectively namespace packages would be problematic for PEP 395, but normal packages would not be.

Ultimately this is a form of PEP 402 without so much complexity. The trade-off is that it requires a new kind of package.

As far as I understand them, most of your concerns are based on the idea that namespace packages would be included in the initial traversal of sys.path, which is not the case. It sounds like there are a couple of points you made that may still need attention, but hopefully this at least helps clarify what we talked about.

-eric
From greg at krypto.org Tue Mar 13 04:49:04 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Mon, 12 Mar 2012 20:49:04 -0700
Subject: [Python-ideas] Save memory when forking with *really* immutable objects
In-Reply-To: <4F5E9074.70701@hastings.org>
References: <4F5E9074.70701@hastings.org>
Message-ID:

On Mon, Mar 12, 2012 at 5:10 PM, Larry Hastings wrote:
>
> I'll admit in advance that this is in all likelihood a terrible idea.
> What I'm curious about is why it wouldn't work, or if it wouldn't even
> help ;-)
>
> One problem for CPython is that it can't share data across processes
> very often. If you have an application server, and you fork a hundred
> processes to handle requests, your memory use will be "C * n * p" where
> C is a constant, n is the number of processes, and p is the average
> memory consumption of your app. I fear C is very nearly 1.0. Most of
> Python's memory usage is on the heap, and Python uses its memory to
> store objects, and objects are reference counted, and reference counts
> change. So all the COW data pages get written to sooner or later.

Despite me really disliking anything that fork()s these days and generally not using fork anymore... I have been pondering this one on and off over the years as well; it could help people using the fork()ing variant of multiprocessing (i.e., its default today).

If reference counts were moved out of the PyObject structure into a region of memory allocated specifically for reference counts, only those pages would need copying rather than virtually every random page of memory containing a PyObject. My initial thought was to do this by turning the existing refcount field into a pointer to the object's count, or into an array reference that code managing the reference count array would use to manipulate the count. Obviously either of these would have some performance impact and break the ABI.

Some practical, real-world-ish forking-server and multiprocessing-computation memory usage benchmarks need to be put together to measure the impact of any work on that.

> The obvious first step: add a magical reference count number that never
> changes, called Py_REF_ETERNAL. I added this to CPython trunk with a
> quick hack. It seemed to work; I threw in some asserts to test it,
> which passed, and it was passing the unit test suite.
>
> [...]
>
> Obviously all interned strings could get Py_REF_ETERNAL. [...] If we
> went this route we could probably mark most (all?) immutable objects
> that get unmarshaled with Py_REF_ETERNAL.

You have this working? neat. I toyed with making a magic value (-1 or -2 or something) mean "infinite" or "eternal" for ref counts a few years ago, but things were crashing and I really didn't feel like trying to debug that one.

It makes sense for any intern()'ed string to be set to eternal. If you know which objects will be eternal or not at allocation time, clustering them into different pages makes a lot of sense, but I don't believe we express that meaningfully in our code today.
-gps

From ncoghlan at gmail.com Tue Mar 13 07:07:46 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 13 Mar 2012 16:07:46 +1000
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To:
References:
Message-ID:

On Tue, Mar 13, 2012 at 10:03 AM, Nick Coghlan wrote:
> 2. Implicit package directories pose awkward backwards compatibility
> challenges
>
> It concerns me gravely that the consensus proposal MvL posted is
> *backwards incompatible with Python 3.2*, as it deliberately omits one
> of the PEP 402 features that provided that backwards compatibility.
> [...]

It has been pointed out that the above is based on a misreading of MvL's email. So, consider the following backwards compatibility concern instead. Many projects use the following snippet to find a json module:

    try:
        import json
    except ImportError:
        import simplejson as json

Now, this particular snippet should still work fine with implicit package directories (even if a non-Python json directory exists on sys.path), since there *will* be a real json module in the standard library to find, and the simplejson fallback won't be needed. However, for the general case:

    try:
        import foo
    except ImportError:
        import foobar as foo

implicit package directories do pose a backwards compatibility problem: specifically, if "foo" does not exist as a module or explicit package on sys.path, but there is a non-Python "foo/" directory, then "foo" will silently be created as an empty package rather than falling back to "foobar".

Sure, the likelihood of that actually affecting anyone is fairly remote (although all it really takes is one broken uninstaller leaving a "foo" dir in site-packages), but we've rejected proposals in the past over smaller concerns than this.

*Now*, my original comment about the consensus view rejecting complexity from PEP 402 by disregarding backwards compatibility concerns becomes accurate. PEP 402 addressed this issue specifically by disallowing direct imports of implicit packages (only finding them later when searching for submodules).
This is in fact the motivating case given for that behaviour in the PEP: http://www.python.org/dev/peps/pep-0402/#backwards-compatibility-and-performance

So, *why* are we adopting implicit packages again, given all the challenges they pose? What, exactly, is the problem with a ".pyp" extension that makes all this additional complexity the preferred choice? So far, I've only heard two *positive* statements in favour of implicit package directories:

1. Java/Perl/etc do it that way.

I've already made it clear that I don't care about that argument. If it was all that compelling, we'd have implicit self by now. (However, clearly Guido favours it in this case, given his message that arrived while I was writing this one.)

2. It (arguably) makes it easier to convert an existing package into a namespace package.

With implicit package directories, you just delete your empty __init__.py file to turn an existing package into a namespace package. With a PEP 382 style directory suffix, you have to change your directory name to append the ".pyp" (and, optionally, delete your __init__.py file, since it's now going to be ignored anyway).

Barry's also tried to convince me that ".pyp" directories are somehow harder for distributions to deal with, but his only example looked like trying to use "yield from" in Python 3.2 and then complaining when it didn't work.

However, so long as the backwards compatibility from PEP 402 is incorporated, and the new PEP proposes a specific addition to the tutorial to document the "never CD into a package, never double-click a file in a package to execute it, always use -m to execute modules from inside packages" guideline (and makes it clear that you may get strange and unpredictable behaviour if you ever break it), then I can learn to live with it. IDLE should also be updated to allow correct execution of submodules via F5 (I guess it will need some mechanism to be told what working directories to add to sys.path).

It still seems to me that moving to a marker *suffix* (rather than a marker file) as PEP 382 proposes brings all the real practical benefits of implicit package directories (i.e. no empty __init__.py files wasting space) and absolutely *none* of the pain (i.e. no backwards compatibility concerns, no ambiguity in the filesystem to module hierarchy mapping, still able to fix direct execution of modules inside packages rather than having to explain forevermore why it doesn't work), but Guido clearly feels otherwise.

Regards,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com Tue Mar 13 07:35:01 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 13 Mar 2012 16:35:01 +1000
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To:
References:
Message-ID:

On Tue, Mar 13, 2012 at 4:07 PM, Nick Coghlan wrote:
> It still seems to me that moving to a marker *suffix* (rather than a
> marker file) as PEP 382 proposes brings all the real practical
> benefits of implicit package directories [...] but Guido clearly
> feels otherwise.

I think this paragraph really gets to the heart of what I'm objecting to.
I agree wholeheartedly with the objective of eliminating __init__.py files; there's no need to convince me of that. However, *two* proposals were made to that end:

PEP 382 kept the explicit marker, simply changing it to a directory suffix rather than a separate file. Simple, clean, straightforward, minimalist, effective.

PEP 402 threw away the marker entirely, and then had to patch the package finding algorithm with a whole series of complications to avoid breaking backwards compatibility with Python 3.2. It also has the side effect of eliminating the 1:1 mapping between the filesystem and the module hierarchy. Once we lose that, there's no going back.

What I really want out of the new PEP is a clear rationale for why the horrible package finding algorithm hacks needed to make the PEP 402 approach work in a backwards compatible way are to be preferred to the explicitly marked PEP 382 approach, which *doesn't pose a backwards compatibility problem in the first place*.

The other thing to keep in mind is that, if, for whatever reason, we decided further down the road that the explicit directory suffix solution wasn't good enough, then *we could change our minds* and allow implicit package directories after all (just as the formats for valid C extension module names have changed over time).

There's no such freedom with implicit package directories - once they're in, they're in, and we can never introduce a requirement for an explicit marker again without breaking working packages.

Is it so bad that I want us to take baby steps here, rather than jumping straight to the implicit solution?

Regards,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From niki.spahiev at gmail.com Tue Mar 13 11:57:02 2012
From: niki.spahiev at gmail.com (Niki Spahiev)
Date: Tue, 13 Mar 2012 12:57:02 +0200
Subject: [Python-ideas] Save memory when forking with *really* immutable objects
In-Reply-To:
References: <4F5E9074.70701@hastings.org>
Message-ID:

On 13.03.2012 05:49, Gregory P. Smith wrote:
> [...]
> If reference counts were moved out of the PyObject structure into a
> region of memory allocated specifically for reference counts, only
> those pages would need copying rather than virtually every random page
> of memory containing a PyObject. [...]

This looks like the Dalvik VM in Android. They do many things to preserve memory when forking.
HTH,
Niki

From solipsis at pitrou.net Tue Mar 13 12:09:18 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 13 Mar 2012 12:09:18 +0100
Subject: [Python-ideas] Save memory when forking with *really* immutable objects
References: <4F5E9074.70701@hastings.org>
Message-ID: <20120313120918.333053a1@pitrou.net>

On Mon, 12 Mar 2012 17:10:28 -0700, Larry Hastings wrote:
>
> Martin's statistics from writing the flexible string representation say
> that for a toy Django app, memory consumption is mostly strings, and
> most strings are short (< 16 or even < 8 bytes)... in other words,
> identifiers. So if you ran 100 toy Django instances it seems likely
> this would help!

How many MB do you save on a real app, though? By the way, "short strings are identifiers" is a fallacy.

> And no I haven't benchmarked it,

Well, you should.

Regards

Antoine.

From p.f.moore at gmail.com Tue Mar 13 16:35:15 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 13 Mar 2012 15:35:15 +0000
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To:
References:
Message-ID:

On 13 March 2012 06:07, Nick Coghlan wrote:
> However, so long as the backwards compatibility from PEP 402 is
> incorporated, and the new PEP proposed a specific addition to the
> tutorial to document the "never CD into a package, never double-click
> a file in a package to execute it, always use -m to execute modules
> from inside packages" guideline (and makes it clear that you may get
> strange and unpredictable behaviour if you ever break it), then I can
> learn to live with it.

Whoa! I'm not sure I can. I just recently got bitten badly by this for real. The following was what I was doing:

1. I'm writing a package.
2. I'm trying to do the tests-as-I-develop approach (yes, I know I should have been doing this for years - so sue me :-)).
3. I have my tests as a subpackage of the main package.
4. I use the command line.
5. I cd to the tests directory, as that's the easiest way to edit tests: gvim test_dow to edit test_download_binaries.py.

And yes, I had endless trouble trying to work out why I can't then run the tests from the command line. I consider the trouble I have as a bug - it *should* work, in my view. I understand why what I'm doing is an edge case, but intuitively, I don't like it not working.

I can change my practices, or use an IDE, or something. But my workflow will be less comfortable for me, and I don't feel that I understand why I should have to.

I *think* that what Nick is proposing is a fix for this (and if it isn't, can you fix this scenario too, please Nick? :-)), and the idea that it's going to get documented as "don't do that" strikes me as unfriendly, if not outright wrong. I also don't think it should be documented in the tutorial - it's not something a new developer hits; it's later on, when people are writing bigger, more complex code, that they hit it.

In summary, I guess I support Nick's objections 3 and 4. We need a better response than just "don't do that", IMHO.

Paul.
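To make Paul's scenario concrete, here is a sketch of the failure mode (the layout and file names are reconstructed for illustration, not taken from his actual project, and the exact error text varies by Python version):

    project/
        __init__.py
        download.py
        tests/
            __init__.py
            test_download_binaries.py   # contains: from .. import download

    $ cd project/tests
    $ python test_download_binaries.py
    # fails: the module runs as __main__ with no parent package, so the
    # explicit relative import raises an error

    $ cd /path/to/parent    # the directory *containing* project/
    $ python -m project.tests.test_download_binaries
    # works: the interpreter can resolve the full package hierarchy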
From guido at python.org Tue Mar 13 16:57:43 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 13 Mar 2012 08:57:43 -0700
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To:
References:
Message-ID:

On Tue, Mar 13, 2012 at 8:35 AM, Paul Moore wrote:
> Whoa! I'm not sure I can. I just recently got bitten badly by this for
> real. [...]
> In summary, I guess I support Nick's objections 3 and 4. We need a
> better response than just "don't do that", IMHO.

Oh, but there are several other solutions. For example, if you set PYTHONPATH to the directory *containing* your toplevel package, new code could be added that will discover the true toplevel no matter where you are. This code doesn't exist today, but Nick is proposing something similar looking for __init__.py files; the code that tries to find the script directory as a subdirectory of some path on sys.path could be added there. Also, the code Nick currently proposes for PEP 395 is still useful if you add __init__.py files to your package - if you did that, you wouldn't have to set PYTHONPATH (assuming we do add this code to Python 3.3, of course).

--
--Guido van Rossum (python.org/~guido)

From guido at python.org Tue Mar 13 17:15:14 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 13 Mar 2012 09:15:14 -0700
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To:
References:
Message-ID:

On Mon, Mar 12, 2012 at 11:35 PM, Nick Coghlan wrote:
> I think this paragraph really gets to the heart of what I'm objecting to.
> [...]
> Is it so bad that I want us to take baby steps here, rather than
> jumping straight to the implicit solution?

I think it comes down to this: I really, really, really hate directories with a suffix.

I'd like to point out that the suffix is also introducing a backwards incompatibility: everybody will have to teach their tools, IDEs, and brains about .pyp directories, and they will also have to *rename* their directories (*if* they want to benefit from the new feature). Renaming directories is a huge pain - I counted over 400 directories in Django, so that would mean over 400 renames. In my experience renaming a directory is a huge pain no matter which version control system you use - yes, it can be done, and modern VCSes have some support for renaming, but it's still a huge mess. Importing patches will be painful. Producing diffs across the renames will be hugely painful. I just think there are too many tools that won't know how to deal with this.

(I just did a little experiment: I cloned a small project using Hg and renamed one directory. Then I made a small change to one of the files whose parent was renamed. I have not figured out how to get a diff between the latest version of that file and any version before the mass renaming; the renaming is shown as a delete of the entire old file and an add of the entire new file. Even if you can tell me how to do this, my point stays: it's not easy to figure out. Similarly for logs: by default, "hg log" stops at the rename. You must add --follow to see logs across the rename.)
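For reference, a sketch of the Mercurial commands involved in the experiment described above (the project and file names are invented for illustration; the commands themselves are standard hg):

    $ hg mv project project.pyp              # rename the package directory
    $ hg commit -m "adopt PEP 382 suffix"
    $ hg log project.pyp/code.py             # history stops at the rename
    $ hg log --follow project.pyp/code.py    # --follow crosses the rename
    $ hg diff -r 0 -r tip                    # the rename still appears as a
                                             # whole-file delete plus add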
And regardless of which PEP we adopt, there will still be two types of package directories: PEP 382 still maintains backwards compatibility with directories that don't have a suffix but do have an __init__.py. So the unification still remains elusive.

And at the end of the day I still really, really, really hate directories with a suffix.

--
--Guido van Rossum (python.org/~guido)

From guido at python.org Tue Mar 13 17:24:56 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 13 Mar 2012 09:24:56 -0700
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To:
References:
Message-ID:

On Mon, Mar 12, 2012 at 11:07 PM, Nick Coghlan wrote:
> So far, I've only heard two *positive* statements in favour of
> implicit package directories:
>
> 1. Java/Perl/etc do it that way.
>
> I've already made it clear that I don't care about that argument. If
> it was all that compelling, we'd have implicit self by now. (However,
> clearly Guido favours it in this case, given his message that arrived
> while I was writing this one)

Honestly, I don't really care about "compatibility" with Java or Perl. However, that *both* of those languages do it this way (BTW, what does Ruby do?) is an argument that this is a *natural* or *intuitive* way of setting things up. In fact, Python today also uses this: a package P lives in a directory named P. Plain and simple. Users can immediately understand this. Collapsing multiple directories named P along sys.path is also pretty natural, given that we already have the behavior of searching along sys.path. The requirement of having an __init__.py file, however, is a wart.

> 2. It (arguably) makes it easier to convert an existing package into a
> namespace package
>
> With implicit package directories, you just delete your empty
> __init__.py file to turn an existing package into a namespace package.
> With a PEP 382 style directory suffix, you have to change your
> directory name to append the ".pyp" (and, optionally, delete your
> __init__.py file, since it's now going to be ignored anyway).
>
> Barry's also tried to convince me that ".pyp" directories are somehow
> harder for distributions to deal with, but his only example looked
> like trying to use "yield from" in Python 3.2 and then complaining
> when it didn't work.

I hope I've added some indication that it's also harder to deal with in version control systems.

> However, so long as the backwards compatibility from PEP 402 is
> incorporated, and the new PEP proposed a specific addition to the
> tutorial to document the "never CD into a package, never double-click
> a file in a package to execute it, always use -m to execute modules
> from inside packages" guideline (and makes it clear that you may get
> strange and unpredictable behaviour if you ever break it), then I can
> learn to live with it. IDLE should also be updated to allow correct
> execution of submodules via F5 (I guess it will need some mechanism to
> be told what working directories to add to sys.path).

Those are all sensible requests.
> It still seems to me that moving to a marker *suffix* (rather than a
> marker file) as PEP 382 proposes brings all the real practical
> benefits of implicit package directories (i.e. no empty __init__.py
> files wasting space) and absolutely *none* of the pain (i.e. no
> backwards compatibility concerns, no ambiguity in the filesystem to
> module hierarchy mapping, still able to fix direct execution of
> modules inside packages rather than having to explain forevermore why
> it doesn't work), but Guido clearly feels otherwise.

I expect pain in different places.

--
--Guido van Rossum (python.org/~guido)

From guido at python.org Tue Mar 13 17:26:55 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 13 Mar 2012 09:26:55 -0700
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To:
References:
Message-ID:

Oh, shit. Nick posted a bunch of messages to python-ideas instead of import-sig, and I followed up there. Instead of reposting, I'm just going to suggest that people interested in this discussion will, unfortunately, have to follow both lists.

--
--Guido van Rossum (python.org/~guido)

From p.f.moore at gmail.com Tue Mar 13 20:24:23 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Tue, 13 Mar 2012 19:24:23 +0000
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To:
References:
Message-ID:

On 13 March 2012 15:57, Guido van Rossum wrote:
> Oh, but there are several other solutions. For example, if you set
> PYTHONPATH to the directory *containing* your toplevel package, new
> code could be added that will discover the true toplevel no matter
> where you are. [...]

I tend not to use PYTHONPATH - I'm on Windows, and environment variables aren't the obvious solution there, so I tend to forget. Also, I tend to initially develop projects in subdirectories of my "junk" directory, which has all sorts of cruft in it, including .py files with random names. So setting PYTHONPATH to that could introduce all sorts of things into my namespace, which is a bit less than ideal.

OTOH, I don't have a problem with __init__.py files, so something that correctly autodetects the right thing to add to sys.path based on the presence of __init__ files would be fine.

All of which assumes that me simply being more organised isn't the real answer here :-)

Paul.

From ben+python at benfinney.id.au Tue Mar 13 21:42:32 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Wed, 14 Mar 2012 07:42:32 +1100
Subject: [Python-ideas] My objections to implicit package directories
References:
Message-ID: <874ntsi66f.fsf@benfinney.id.au>

Guido van Rossum writes:

> And at the end of the day I still really, really, really hate
> directories with a suffix.

+1

--
 \        "Good morning, Pooh Bear", said Eeyore gloomily. "If it is a |
  `\       good morning", he said. "Which I doubt", said he. --A. A. |
_o__)                                    Milne, _Winnie-the-Pooh_ |
Ben Finney

From ncoghlan at gmail.com Tue Mar 13 22:33:49 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 14 Mar 2012 07:33:49 +1000
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To:
References:
Message-ID:

On Mar 14, 2012 5:24 AM, "Paul Moore" wrote:
> OTOH, I don't have a problem with __init__.py files, so something that
> correctly autodetects the right thing to add to sys.path based on the
> presence of __init__ files would be fine.
I set up my projects the same way you do - it's a good, self-contained structure. And beginners (at least the ones that used Stack Overflow when I was spending time there) seemed to like it as well. That's the reason PEP 395 uses it as its main example.

Over on import-sig, Eric Snow suggested a revised implicit-package-tolerant search algorithm that's too slow to use on interpreter start up, but should be fine for generating better error messages if __main__ or an interactive import fails with ImportError, so I'll likely revise PEP 395 to propose that instead.

Cheers,
Nick.

--
Sent from my phone, thus the relative brevity :)

From ncoghlan at gmail.com Tue Mar 13 23:53:14 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 14 Mar 2012 08:53:14 +1000
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To:
References:
Message-ID:

On Wed, Mar 14, 2012 at 2:24 AM, Guido van Rossum wrote:
> I hope I've added some indication that it's also harder to deal with
> in version control systems.

Yeah, given that part of my argument when I updated PEP 414 was "the lack of explicit unicode literals creates useless noise in version control diffs", I can hardly fault you for using a similar argument against changing package directory names! Hopefully Eric can capture this clearly in the new PEP so future readers will have a clear understanding of the trade-offs involved.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ben+python at benfinney.id.au Wed Mar 14 01:36:09 2012
From: ben+python at benfinney.id.au (Ben Finney)
Date: Wed, 14 Mar 2012 11:36:09 +1100
Subject: [Python-ideas] My objections to implicit package directories
References:
Message-ID: <87pqcgggsm.fsf@benfinney.id.au>

Nick Coghlan writes:

> On Wed, Mar 14, 2012 at 2:24 AM, Guido van Rossum wrote:
> > I hope I've added some indication that it's also harder to deal with
> > in version control systems.
>
> Yeah, given that part of my argument when I updated PEP 414 was "the
> lack of explicit unicode literals creates useless noise in version
> control diffs", I can hardly fault you for using a similar argument
> against changing package directory names!

Are you convinced by the argument that a directory representing a package should be named exactly the same as the package? That's the most convincing reason I can see (though many other reasons are strong too) for not introducing special cases for the name of the package's directory.

> Hopefully Eric can capture this clearly in the new PEP so future
> readers will have a clear understanding of the trade-offs involved.

Agreed. Thanks for encouraging discussion and recording it, Eric.

--
 \     "Pinky, are you pondering what I'm pondering?" "I think so, |
  `\    Brain, but Zero Mostel times anything will still give you Zero |
_o__)                            Mostel." --_Pinky and The Brain_ |
Ben Finney

From ubershmekel at gmail.com Wed Mar 14 17:34:30 2012
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Wed, 14 Mar 2012 18:34:30 +0200
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To: <87pqcgggsm.fsf@benfinney.id.au>
References: <87pqcgggsm.fsf@benfinney.id.au>
Message-ID:

I've always had trouble understanding and explaining the complexities and intricacies of Python packaging. Is there a most basic but comprehensive list of use cases? IIUC they are:

* E.g. the standard library - import from a list of paths to be searched.
* E.g. this project - import from a relative path based on this file's current directory (which Python has an odd syntax for).
* E.g. distributed packages and virtualenv - import from a relative path based on an anchor directory.

If we were to start completely from scratch, would this problem be an easy one?

Yuval Greenfield

From anacrolix at gmail.com Wed Mar 14 18:36:27 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Thu, 15 Mar 2012 01:36:27 +0800
Subject: [Python-ideas] set.add could return True or False
Message-ID:

set.add(x) could return True if x was added to the set, and False if x was already in the set.

Adding an element that is already present often constitutes an error in my code.

As I understand, set.add is an atomic operation. Having set.add return a boolean will also allow EAFP-style code with regard to handling duplicates, the long-winded form of which is currently:

    if a not in b:
        b.add(a)  # <-- race condition
        do_c()

Which can be improved to:

    if b.add(a):
        do_c()

Advantages:
* Very common code pattern.
* More concise.
* Allows interpreter atomicity to be exploited, often removing the need for additional locking.
* Faster because it avoids a double containment check, and can avoid locking.

From andrew.svetlov at gmail.com Wed Mar 14 18:46:48 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Wed, 14 Mar 2012 10:46:48 -0700
Subject: [Python-ideas] set.add could return True or False
In-Reply-To:
References:
Message-ID:

You can still get a race condition:

    if b.add(a):
        # some thread removes a from b before do_c is called...
        do_c()

An explicit lock is still required for your case.

On Wed, Mar 14, 2012 at 10:36 AM, Matt Joiner wrote:
> set.add(x) could return True if x was added to the set, and False if x
> was already in the set.
> [...]

From masklinn at masklinn.net Wed Mar 14 18:53:50 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 14 Mar 2012 18:53:50 +0100
Subject: [Python-ideas] set.add could return True or False
In-Reply-To:
References:
Message-ID: <5315C5BD-DFBA-461B-949F-8135636B7A70@masklinn.net>

On 2012-03-14, at 18:36 , Matt Joiner wrote:
> set.add(x) could return True if x was added to the set, and False if x
> was already in the set.

That does not mesh with the usual Python semantics of methods either having a side-effect (mutation) or returning a value. Why would that happen with sets but not with e.g. dicts?

> Adding an element that is already present often constitutes an error in my code.

Then throw an error when that happens?

> As I understand, set.add is an atomic operation.
> Having set.add return a boolean will also allow EAFP-style code with
> regard to handling duplicates, the long winded form of which is currently:
>
> if a not in b:
>     b.add(a) <-- race condition
>     do_c()
>
> Which can be improved to:
>
> if b.add(a):
>     do_c()
>
> Advantages:
>  * Very common code pattern.
>  * More concise.
>  * Allows interpreter atomicity to be exploited, often removing the
> need for additional locking.
>  * Faster because it avoids double contain check, and can avoid locking.

Nope, as Andrew noted it's possible that another thread has *removed* the element from the set before do_c (so you've got your race condition right there, assuming do_c expects `a` to be in `b`)

And since you're using a set `b.add(a)` is a noop if `a` is already in `b` so there's no race condition at the point you note. The race condition is instead in the condition itself, you can have two different threads finding out the value is not in the set and ultimately executing do_c.

From pyideas at rebertia.com  Wed Mar 14 19:11:07 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Wed, 14 Mar 2012 11:11:07 -0700
Subject: [Python-ideas] set.add could return True or False
In-Reply-To: <5315C5BD-DFBA-461B-949F-8135636B7A70@masklinn.net> References: <5315C5BD-DFBA-461B-949F-8135636B7A70@masklinn.net> Message-ID:

On Wed, Mar 14, 2012 at 10:53 AM, Masklinn wrote:
> On 2012-03-14, at 18:36 , Matt Joiner wrote:
>> set.add(x) could return True if x was added to the set, and False if x
>> was already in the set.
>
> That does not mesh with the usual Python semantics of methods either
> having a side-effect (mutation) or returning a value. Why would that
> happen with sets but not with e.g. dicts?

The rule is a bit more complicated than that (e.g. consider list.pop()). It gets fleshed out well in:
http://bugs.python.org/issue12192

set.remove() arguably "returns" the same sort of indication as that which is proposed, in that it either raises or doesn't raise KeyError depending on whether the value was present.

But yeah, these boolean return values aren't of huge utility, particularly in the multithreaded case.

Cheers,
Chris

From anacrolix at gmail.com  Wed Mar 14 19:26:21 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Thu, 15 Mar 2012 02:26:21 +0800
Subject: [Python-ideas] set.add could return True or False
In-Reply-To: References: Message-ID:

This is an example only.

On Mar 15, 2012 1:46 AM, "Andrew Svetlov" wrote:
> You still can get a race condition:
>
> if b.add(a):
>     # some thread removes a from b before do_c call...
>     do_c()
>
> An explicit lock is still required for your case.
>
> On Wed, Mar 14, 2012 at 10:36 AM, Matt Joiner wrote:
> > set.add(x) could return True if x was added to the set, and False if x
> > was already in the set.
> >
> > Adding an element that is already present often constitutes an error in my code.
> >
> > As I understand, set.add is an atomic operation. Having set.add return
> > a boolean will also allow EAFP-style code with regard to handling
> > duplicates, the long winded form of which is currently:
> >
> > if a not in b:
> >     b.add(a) <-- race condition
> >     do_c()
> >
> > Which can be improved to:
> >
> > if b.add(a):
> >     do_c()
> >
> > Advantages:
> >  * Very common code pattern.
> >  * More concise.
> >  * Allows interpreter atomicity to be exploited, often removing the
> > need for additional locking.
> >  * Faster because it avoids double contain check, and can avoid locking.
From anacrolix at gmail.com  Wed Mar 14 19:29:27 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Thu, 15 Mar 2012 02:29:27 +0800
Subject: [Python-ideas] set.add could return True or False
In-Reply-To: References: <5315C5BD-DFBA-461B-949F-8135636B7A70@masklinn.net> Message-ID:

I should not have emphasized the atomicity here, this was not intended to be the main reason.

On Mar 15, 2012 2:11 AM, "Chris Rebert" wrote:
> On Wed, Mar 14, 2012 at 10:53 AM, Masklinn wrote:
> > On 2012-03-14, at 18:36 , Matt Joiner wrote:
> >> set.add(x) could return True if x was added to the set, and False if x
> >> was already in the set.
> >
> > That does not mesh with the usual Python semantics of methods either
> > having a side-effect (mutation) or returning a value. Why would that
> > happen with sets but not with e.g. dicts?
>
> The rule is a bit more complicated than that (e.g. consider
> list.pop()). It gets fleshed out well in:
> http://bugs.python.org/issue12192
>
> set.remove() arguably "returns" the same sort of indication as that
> which is proposed, in that it either raises or doesn't raise KeyError
> depending on whether the value was present.
>
> But yeah, these boolean return values aren't of huge utility,
> particularly in the multithreaded case.
>
> Cheers,
> Chris

From guido at python.org  Wed Mar 14 19:32:49 2012
From: guido at python.org (Guido van Rossum)
Date: Wed, 14 Mar 2012 11:32:49 -0700
Subject: [Python-ideas] set.add could return True or False
In-Reply-To: References: <5315C5BD-DFBA-461B-949F-8135636B7A70@masklinn.net> Message-ID:

I think it would actually be reasonable to return a true/false value here (there are plenty of non-threaded uses for this), except I think it's too late to change now -- it would just encourage folks to write code that works with Python 3.3 but is very subtly broken with earlier versions. So, -1.

--Guido

On Wed, Mar 14, 2012 at 11:29 AM, Matt Joiner wrote:
> I should not have emphasized the atomicity here, this was not intended to be
> the main reason.
>
> On Mar 15, 2012 2:11 AM, "Chris Rebert" wrote:
>>
>> On Wed, Mar 14, 2012 at 10:53 AM, Masklinn wrote:
>> > On 2012-03-14, at 18:36 , Matt Joiner wrote:
>> >> set.add(x) could return True if x was added to the set, and False if x
>> >> was already in the set.
>> >
>> > That does not mesh with the usual Python semantics of methods either
>> > having a side-effect (mutation) or returning a value. Why would that
>> > happen with sets but not with e.g. dicts?
>>
>> The rule is a bit more complicated than that (e.g. consider
>> list.pop()). It gets fleshed out well in:
>> http://bugs.python.org/issue12192
>>
>> set.remove() arguably "returns" the same sort of indication as that
>> which is proposed, in that it either raises or doesn't raise KeyError
>> depending on whether the value was present.
>>
>> But yeah, these boolean return values aren't of huge utility,
>> particularly in the multithreaded case.
>>
>> Cheers,
>> Chris

--
--Guido van Rossum (python.org/~guido)

From anacrolix at gmail.com  Wed Mar 14 19:34:42 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Thu, 15 Mar 2012 02:34:42 +0800
Subject: [Python-ideas] set.add could return True or False
In-Reply-To: <5315C5BD-DFBA-461B-949F-8135636B7A70@masklinn.net> References: <5315C5BD-DFBA-461B-949F-8135636B7A70@masklinn.net> Message-ID:

On Mar 15, 2012 1:53 AM, "Masklinn" wrote:
>
> On 2012-03-14, at 18:36 , Matt Joiner wrote:
> > set.add(x) could return True if x was added to the set, and False if x
> > was already in the set.
>
> That does not mesh with the usual Python semantics of methods either
> having a side-effect (mutation) or returning a value. Why would that
> happen with sets but not with e.g. dicts?

Because dict insertions are by operator?

>
> > Adding an element that is already present often constitutes an error in my code.
>
> Then throw an error when that happens?
>
> > As I understand, set.add is an atomic operation. Having set.add return
> > a boolean will also allow EAFP-style code with regard to handling
> > duplicates, the long winded form of which is currently:
> >
> > if a not in b:
> >     b.add(a) <-- race condition
> >     do_c()
> >
> > Which can be improved to:
> >
> > if b.add(a):
> >     do_c()
> >
> > Advantages:
> >  * Very common code pattern.
> >  * More concise.
> >  * Allows interpreter atomicity to be exploited, often removing the
> > need for additional locking.
> >  * Faster because it avoids double contain check, and can avoid locking.
>
> Nope, as Andrew noted it's possible that another thread has *removed*
> the element from the set before do_c (so you've got your race condition
> right there, assuming do_c expects `a` to be in `b`)
>
> And since you're using a set `b.add(a)` is a noop if `a` is already in
> `b` so there's no race condition at the point you note. The race
> condition is instead in the condition itself, you can have two different
> threads finding out the value is not in the set and ultimately executing
> do_c.

There's still a performance cost in looking up the already present value a second time.

From anacrolix at gmail.com  Wed Mar 14 19:42:28 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Thu, 15 Mar 2012 02:42:28 +0800
Subject: [Python-ideas] set.add could return True or False
In-Reply-To: References: <5315C5BD-DFBA-461B-949F-8135636B7A70@masklinn.net> Message-ID:

There's not much chance of that now, most projects that could cause trouble with this are using six, or moving up from 2. By comparison many other things in Python 3.3 are very non backwards friendly like delegating generators, new exception hierarchy, and a crap load of new API calls. Now is the hour! :>

On Mar 15, 2012 2:33 AM, "Guido van Rossum" wrote:
> I think it would actually be reasonable to return a true/false value
> here (there are plenty of non-threaded uses for this), except I think
> it's too late to change now -- it would just encourage folks to write
> code that works with Python 3.3 but is very subtly broken with
> earlier versions. So, -1.
>
> --Guido
>
> On Wed, Mar 14, 2012 at 11:29 AM, Matt Joiner wrote:
> > I should not have emphasized the atomicity here, this was not intended to be
> > the main reason.
> >
> > On Mar 15, 2012 2:11 AM, "Chris Rebert" wrote:
> >>
> >> On Wed, Mar 14, 2012 at 10:53 AM, Masklinn wrote:
> >> > On 2012-03-14, at 18:36 , Matt Joiner wrote:
> >> >> set.add(x) could return True if x was added to the set, and False if x
> >> >> was already in the set.
> >> >
> >> > That does not mesh with the usual Python semantics of methods either
> >> > having a side-effect (mutation) or returning a value. Why would that
> >> > happen with sets but not with e.g. dicts?
> >>
> >> The rule is a bit more complicated than that (e.g. consider
> >> list.pop()). It gets fleshed out well in:
> >> http://bugs.python.org/issue12192
> >>
> >> set.remove() arguably "returns" the same sort of indication as that
> >> which is proposed, in that it either raises or doesn't raise KeyError
> >> depending on whether the value was present.
> >>
> >> But yeah, these boolean return values aren't of huge utility,
> >> particularly in the multithreaded case.
> >>
> >> Cheers,
> >> Chris
>
> --
> --Guido van Rossum (python.org/~guido)

From techtonik at gmail.com  Wed Mar 14 20:28:44 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 14 Mar 2012 21:28:44 +0200
Subject: [Python-ideas] Logging2 with default NullHandler
Message-ID:

Badly need a `logging2` module that has NullHandler assigned by default for all loggers.
http://packages.python.org/Logbook/api/handlers.html#logbook.NullHandler

Why? Because logging fails to play well with libraries:

    import logging
    log = logging.getLogger(__name__)
    log.warn("WARN")

    No handlers could be found for logger "spyderlib.utils.bsdsocket"

What do I want from library logging as a Python application developer? Nothing until I explicitly set up default behaviour.

logging can not be changed, and leaving everything as-is is a PITA, that's why I am proposing logging2 as the only viable solution.

--
anatoly t.

From raymond.hettinger at gmail.com  Wed Mar 14 20:48:17 2012
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Wed, 14 Mar 2012 12:48:17 -0700
Subject: [Python-ideas] set.add could return True or False
In-Reply-To: References: <5315C5BD-DFBA-461B-949F-8135636B7A70@masklinn.net> Message-ID:

On Mar 14, 2012, at 11:34 AM, Matt Joiner wrote:
> There's still a performance cost in looking up the already present value a second time.

The performance cost is near zero. The relevant memory accesses will be in cache (which makes accessing them almost free). Besides, the price you pay for storing and testing the boolean value isn't free either. Thinking like a C programmer won't help you in the optimization game with sets. And, it's certainly not worth complexifying the API.

Raymond

P.S. There have been previous threads on this same subject and they've all ended with rejecting the proposal.
From jimjjewett at gmail.com  Wed Mar 14 23:07:14 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Wed, 14 Mar 2012 18:07:14 -0400
Subject: [Python-ideas] Save memory when forking with *really* immutable objects
In-Reply-To: <4F5E9074.70701@hastings.org> References: <4F5E9074.70701@hastings.org> Message-ID:

On Mon, Mar 12, 2012 at 8:10 PM, Larry Hastings wrote:
> The obvious first step: add a magical reference count number that never
> changes, called Py_REF_ETERNAL.

If you have a magic number, you need to check before doing the update; at some point in the distant past, that was considered too expensive because it is done so often.

But once you do pay the cost of a more expensive refcount update, this isn't the only optimization available. For example, the incref/decref can be delayed or batched up, which can help with remote objects or incremental garbage collection. Gating reference acquisition may also be re-purposed to serve as thread-locking, or to more efficiently support Software Transactional Memory.

> Martin suggests that this wouldn't help unless we could concentrate the
> Py_REF_ETERNAL objects in their own memory pools in the small block
> allocator.

Right; it makes sense to have the incref/decref function be per-arena, or at least per page or some such.

-jJ

From stefan_ml at behnel.de  Thu Mar 15 07:20:35 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Thu, 15 Mar 2012 07:20:35 +0100
Subject: [Python-ideas] Save memory when forking with *really* immutable objects
In-Reply-To: References: <4F5E9074.70701@hastings.org> Message-ID:

Jim Jewett, 14.03.2012 23:07:
> On Mon, Mar 12, 2012 at 8:10 PM, Larry Hastings wrote:
>> The obvious first step: add a magical reference count number that never
>> changes, called Py_REF_ETERNAL.
>
> If you have a magic number, you need to check before doing the update;
> at some point in the distant past, that was considered too expensive
> because it is done so often.

Well, we could switch to a floating point value for the refcount and let the CPU do it for us by using +inf as magic value.

(this is python-ideas, right?)

Stefan

From niki.spahiev at gmail.com  Thu Mar 15 13:24:34 2012
From: niki.spahiev at gmail.com (Niki Spahiev)
Date: Thu, 15 Mar 2012 14:24:34 +0200
Subject: [Python-ideas] Logging2 with default NullHandler
In-Reply-To: References: Message-ID:

On 14.03.2012 21:28, anatoly techtonik wrote:
> Badly need a `logging2` module that has NullHandler assigned by default
> for all loggers.
> http://packages.python.org/Logbook/api/handlers.html#logbook.NullHandler
>
> Why? Because logging fails to play well with libraries:
>
>     import logging
>     log = logging.getLogger(__name__)
>     log.warn("WARN")
>
>     No handlers could be found for logger "spyderlib.utils.bsdsocket"
>
> What do I want from library logging as a Python application developer?
> Nothing until I explicitly set up default behaviour.
>
> logging can not be changed, and leaving everything as-is is a PITA,
> that's why I am
> proposing logging2 as the only viable solution.

Can't this be solved with a new function? e.g.
  log = logging.getLibLogger(__name__)

Niki

From techtonik at gmail.com  Thu Mar 15 16:54:24 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Thu, 15 Mar 2012 17:54:24 +0200
Subject: [Python-ideas] Logging2 with default NullHandler
In-Reply-To: References: Message-ID:

On Thu, Mar 15, 2012 at 3:24 PM, Niki Spahiev wrote:
> On 14.03.2012 21:28, anatoly techtonik wrote:
>> Badly need a `logging2` module that has NullHandler assigned by default
>> for all loggers.
>> http://packages.python.org/Logbook/api/handlers.html#logbook.NullHandler
>>
>> Why? Because logging fails to play well with libraries:
>>
>>     import logging
>>     log = logging.getLogger(__name__)
>>     log.warn("WARN")
>>
>>     No handlers could be found for logger "spyderlib.utils.bsdsocket"
>>
>> What do I want from library logging as a Python application developer?
>> Nothing until I explicitly set up default behaviour.
>>
>> logging can not be changed, and leaving everything as-is is a PITA,
>> that's why I am
>> proposing logging2 as the only viable solution.
>
> Can't this be solved with a new function? e.g.
>   log = logging.getLibLogger(__name__)

I don't know - there are questions. The function is used to get a logger with NullHandler() if the application did not provide configuration for the root logger. So:

1. How will logging know that the application provided root logger configuration?
2. How will logging know that it should use the root logger handler instead of a more specific handler for the __name__ module?

This comes to the second part - what if logging.getLibLogger(__name__) is called after the application root handler is configured?

1. Will a new, more specific NullHandler() override the active root configuration?

There is one more concern. If anything in the code base (including 3rd party modules) calls logging.log() or friends before the root logger is configured by the application, the root logger becomes automatically configured, making everything above useless.

--
anatoly t.

From cs at zip.com.au  Thu Mar 15 22:32:58 2012
From: cs at zip.com.au (Cameron Simpson)
Date: Fri, 16 Mar 2012 08:32:58 +1100
Subject: [Python-ideas] Logging2 with default NullHandler
In-Reply-To: References: Message-ID: <20120315213257.GA1800@cskk.homeip.net>

On 14Mar2012 21:28, anatoly techtonik wrote:
| Badly need a `logging2` module that has NullHandler assigned by default
| for all loggers.
| http://packages.python.org/Logbook/api/handlers.html#logbook.NullHandler
|
| Why? Because logging fails to play well with libraries:
|
|     import logging
|     log = logging.getLogger(__name__)
|     log.warn("WARN")
|
|     No handlers could be found for logger "spyderlib.utils.bsdsocket"
|
| What do I want from library logging as a Python application developer?
| Nothing until I explicitly set up default behaviour.

Fair point.

Conversely, almost every app I write commences thus:

  from cs.logutils import setup_logging

  def main(argv):
    setup_logging()
    ... main code ...

That sends to stderr with frills. Finer grained setup can come later.

I _think_ I prefer logging's current behaviour:

- I do want a big fat warning if I forget to configure logging at all
- I don't want libraries doing sufficient work at import time to warrant logging anything

Cheers,
--
Cameron Simpson DoD#743
http://www.cskk.ezoshosting.com/cs/

Silicon chips with a cardboard substrate? That's not a good marriage!
- overhead by WIRED at the Intelligent Printing conference Oct2006

From techtonik at gmail.com  Thu Mar 15 23:39:48 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Fri, 16 Mar 2012 00:39:48 +0200
Subject: [Python-ideas] Logging2 with default NullHandler
In-Reply-To: <20120315213257.GA1800@cskk.homeip.net> References: <20120315213257.GA1800@cskk.homeip.net> Message-ID:

On Fri, Mar 16, 2012 at 12:32 AM, Cameron Simpson wrote:
> On 14Mar2012 21:28, anatoly techtonik wrote:
> | Badly need a `logging2` module that has NullHandler assigned by default
> | for all loggers.
> | http://packages.python.org/Logbook/api/handlers.html#logbook.NullHandler
> |
> | Why? Because logging fails to play well with libraries:
> |
> |     import logging
> |     log = logging.getLogger(__name__)
> |     log.warn("WARN")
> |
> |     No handlers could be found for logger "spyderlib.utils.bsdsocket"
> |
> | What do I want from library logging as a Python application developer?
> | Nothing until I explicitly set up default behaviour.
>
> Fair point.
>
> Conversely, almost every app I write commences thus:
>
>   from cs.logutils import setup_logging
>
>   def main(argv):
>     setup_logging()
>     ... main code ...
>
> That sends to stderr with frills. Finer grained setup can come later.

But that makes all such libraries dependent on cs.logutils, which is not an option. In my case these libraries are self-contained PySide widgets that can be used standalone or from an IDE. Adding a dependency on cs.logutils makes them too tied to the application.

> I _think_ I prefer logging's current behaviour:
>
> - I do want a big fat warning if I forget to configure logging at all

Unless your application uses alternative logging means and doesn't use logging at all (unlike 3rd party libraries it uses).

> - I don't want libraries doing sufficient work at import time to
>   warrant logging anything

I do not want libraries doing any logging setup work at import either. They should just need to contain 'extension points' for plugging log handlers in case I need to debug these modules.

--
anatoly t.

From larry at hastings.org  Fri Mar 16 01:48:24 2012
From: larry at hastings.org (Larry Hastings)
Date: Thu, 15 Mar 2012 17:48:24 -0700
Subject: [Python-ideas] Combining stat/lstat/fstatat etc.
In-Reply-To: References: Message-ID: <4F628DD8.1000906@hastings.org>

Guido steered me towards this suggestion. I think it's a fine idea. I'm in the process of adding keyword-only arguments to C extension argument parsing (issue #14328), and after that adding a new ns= parameter to os.utime (issue #14127). After that I may take a stab at this, unless you would prefer to be the implementor.

Could you please create an issue on the bug tracker to discuss/track this? I'm happy to create the issue if you prefer.

//arry/

On 03/10/2012 09:33 AM, Serhiy Storchaka wrote:
> In Python 3 a large number of "at"-functions from the later POSIX standard were added. Such a large number of names only litters the namespace. In C it is necessary for historical reasons (note that fstatat combines the stat and lstat and includes the possibility of extension) and because of static typing. But in Python, because of dynamic typing, we can extend existing functions.
>
> So instead of stat, lstat, and fstatat we could use the same function `stat(path, *, dirfd=None, followlinks=True)`. Then `lstat(path)` == `stat(path, followlinks=False)`, and `fstatat(dirfd, path, flags)` == `stat(path, dirfd=(None if dirfd == AT_FDCWD else dirfd), followlinks=not (flags & AT_SYMLINK_NOFOLLOW))`.
> As a special value for dirfd I suggest using None, not AT_FDCWD (it's more pythonish). fstat could be included too, by specifically treating the case where the path is an integer.
>
> And same for other functions.
>
> Old lstat and lchown will remain for compatibility as light wrappers around stat; with time they may become deprecated.

From cs at zip.com.au  Fri Mar 16 03:10:54 2012
From: cs at zip.com.au (Cameron Simpson)
Date: Fri, 16 Mar 2012 13:10:54 +1100
Subject: [Python-ideas] Logging2 with default NullHandler
In-Reply-To: References: Message-ID: <20120316021054.GA24789@cskk.homeip.net>

On 16Mar2012 00:39, anatoly techtonik wrote:
| > | What do I want from library logging as a Python application developer?
| > | Nothing until I explicitly set up default behaviour.
| >
| > Fair point.
| >
| > Conversely, almost every app I write commences thus:
| >   from cs.logutils import setup_logging
| >   def main(argv):
| >     setup_logging()
| >     ... main code ...
| >
| > That sends to stderr with frills. Finer grained setup can come later.
|
| But that makes all such libraries dependent on cs.logutils, which is
| not an option.

No, it doesn't. setup_logging just sets up the root logger somewhat. Your apps need some kind of convenience routine like it, is all. It can be very small.

All the libraries have unchanged vanilla logging calls. For example, I use SQLalchemy et al under this scheme. The libraries don't know about cs.logutils at all.

| In my case these libraries are self-contained PySide
| widgets that can be used standalone or from an IDE. Adding a dependency
| on cs.logutils makes them too tied to the application.

The libraries don't use my convenience module. My module just makes a whole bunch of setup I happen to like reduce to just two lines of code in my app, import and setup call.

| > I _think_ I prefer logging's current behaviour:
| > - I do want a big fat warning if I forget to configure logging at all
|
| Unless your application uses alternative logging means and doesn't use
| logging at all (unlike 3rd party libraries it uses).

If the 3rd party libraries use the logging module and you don't want logging happening, just configure logging to throw stuff away.

| > - I don't want libraries doing sufficient work at import time to
| >   warrant logging anything
|
| I do not want libraries doing any logging setup work at import either.
| They should just need to contain 'extension points' for plugging log
| handlers in case I need to debug these modules.

Well, they do. Via logging. Supply an adapter handler to hook their calls to logging to your alternative logging means. Logging doesn't have to log to files/syslog/smtp etc - you can send it anywhere. I'm sure you know that, so I think I'm missing some subtlety in your use case.

Cheers,
--
Cameron Simpson DoD#743
http://www.cskk.ezoshosting.com/cs/

ERROR 155 - You can't do that.
- Data General S200 Fortran error code list

From cs at zip.com.au  Fri Mar 16 03:19:40 2012
From: cs at zip.com.au (Cameron Simpson)
Date: Fri, 16 Mar 2012 13:19:40 +1100
Subject: [Python-ideas] set.add could return True or False
In-Reply-To: References: Message-ID: <20120316021940.GA27048@cskk.homeip.net>

On 15Mar2012 02:26, Matt Joiner wrote:
| This is an example only.
[... racy stuff ...]
And also, what's your use case for "Adding an element that is already present often constitutes an error in my code"? And how is it not addressed by: b.update( (a,) ) Cheers, -- Cameron Simpson DoD#743 http://www.cskk.ezoshosting.com/cs/ Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live. - Martin Golding, DoD #0236, martin at plaza.ds.adp.com From anacrolix at gmail.com Fri Mar 16 08:32:43 2012 From: anacrolix at gmail.com (Matt Joiner) Date: Fri, 16 Mar 2012 15:32:43 +0800 Subject: [Python-ideas] set.add could return True or False In-Reply-To: <20120316021940.GA27048@cskk.homeip.net> References: <20120316021940.GA27048@cskk.homeip.net> Message-ID: I've submitted patches and benchmarks for this here: http://bugs.python.org/issue14320 On Fri, Mar 16, 2012 at 10:19 AM, Cameron Simpson wrote: > On 15Mar2012 02:26, Matt Joiner wrote: > | This is an example only. > [... racy stuff ...] > > Then please provide a realistic nonracy example. > > And also, what's your use case for "Adding an element that is already > present often constitutes an error in my code"? And how is it not > addressed by: > > ?b.update( (a,) ) > > Cheers, > -- > Cameron Simpson DoD#743 > http://www.cskk.ezoshosting.com/cs/ > > Always code as if the guy who ends up maintaining your code will be a violent > psychopath who knows where you live. > ? ? ? ?- Martin Golding, DoD #0236, martin at plaza.ds.adp.com From techtonik at gmail.com Fri Mar 16 09:33:34 2012 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 16 Mar 2012 10:33:34 +0200 Subject: [Python-ideas] Logging2 with default NullHandler In-Reply-To: <20120316021054.GA24789@cskk.homeip.net> References: <20120316021054.GA24789@cskk.homeip.net> Message-ID: On Fri, Mar 16, 2012 at 5:10 AM, Cameron Simpson wrote: > On 16Mar2012 00:39, anatoly techtonik wrote: > | > | What do I want from library logging as a Python application developer? > | > | Nothing until I explicitly setup default behaviour. > | > > | > Fair point. > | > > | > Conversely, almost every app I write commences thus: > | > ?from cs.logutils import setup_logging > | > ?def main(argv): > | > ? ?setup_logging() > | > ? ?... main code ... > | > > | > That sends to stderr with frills. Finer grained setup can come later. > | > | But that makes all such libraries dependent on cs.logutils, which is > | not an option. > > No, it doesn't. setup_logging just sets up the root logger somewhat. > Your apps need some kind of convenience routine like it, is all. > It can be very small. > > All the libraries have unchanged vanilla logging calls. > For example, I use SQLalchemy et al under this scheme. The libraries > don't know about cs.logutils at all. > > | In my case these libraries are self-contained PySide > | widgets that can be used standalone or from an IDE. Adding dependency > | on cs.logutils makes them too tied to application. > > The libraries don't use my convenience module. My module just makes a whole > bunch of setup I happen to like reduce to just two lines of code in my app, > import and setup call. So, your scenario is clear. You have an application where you configure logging and none of your libraries are used without this configuration. This is a doable scenario for logging module. But logging2 is needed for the opposite user story - when you have a bunch of independent libraries that are used from scripts, from application or standalone, and you need to make them silent by default. 
Doing this requires inserting a logging configuration into the header of every library. Things look dim, because 3rd party libraries don't come with these headers. Moreover, it is unclear whether such a configuration will break logging when you finally have to turn it on for that particular lib from an application.

> | > I _think_ I prefer logging's current behaviour:
> | > - I do want a big fat warning if I forget to configure logging at all
> |
> | Unless your application uses alternative logging means and doesn't use
> | logging at all (unlike 3rd party libraries it uses).
>
> If the 3rd party libraries use the logging module and you don't want
> logging happening, just configure logging to throw stuff away.

The problem is that I don't know, and probably don't want to think about, the logging schemes used by those libs. When the time comes to troubleshoot these libs then sure - logging will be the first thing I'll look into.

> | > - I don't want libraries doing sufficient work at import time to
> | >   warrant logging anything
> |
> | I do not want libraries doing any logging setup work at import either.
> | They should just need to contain 'extension points' for plugging log
> | handlers in case I need to debug these modules.
>
> Well, they do. Via logging. Supply an adapter handler to hook their
> calls to logging to your alternative logging means. Logging doesn't have
> to log to files/syslog/smtp etc - you can send it anywhere. I'm sure you
> know that, so I think I'm missing some subtlety in your use case.

Yes, the argument is not about the API for 'extension points' (logging does that). The argument is that logging does implicit extra work, and it requires even more extra work to cancel the produced effects. Moreover, the effects of this default implicit work are non-deterministic - that means they depend on the execution flow in your program. The effects will be different if you toss these events in random order:

- root config is set by application
- logging.log() or friends called
- getLogger('some.dotted.name').warn() called

Basically, proper logging setup also requires caring about the import order of your application. It's not a problem if you have a single entry point, but becomes a PITA with multiple entrypoints or in exploratory programming (IPython?) sessions.

"Don't let me think" is the greatest usability principle of all time, and the logging module unfortunately fails to comply. Backward compatibility won't let it be fixed. And that's why a predictable logging2 module is needed.

--
anatoly t.

From p.f.moore at gmail.com  Fri Mar 16 09:53:23 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 16 Mar 2012 08:53:23 +0000
Subject: [Python-ideas] Logging2 with default NullHandler
In-Reply-To: References: <20120316021054.GA24789@cskk.homeip.net> Message-ID:

On 16 March 2012 08:33, anatoly techtonik wrote:
>> "Don't let me think" is the greatest usability principle of all time,
>> and the logging module unfortunately fails to comply. Backward
>> compatibility won't let it be fixed. And that's why a predictable
>> logging2 module is needed.

Your original example was wrong:

>>> import logging
>>> log = logging.getLogger(__name__)
>>> log.warn("WARN")
WARNING:__main__:WARN
>>> log = logging.getLogger('a.b.c')
>>> log.warn("WARN")
WARNING:a.b.c:WARN

Logging reports to stderr by default. So if I don't think, I get any output that code using logging thinks is worth drawing to my attention. Having seen that, if I want to do something about it (i.e.
I think a *tiny* amount) I can set the log level to critical and suppress anything else:

>>> log.setLevel(logging.CRITICAL)
>>> log.warn("WARN")

And if I prefer something more complex, I can do that. But by then, I've reached the point where expecting me not to read the documentation and think a little is unreasonable.

The examples above are in 2.7, btw. They also work in 3.2 and 3.3a0.

Paul.

From techtonik at gmail.com  Fri Mar 16 10:19:13 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Fri, 16 Mar 2012 11:19:13 +0200
Subject: [Python-ideas] Logging2 with default NullHandler
In-Reply-To: References: <20120316021054.GA24789@cskk.homeip.net> Message-ID:

On Fri, Mar 16, 2012 at 11:53 AM, Paul Moore wrote:
> On 16 March 2012 08:33, anatoly techtonik wrote:
>> "Don't let me think" is the greatest usability principle of all time,
>> and the logging module unfortunately fails to comply. Backward
>> compatibility won't let it be fixed. And that's why a predictable
>> logging2 module is needed.
>
> Your original example was wrong:

The example is actually right, and the preceding text explains that the example code should be run from a library.

>>>> import logging
>>>> log = logging.getLogger(__name__)
>>>> log.warn("WARN")
> WARNING:__main__:WARN
>>>> log = logging.getLogger('a.b.c')
>>>> log.warn("WARN")
> WARNING:a.b.c:WARN

Restart the interpreter to clear the state and execute only the last part:

>>> import logging
>>> log = logging.getLogger('a.b.c')
>>> log.warn("WARN")
No handlers could be found for logger "a.b.c"

Now without restarting execute the following:

>>> log = logging.getLogger('spyderlib.utils.bsdsocket')
>>> log.warn("WARN")
>>>

Does the original example look correct now?

I guess, Paul, that you've overlooked the reference to a library, so I skipped the rest of the quote; you should agree that a library should not set the logging level for itself, as that would cancel logging config that may have already occurred at the application level.

--
anatoly t.

From p.f.moore at gmail.com  Fri Mar 16 11:25:25 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 16 Mar 2012 10:25:25 +0000
Subject: [Python-ideas] Logging2 with default NullHandler
In-Reply-To: References: <20120316021054.GA24789@cskk.homeip.net> Message-ID:

On 16 March 2012 09:19, anatoly techtonik wrote:
> On Fri, Mar 16, 2012 at 11:53 AM, Paul Moore wrote:
>> On 16 March 2012 08:33, anatoly techtonik wrote:
>>> "Don't let me think" is the greatest usability principle of all time,
>>> and the logging module unfortunately fails to comply. Backward
>>> compatibility won't let it be fixed. And that's why a predictable
>>> logging2 module is needed.
>>
>> Your original example was wrong:
>
> The example is actually right, and the preceding text explains that the
> example code should be run from a library.
>
>>>>> import logging
>>>>> log = logging.getLogger(__name__)
>>>>> log.warn("WARN")
>> WARNING:__main__:WARN
>>>>> log = logging.getLogger('a.b.c')
>>>>> log.warn("WARN")
>> WARNING:a.b.c:WARN
>
> Restart the interpreter to clear the state and execute only the last part:
>
>>>> import logging
>>>> log = logging.getLogger('a.b.c')
>>>> log.warn("WARN")
> No handlers could be found for logger "a.b.c"

PS D:\Data> py -3.2
Python 3.2.2 (default, Sep 4 2011, 09:51:08) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import logging
>>> log = logging.getLogger('a.b.c')
>>> log.warn("WARN")
WARN
>>>

> Now without restarting execute the following:
>
>>>> log = logging.getLogger('spyderlib.utils.bsdsocket')
>>>> log.warn("WARN")
>>>>

>>> log = logging.getLogger('spyderlib.utils.bsdsocket')
>>> log.warn("WARN")
WARN
>>>

> Does the original example look correct now?

No.

> I guess, Paul, that you've overlooked the reference to a library, so I
> skipped the rest of the quote; you should agree that a library should
> not set the logging level for itself, as that would cancel logging
> config that may have already occurred at the application level.

I have no library spyderlib.utils.bsdsocket. If you are saying that library has a bug, then I can't comment (other than to say that blaming the logging module for that is something of an over-reaction, to be polite).

Paul.

From jh at improva.dk  Fri Mar 16 12:37:24 2012
From: jh at improva.dk (Jacob Holm)
Date: Fri, 16 Mar 2012 12:37:24 +0100
Subject: [Python-ideas] set.add could return True or False
In-Reply-To: References: Message-ID: <4F6325F4.4020806@improva.dk>

On 03/14/2012 06:46 PM, Andrew Svetlov wrote:
> You still can get a race condition:
>
> if b.add(a):
>     # some thread removes a from b before do_c call...
>     do_c()
>
> An explicit lock is still required for your case.
>

A common use for this pattern is to do some piece of work only the first time a given value is seen. The race condition you mention is unlikely to matter there because either:

- Values are never removed from b, and so there is no race, or
- Values are only ever removed by do_c, so there is no race, or
- Whether a is actually *in* b is irrelevant to do_c.

The original race *does* matter, because do_c() may be called multiple times for the same value. Changing set.add() to return True if the value was actually added fixes this race condition without using locks.

In fact, you could even use the proposed feature to *implement* a form of (non-reentrant, nonblocking) locking. If you need a lot of locks, it would even be cheaper (at least memory-wise) than using threading.Lock objects.

Having to use a "real" lock to avoid the race condition makes that approach a lot less attractive.

- Jacob

> On Wed, Mar 14, 2012 at 10:36 AM, Matt Joiner wrote:
>> set.add(x) could return True if x was added to the set, and False if x
>> was already in the set.
>>
>> Adding an element that is already present often constitutes an error in my code.
>>
>> As I understand, set.add is an atomic operation. Having set.add return
>> a boolean will also allow EAFP-style code with regard to handling
>> duplicates, the long winded form of which is currently:
>>
>> if a not in b:
>>     b.add(a) <-- race condition
>>     do_c()
>>
>> Which can be improved to:
>>
>> if b.add(a):
>>     do_c()
>>
>> Advantages:
>>  * Very common code pattern.
>>  * More concise.
>>  * Allows interpreter atomicity to be exploited, often removing the
>> need for additional locking.
>>  * Faster because it avoids double contain check, and can avoid locking.
From ncoghlan at gmail.com  Fri Mar 16 12:39:59 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 16 Mar 2012 21:39:59 +1000
Subject: [Python-ideas] Logging2 with default NullHandler
In-Reply-To: References: Message-ID:

On Thu, Mar 15, 2012 at 5:28 AM, anatoly techtonik wrote:
>
> logging can not be changed, and leaving everything as-is is a PITA,
> that's why I am
> proposing logging2 as the only viable solution.

Logging *can* be changed, and indeed *was* changed for 3.2 (to behave as Paul describes elsewhere in the thread).

Anatoly, when you have complaints like this, *please* check that the behaviour you dislike can be reproduced on the latest version of Python.

Regards,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From anacrolix at gmail.com  Fri Mar 16 13:45:48 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Fri, 16 Mar 2012 20:45:48 +0800
Subject: [Python-ideas] set.add could return True or False
In-Reply-To: <4F6325F4.4020806@improva.dk> References: <4F6325F4.4020806@improva.dk> Message-ID:

This, and more so removing duplication, are my reasons for suggesting it.

A lot of my code uses assertions to verify assumptions about uniqueness of elements. This requires checking that an element isn't already present before adding it. Doing these checks on sets shared by threads requires locking, which is a high cost when most operations allow "cheating" with the GIL. But this is a minor boon.

From techtonik at gmail.com  Fri Mar 16 17:03:56 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Fri, 16 Mar 2012 18:03:56 +0200
Subject: [Python-ideas] Logging2 with default NullHandler
In-Reply-To: References: Message-ID:

On Fri, Mar 16, 2012 at 2:39 PM, Nick Coghlan wrote:
> On Thu, Mar 15, 2012 at 5:28 AM, anatoly techtonik wrote:
>>
>> logging can not be changed, and leaving everything as-is is a PITA,
>> that's why I am
>> proposing logging2 as the only viable solution.
>
> Logging *can* be changed, and indeed *was* changed for 3.2 (to behave
> as Paul describes elsewhere in the thread).

Even though Paul described the 2.7 version too, it is good to know that at least this inconsistency was fixed in 3.x. Sorry for not being clear about Python 2.x from the start. Seeing that Paul resorted to defending only `py` 3.2 in the last comment, I may assume that we all agree that using logging from libraries in the scenarios I've described is a PITA in Python 2.x, and all my arguments are valid against this version.

> Anatoly, when you have complaints like this, *please* check that the
> behaviour you dislike can be reproduced on the latest version of
> Python.

It is hard to propose something for a thing I don't use daily. I couldn't reproduce this stuff, because getting to the root of the problem required a lot of effort alone. If there was a list of known misbehaviours like this in Python 2.x, things would be much easier.

I guess the question can be closed, but I am still dissatisfied with the current state of things. If Python 3.x is developed using the same process, there is a risk of UX problems like this too.

--
anatoly t.
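A minimal sketch of the library-side convention that addresses the original complaint (logging.NullHandler is in the stdlib from Python 2.7/3.1; the assumption here is that on older 2.x versions you define the equivalent do-nothing handler yourself, as the logging documentation suggests):

    import logging

    class NullHandler(logging.Handler):
        # Only needed before Python 2.7/3.1, where
        # logging.NullHandler does not exist yet.
        def emit(self, record):
            pass

    # Library code: attach the do-nothing handler to the library's
    # top-level logger, so that without any application-side
    # configuration Python 2.x no longer prints
    # "No handlers could be found for logger ...".
    logging.getLogger(__name__).addHandler(NullHandler())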
From p.f.moore at gmail.com  Fri Mar 16 18:06:46 2012
From: p.f.moore at gmail.com (Paul Moore)
Date: Fri, 16 Mar 2012 17:06:46 +0000
Subject: [Python-ideas] Logging2 with default NullHandler
In-Reply-To: References: Message-ID:

On 16 March 2012 16:03, anatoly techtonik wrote:
> I may assume that we all agree
> that using logging from libraries in the scenarios I've described
> is a PITA in Python 2.x, and all my arguments are valid against this
> version.

Nope, I checked 2.7 too. I just didn't paste both cases. You should, at a minimum, state the version you are using when you make these comments.

And anyway, Python 2.x behaviour isn't really an appropriate topic for python-ideas, as there won't be a new 2.x version.

Paul.

From yselivanov.ml at gmail.com  Fri Mar 16 18:57:59 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Fri, 16 Mar 2012 13:57:59 -0400
Subject: [Python-ideas] make __closure__ writable
Message-ID: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com>

Can we make the __closure__ attribute writeable? Since __code__ already is, and it is possible to tinker with the opcodes, having no way of rewriting __closure__ (except creating a completely new function) is annoying.

I don't think it will somehow harm python, as those who want to break it can do it already in multiple ways, easier than playing with __closure__.

-
Yury

From jsbueno at python.org.br  Fri Mar 16 19:19:28 2012
From: jsbueno at python.org.br (Joao S. O. Bueno)
Date: Fri, 16 Mar 2012 15:19:28 -0300
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> Message-ID:

On 16 March 2012 14:57, Yury Selivanov wrote:
> Can we make the __closure__ attribute writeable? Since __code__ already is,
> and it is possible to tinker with the opcodes, having no way of rewriting
> __closure__ (except creating a completely new function) is annoying.
>
> I don't think it will somehow harm python, as those who want to break it can
> do it already in multiple ways, easier than playing with __closure__.

+1

This could lead to "flatter" code in a lot of places, where the only way to have the code behave in certain ways is to write nested functions and class declarations.

> -
> Yury

From mark at hotpy.org  Fri Mar 16 19:37:37 2012
From: mark at hotpy.org (Mark Shannon)
Date: Fri, 16 Mar 2012 18:37:37 +0000
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> Message-ID: <4F638871.1010603@hotpy.org>

Yury Selivanov wrote:
> Can we make the __closure__ attribute writeable? Since __code__ already is,
> and it is possible to tinker with the opcodes, having no way of rewriting
> __closure__ (except creating a completely new function) is annoying.

What's wrong with using a decorator?

> I don't think it will somehow harm python, as those who want to break it can
> do it already in multiple ways, easier than playing with __closure__.
>
> -
> Yury

From yselivanov.ml at gmail.com  Fri Mar 16 19:57:47 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Fri, 16 Mar 2012 14:57:47 -0400
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <4F638871.1010603@hotpy.org> References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> Message-ID: <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com>

Decorators can be nested, and what you can do in this case is to find the innermost wrapped function by traversing the '__wrapped__' attributes (and check that the function you found is the actual original function). After that you can play with its attributes, but you can't simply substitute the function object, as the inner decorator won't use it. So sometimes you have to work with the function object without a way of substituting it.

-
Yury

On 2012-03-16, at 2:37 PM, Mark Shannon wrote:
> Yury Selivanov wrote:
>> Can we make the __closure__ attribute writeable? Since __code__ already is,
>> and it is possible to tinker with the opcodes, having no way of rewriting
>> __closure__ (except creating a completely new function) is annoying.
>
> What's wrong with using a decorator?
>
>> I don't think it will somehow harm python, as those who want to break it can
>> do it already in multiple ways, easier than playing with __closure__.
>> -
>> Yury

From mark at hotpy.org  Fri Mar 16 20:24:32 2012
From: mark at hotpy.org (Mark Shannon)
Date: Fri, 16 Mar 2012 19:24:32 +0000
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> Message-ID: <4F639370.1020609@hotpy.org>

Yury Selivanov wrote:
> On 2012-03-16, at 2:57 PM, Yury Selivanov wrote:
>
>> Decorators can be nested, and what you can do in this case is to
>> find the innermost wrapped function by traversing the '__wrapped__'
>> attributes (and check that the function you found is the actual
>> original function). After that you can play with its attributes,
>> but you can't simply substitute the function object, as the inner
>> decorator won't use it. So sometimes you have to work with the
>> function object without a way of substituting it.
>
> And that applies to the situations where decorators are not enough
> and you have to work on the opcode level.

Which you can do with a decorator.

Would this do what you want?

import types

def f_with_new_closure(f, closure):
    return types.FunctionType(f.__code__,
                              f.__globals__,
                              f.__name__,
                              f.__defaults__,
                              closure)

Cheers,
Mark.
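A quick sketch of Mark's helper in action (make_cell is a hypothetical workaround: CPython exposes no public constructor for cell objects here, so it borrows the cell of a throwaway lambda):

    def make_cell(value):
        # The lambda closes over "value", so its __closure__ holds
        # exactly one cell containing that value.
        return (lambda: value).__closure__[0]

    def outer():
        x = 1
        def inner():
            return x
        return inner

    f = outer()
    g = f_with_new_closure(f, (make_cell(42),))
    print(f())  # -> 1
    print(g())  # -> 42, same code object, different cell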
From yselivanov.ml at gmail.com  Fri Mar 16 21:58:37 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Fri, 16 Mar 2012 16:58:37 -0400
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <4F639370.1020609@hotpy.org> References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> Message-ID: <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com>

Yes, your approach will work if your decorator is the only one applied. But, as I said, if you have many of them (see below), you can't just return a new function out of your decorator, you need to change the underlying function "in-place". Consider the following:

import functools

def modifier(func):
    orig_func = func

    while getattr(func, '__wrapped__', None):
        func = func.__wrapped__

    # patch func.__code__ and func.__closure__
    return orig_func  # no need to wrap anything

def some_decorator(func):
    def wrapper(*args, **kwargs):
        # some code
        return func(*args, **kwargs)
    functools.update_wrapper(wrapper, func)
    return wrapper

@modifier
@some_decorator
def foo():
    # this code needs to be verified/augmented/etc
    pass

So, in the above snippet, if you don't want to discard the @some_decorator by returning a new function object, you need to modify the 'foo' from the @modifier.

In a complex framework, where you can't guarantee that your magic decorator will always be called first, rewriting the __closure__ attribute is the only way.

Again, since the __code__ attribute is modifiable, and __closure__ works in tight conjunction with it, I see no point in protecting it.

-
Yury

On 2012-03-16, at 3:24 PM, Mark Shannon wrote:
> Yury Selivanov wrote:
>> On 2012-03-16, at 2:57 PM, Yury Selivanov wrote:
>>> Decorators can be nested, and what you can do in this case is to
>>> find the innermost wrapped function by traversing the '__wrapped__'
>>> attributes (and check that the function you found is the actual
>>> original function). After that you can play with its attributes,
>>> but you can't simply substitute the function object, as the inner
>>> decorator won't use it. So sometimes you have to work with the
>>> function object without a way of substituting it.
>> And that applies to the situations where decorators are not enough
>> and you have to work on the opcode level.
>
> Which you can do with a decorator.
>
> Would this do what you want?
>
> import types
>
> def f_with_new_closure(f, closure):
>     return types.FunctionType(f.__code__,
>                               f.__globals__,
>                               f.__name__,
>                               f.__defaults__,
>                               closure)
>
> Cheers,
> Mark.

From storchaka at gmail.com  Fri Mar 16 23:05:40 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 17 Mar 2012 00:05:40 +0200
Subject: [Python-ideas] Combining stat/lstat/fstatat etc.
In-Reply-To: <4F628DD8.1000906@hastings.org> References: <4F628DD8.1000906@hastings.org> Message-ID:

16.03.12 02:48, Larry Hastings wrote:
I will try to prepare a patch (it will be a big patch), when the syntax of the functions will become quite clear. Initially, I assumed the following signatures: access(path, mode, *, followlinks=True, dirfd=None, eaccess=False) chmod(path, mode, *, followlinks=True, dirfd=None) chown(path, uid, gid, *, followlinks=True, dirfd=None) link(srcpath, dstpath, *, followlinks=True, srcdirfd=None, dstdirfd=None) mkdir(path, mode=0o777, *, dirfd=None) mknod(path, mode=0o600, device=0, *, dirfd=None) open(path, flag, mode=0o777, *, dirfd=None) readlink(path, *, dirfd=None) rename(oldpath, newpath, *, olddirfd=None, newdirfd=None) stat(path, *, followlinks=True, dirfd=None) symlink(src, dst, *, dirfd=None) unlink(path, *, removedir=False, dirfd=None) utimes(path[, (atime, mtime)], *, ns=False, dirfd=None) mkfifoat(path, mode=0o666, *, followlinks=True, dirfd=None) But there is a nuisance with several dirfd in link and rename. The problem with naming parameters. Perhaps it would be better instead of using keyword parameter dirfd to pass tuple (dirfd, relpath) as path parameters? The second question is how the user code will check the availability of dirfd functionality. Special global flags in os or sys? From andrew.svetlov at gmail.com Sat Mar 17 02:44:01 2012 From: andrew.svetlov at gmail.com (Andrew Svetlov) Date: Fri, 16 Mar 2012 18:44:01 -0700 Subject: [Python-ideas] make __closure__ writable In-Reply-To: <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> Message-ID: I'm ok with mutable __closure__ but can you point the real use case? On Fri, Mar 16, 2012 at 1:58 PM, Yury Selivanov wrote: > Yes, your approach will work if your decorator is the only one applied. > But, as I said, if you have many of them (see below), you can't just > return a new function out of your decorator, you need to change the > underlying "in-place". ?Consider the following: > > def modifier(func): > ?orig_func = func > > ?while func.__wrapped__: > ? ?func = func.__wrapped__ > > ?# patch func.__code__ and func.__closure__ > ?return orig_func # no need to wrap anything > > def some_decorator(func): > ?def wrapper(*args, **kwargs): > ? ? ?# some code > ? ? ?return func(*args, **kwargs) > ?functools.wraps(wrapper, func) > ?return wrapper > > @modifier > @some_decorator > def foo(): > ?# this code needs to be verified/augmented/etc > > So, in the above snippet, if you don't want to discard the > @some_decorator by returning a new function object, you need to modify > the 'foo' from the @modifier. > > In a complex framework, where you can't guarantee that your magic > decorator will always be called first, rewriting the __closure__ > attribute is the only way. > > Again, since the __code__ attribute is modifiable, and __closure__ > works in tight conjunction with it, I see no point in protecting it. > > - > Yury > > On 2012-03-16, at 3:24 PM, Mark Shannon wrote: > >> Yury Selivanov wrote: >>> On 2012-03-16, at 2:57 PM, Yury Selivanov wrote: >>>> Decorators can be nested, and what you can do in this case is to >>>> find the most inner-wrapped function by traversing the '__wrapped__' >>>> attributes (and check that the function you found is the actual >>>> original function). ?After that you can play with its attributes, >>>> but you can't simply substitute the function object, as the inner >>>> decorator won't use it. 
>>>> So sometimes you have to work with the
>>>> function object without a way of substituting it.
>>> And that applies to the situations where decorators are not enough
>>> and you have to work on the opcode level.
>>
>> Which you can do with a decorator.
>>
>> Would this do what you want?
>>
>> def f_with_new_closure(f, closure):
>>     return types.FunctionType(f.__code__,
>>                               f.__globals__,
>>                               f.__name__,
>>                               f.__defaults__,
>>                               closure)
>>
>> Cheers,
>> Mark.
>>
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-- 
Thanks,
Andrew Svetlov

From stephen at xemacs.org  Sun Mar 18 08:54:37 2012
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 18 Mar 2012 16:54:37 +0900
Subject: [Python-ideas] Logging2 with default NullHandler
In-Reply-To: 
References: 
Message-ID: 

On Sat, Mar 17, 2012 at 1:03 AM, anatoly techtonik wrote:

> problem required a lot of effort alone. If there was a list of known
> misbehaviours like this in Python 2.x, things could be much easier.

If you don't find it on the tracker (including closed issues), then for
the purpose of actually getting it changed, it's unknown.

> I guess the question can be closed, but I am still dissatisfied with
> the current state of things.

So fix it. If you didn't find an open issue on the tracker, that means
either few people (== 0) care enough to post an issue at all, or, as far
as those who care are concerned, it's already fixed in the relevant
branch(es) and the issue has been closed. Either way, you're going to
have to do it yourself.

> If Python 3.x is developed using the same process, there is a risk
> of UX problems like this too.

There's always a risk of problems. Do you have a specific proposal for
improving the process?

From steve at pearwood.info  Sun Mar 18 13:27:09 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 18 Mar 2012 23:27:09 +1100
Subject: [Python-ideas] Allow function __globals__ to be arbitrary mapping not just dict
Message-ID: <4F65D49D.5070809@pearwood.info>

Currently, if you try to construct a function from parts, the mapping
that becomes func.__globals__ must be an actual dict:

py> class Mapping:
...     def __getitem__(self, key):
...         if key == 'y':
...             return 42
...         raise KeyError(key)
...
py> from types import FunctionType
py> f = lambda x: x + y
py> g = FunctionType(f.__code__, Mapping(), 'g')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: function() argument 2 must be dict, not Mapping

I propose to allow function.__globals__ to accept any mapping type.

That, plus the new collections.ChainMap class in Python 3.3, would allow
some interesting experiments with namespaces and scoping rules.

E.g. if I want to write a function with a custom namespace, I have to do
something like this:

ns = ChainMap( ... )  # set up a namespace
def func(a, ns=ns):
    x = a + ns['b']
    y = ns['some_func'](ns['c'])
    z = ns['another_func'](x, y)
    ns['d'] = (x, y, z)
    return ns['one_last_thing'](d)

which is not a very natural way of writing code. But if we could use
non-dict mappings as __globals__, I could write that function like this:

ns = ChainMap( ... )  # set up a namespace
def func(a):
    global d
    x = a + b
    y = some_func(c)
    z = another_func(x, y)
    d = (x, y, z)
    return one_last_thing(d)

# This could be a decorator.
func = FunctionType(func.__code__, ns, func.__name__)

(By the way, ChainMap is only one possible example namespace.)

-- 
Steven

From andrew.svetlov at gmail.com  Sun Mar 18 18:24:03 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Sun, 18 Mar 2012 10:24:03 -0700
Subject: [Python-ideas] Allow function __globals__ to be arbitrary mapping not just dict
In-Reply-To: <4F65D49D.5070809@pearwood.info>
References: <4F65D49D.5070809@pearwood.info>
Message-ID: 

What are the restrictions on the supported classes for globals? I think
the MutableMapping ABC is a good requirement.

On Sun, Mar 18, 2012 at 5:27 AM, Steven D'Aprano wrote:
> Currently, if you try to construct a function from parts, the mapping
> that becomes func.__globals__ must be an actual dict:
>
> py> class Mapping:
> ...     def __getitem__(self, key):
> ...         if key == 'y':
> ...             return 42
> ...         raise KeyError(key)
> ...
> py> from types import FunctionType
> py> f = lambda x: x + y
> py> g = FunctionType(f.__code__, Mapping(), 'g')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: function() argument 2 must be dict, not Mapping
>
> I propose to allow function.__globals__ to accept any mapping type.
>
> That, plus the new collections.ChainMap class in Python 3.3, would allow
> some interesting experiments with namespaces and scoping rules.
>
> E.g. if I want to write a function with a custom namespace, I have to do
> something like this:
>
> ns = ChainMap( ... )  # set up a namespace
> def func(a, ns=ns):
>     x = a + ns['b']
>     y = ns['some_func'](ns['c'])
>     z = ns['another_func'](x, y)
>     ns['d'] = (x, y, z)
>     return ns['one_last_thing'](d)
>
> which is not a very natural way of writing code. But if we could use
> non-dict mappings as __globals__, I could write that function like this:
>
> ns = ChainMap( ... )  # set up a namespace
> def func(a):
>     global d
>     x = a + b
>     y = some_func(c)
>     z = another_func(x, y)
>     d = (x, y, z)
>     return one_last_thing(d)
>
> # This could be a decorator.
> func = FunctionType(func.__code__, ns, func.__name__)
>
> (By the way, ChainMap is only one possible example namespace.)
>
> --
> Steven
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-- 
Thanks,
Andrew Svetlov

From tjreedy at udel.edu  Sun Mar 18 18:26:02 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 18 Mar 2012 13:26:02 -0400
Subject: [Python-ideas] Allow function __globals__ to be arbitrary mapping not just dict
In-Reply-To: <4F65D49D.5070809@pearwood.info>
References: <4F65D49D.5070809@pearwood.info>
Message-ID: 

On 3/18/2012 8:27 AM, Steven D'Aprano wrote:
> Currently, if you try to construct a function from parts, the mapping
> that becomes func.__globals__ must be an actual dict:
>
> py> class Mapping:
> ...     def __getitem__(self, key):
> ...         if key == 'y':
> ...             return 42
> ...         raise KeyError(key)
> ...
> py> from types import FunctionType
> py> f = lambda x: x + y
> py> g = FunctionType(f.__code__, Mapping(), 'g')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: function() argument 2 must be dict, not Mapping

The API of internal classes is intentionally not documented in the types
module doc.
I take this to mean that the api, if not the existence, of such classes,
is implementation specific. Hence any such call is a 'cpython' rather
than generic python call.

Someone pointed out on python-list that FunctionType checks and rejects
non-dict second args because the CPython ceval loop code for the CPython
LOAD_GLOBAL and STORE_GLOBAL opcodes directly accesses dicts rather than
using the generic mapping interface.

> I propose to allow function.__globals__ to accept any mapping type.

The important question is whether the current ceval code is merely a
holdover from ancient days, when dict was the only mapping type and the
user function type was not accessible to users, or whether the direct
dict access is an important speed optimization that affects essentially
all code run on cpython. If not done already, experiments are needed to
assess the degree of slowdown.

Something to keep in mind: LOAD_GLOBAL is *also* used to access builtins
that are not in globals() itself via globals['__builtins__']. So not
just any mapping will work as a replacement. See 'Alternative proposal'
below.

> That, plus the new collections.ChainMap class in Python 3.3, would allow
> some interesting experiments with namespaces and scoping rules.
>
> E.g. if I want to write a function with a custom namespace, I have to do
> something like this:
>
> ns = ChainMap( ... )  # set up a namespace
> def func(a, ns=ns):
>     x = a + ns['b']
>     y = ns['some_func'](ns['c'])
>     z = ns['another_func'](x, y)
>     ns['d'] = (x, y, z)
>     return ns['one_last_thing'](d)
>
> which is not a very natural way of writing code. But if we could use
> non-dict mappings as __globals__, I could write that function like this:
>
> ns = ChainMap( ... )  # set up a namespace
> def func(a):
>     global d
>     x = a + b
>     y = some_func(c)
>     z = another_func(x, y)
>     d = (x, y, z)
>     return one_last_thing(d)
>
> # This could be a decorator.
> func = FunctionType(func.__code__, ns, func.__name__)
>
> (By the way, ChainMap is only one possible example namespace.)

Alternative proposal: write a function to replace LOAD_GLOBAL and
STORE_GLOBAL for non-builtin names with the opcodes to access a mapping
passed in as an arg to the rewrite function or the function itself. The
latter would perhaps be easier, since the name of the replacement
mapping would already be in the code object.

def f(a, _globals={}): pass
f = reglobalize(f)  # assume '_globals' is the replacement
# now call f with whatever 'global' dict you want on a per-call basis.

-- 
Terry Jan Reedy

From robert.kern at gmail.com  Sun Mar 18 18:47:35 2012
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 18 Mar 2012 17:47:35 +0000
Subject: [Python-ideas] Allow function __globals__ to be arbitrary mapping not just dict
In-Reply-To: 
References: <4F65D49D.5070809@pearwood.info> 
Message-ID: 

On 3/18/12 5:26 PM, Terry Reedy wrote:
> On 3/18/2012 8:27 AM, Steven D'Aprano wrote:
>> I propose to allow function.__globals__ to accept any mapping type.
>
> The important question is whether the current ceval code is merely a
> holdover from ancient days, when dict was the only mapping type and the
> user function type was not accessible to users, or whether the direct
> dict access is an important speed optimization that affects essentially
> all code run on cpython. If not done already, experiments are needed to
> assess the degree of slowdown.

At one time, both the locals and globals dictionaries used by exec and
eval() were required to be true dicts, but this restriction was loosened
for the locals mapping but not the globals.
This distinction is documented in the language reference. You can
probably find the original discussions for the reasons why the
restriction on globals dicts remained; I don't remember the details, but
it wasn't just an oversight.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From techtonik at gmail.com  Sun Mar 18 19:09:01 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Sun, 18 Mar 2012 21:09:01 +0300
Subject: [Python-ideas] Logging2 with default NullHandler
In-Reply-To: 
References: 
Message-ID: 

On Sun, Mar 18, 2012 at 10:54 AM, Stephen J. Turnbull wrote:
> On Sat, Mar 17, 2012 at 1:03 AM, anatoly techtonik wrote:
>> problem required a lot of effort alone. If there was a list of known
>> misbehaviours like this in Python 2.x, things could be much easier.
>
> If you don't find it on the tracker (including closed issues), then for
> the purpose of actually getting it changed, it's unknown.

The problem is that it is almost impossible to find anything on the
tracker related to a specific component like logging. The issue
description doesn't always correspond to the actual content, and it is
really hard to get through all the comments. Proposal: add a modules
field, and allow comment rating and issue/comment edits (Trac). Why
don't I do it myself? Because my time is limited to 15-minute slots,
and getting a developer instance takes more time, so I never get past
this barrier (and TAL is not my language).

>> I guess the question can be closed, but I am still dissatisfied with
>> the current state of things.
>
> So fix it. If you didn't find an open issue on the tracker, that means
> either few people (== 0) care enough to post an issue at all, or, as far
> as those who care are concerned, it's already fixed in the relevant
> branch(es) and the issue has been closed. Either way, you're going to
> have to do it yourself.
>
>> If Python 3.x is developed using the same process, there is a risk
>> of UX problems like this too.
>
> There's always a risk of problems. Do you have a specific proposal for
> improving the process?

The first step is to gather a critical mass of people who acknowledge
the problem, who have their own ideas summarized and can share them at
the right moment. So, the specific proposal is to have a history that
can be analyzed by anyone. Right now it is extremely hard to summarize
problems - no custom tags on the tracker, no way to organize email
threads - these are just two technical proposals to improve the
process. One more idea is a use-case DB to collect API use cases and
detect conflicts at an early stage to draw attention to them. Every
problem and use case should have a number, should be clearly defined
(much better than tracker issue summaries) and should have a
rating/star system.

That should be enough for now.
-- 
anatoly t.

From ericsnowcurrently at gmail.com  Sun Mar 18 22:58:05 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Sun, 18 Mar 2012 14:58:05 -0700
Subject: [Python-ideas] Allow function __globals__ to be arbitrary mapping not just dict
In-Reply-To: <4F65D49D.5070809@pearwood.info>
References: <4F65D49D.5070809@pearwood.info>
Message-ID: 

On Sun, Mar 18, 2012 at 5:27 AM, Steven D'Aprano wrote:
> Currently, if you try to construct a function from parts, the mapping
> that becomes func.__globals__ must be an actual dict:
>
> py> class Mapping:
> ...     def __getitem__(self, key):
> ...         if key == 'y':
> ...             return 42
> ...         raise KeyError(key)
> ...
> py> from types import FunctionType
> py> f = lambda x: x + y
> py> g = FunctionType(f.__code__, Mapping(), 'g')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: function() argument 2 must be dict, not Mapping
>
> I propose to allow function.__globals__ to accept any mapping type.
>
> That, plus the new collections.ChainMap class in Python 3.3, would allow
> some interesting experiments with namespaces and scoping rules.
>
> E.g. if I want to write a function with a custom namespace, I have to do
> something like this:
>
> ns = ChainMap( ... )  # set up a namespace
> def func(a, ns=ns):
>     x = a + ns['b']
>     y = ns['some_func'](ns['c'])
>     z = ns['another_func'](x, y)
>     ns['d'] = (x, y, z)
>     return ns['one_last_thing'](d)
>
> which is not a very natural way of writing code. But if we could use
> non-dict mappings as __globals__, I could write that function like this:
>
> ns = ChainMap( ... )  # set up a namespace
> def func(a):
>     global d
>     x = a + b
>     y = some_func(c)
>     z = another_func(x, y)
>     d = (x, y, z)
>     return one_last_thing(d)
>
> # This could be a decorator.
> func = FunctionType(func.__code__, ns, func.__name__)
>
> (By the way, ChainMap is only one possible example namespace.)

A casual search of the archives identifies some similar discussion over
the years [1]. If PyEval_EvalCodeEx were more fully exposed in Python
things would be simpler (see function_call() [2]). I made a small effort
to that effect last year, but it turned out to be relatively
unnecessary. However, I didn't factor in the restriction on the type of
globals...

-eric

[1] past discussion/efforts:
http://www.python.org/download/releases/2.2/descrintro/#subclassing
(optimization surrounding globals as dict)
http://mail.python.org/pipermail/python-dev/2002-October/029752.html
http://mail.python.org/pipermail/python-dev/2002-October/029753.html
http://mail.python.org/pipermail/python-dev/2002-October/029761.html
http://bugs.python.org/issue215126
http://mail.python.org/pipermail/python-list/2010-December/1262045.html

[2] http://hg.python.org/cpython/file/default/Objects/funcobject.c#l592

From stephen at xemacs.org  Mon Mar 19 02:49:03 2012
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 19 Mar 2012 10:49:03 +0900
Subject: [Python-ideas] Logging2 with default NullHandler
In-Reply-To: 
References: 
Message-ID: 

On Mon, Mar 19, 2012 at 3:09 AM, anatoly techtonik wrote:

> Every problem and use case should have a number, should be
> clearly defined (much better than tracker issue summaries) and
> should have a rating/star system.
> That should be enough for now.

Of course it would. But ...

You remind me of Dave Hayes, who suggested that spam wasn't a problem.
You just need sufficiently smart MUAs that never allow spam through, and
after a while the spammers will stop trying.

The problem here is that, like you, people generally only get time in
15-minute increments, and very few are able to clearly define things
("much better than tracker issue summaries") without more time than
that. OTOH, we don't have an automatic way to do it. So your proposal
AFAICS is DOA.

But I'm willing to put a little time into self-education. The only Trac
system I use regularly is MacPorts, and it frankly sucks by the
standards you put forward.
Specifically, although I always search the system for applicable
existing bugs, I miss about 2/3 of them (possibly more; only about 1 in
5 bugs show up in my search, but about half of my reports end up getting
closed as dupes of existing bugs by the maintainers, and I wouldn't be
surprised if some fraction of relevant bugs are unknown to the
maintainers, too.) Getting to the current, mostly usable, level of
suckitude took about five years.

Since that's only one example, you can help me out by pointing me to an
example of a well-managed, highly useful tracker that has the features
you propose!

From yselivanov.ml at gmail.com  Mon Mar 19 16:29:07 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Mon, 19 Mar 2012 11:29:07 -0400
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: 
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com>
Message-ID: 

On 2012-03-16, at 9:44 PM, Andrew Svetlov wrote:
> I'm ok with mutable __closure__, but can you point to a real use case?

Well, we need mutable __closure__ to be able to inject some constants or
objects into the namespace the corresponding __code__ works with.

If you want to know why on earth we needed to mess with the __code__
object at all: it's to control the execution of the 'finally' statement
in generator-based coroutines. We modify the __code__ of generators to
signal when they are executing in their 'finally' blocks, and when they
are, we never abort the execution (by timeout, for instance). The code
we inject needs to call one function, and for now we just inject that
function into the generator's __globals__, but the cleaner solution
would be to just modify its __closure__. BTW, that's the real problem
many coroutine-based frameworks will encounter some day.

While this may all sound too complicated, the case is real. We had an
option either to patch CPython (and later PyPy), or to inject the needed
opcodes into the __code__ object directly. We found the latter
preferable.

So, as I said: I see no reason to protect the __closure__ attribute when
the __code__ object is writeable.

-
Yury Selivanov
http://sprymix.com

From yselivanov.ml at gmail.com  Mon Mar 19 18:52:34 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Mon, 19 Mar 2012 13:52:34 -0400
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: 
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com>
Message-ID: <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com>

I've created an issue: http://bugs.python.org/issue14369

-
Yury

From steve at pearwood.info  Tue Mar 20 00:51:02 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 20 Mar 2012 10:51:02 +1100
Subject: [Python-ideas] Allow function __globals__ to be arbitrary mapping not just dict
In-Reply-To: 
References: <4F65D49D.5070809@pearwood.info> 
Message-ID: <4F67C666.6090307@pearwood.info>

Terry Reedy wrote:
> On 3/18/2012 8:27 AM, Steven D'Aprano wrote:
>> Currently, if you try to construct a function from parts, the mapping
>> that becomes func.__globals__ must be an actual dict:
[...]
> The API of internal classes is intentionally not documented in the types
> module doc.
> I take this to mean that the api, if not the existence, of such classes,
> is implementation specific. Hence any such call is a 'cpython' rather
> than generic python call.

Are you sure that they are *intentionally* not documented, or merely
haven't been due to lack of time and interest?

The types themselves are documented as public, they aren't given _single
underscore private names, and FunctionType has a minimal but useful doc
string with nothing about it being implementation specific.

> Someone pointed out on python-list that FunctionType checks and rejects
> non-dict second args because the CPython ceval loop code for the CPython
> LOAD_GLOBAL and STORE_GLOBAL opcodes directly accesses dicts rather than
> using the generic mapping interface.

I expect that the existing optimization would be used whenever
__globals__ is an actual dict, and the mapping interface would only be
used when it is a subclass of dict or some other mapping. So for normal
functions, the only cost would be an extra type check to see whether
__globals__ is a dict or not.

[...]

> Alternative proposal: write a function to replace LOAD_GLOBAL and
> STORE_GLOBAL for non-builtin names with the opcodes to access a mapping
> passed in as an arg to the rewrite function or the function itself.

Are you proposing that I hack the byte code of the function? Now that
would be unsupported and implementation dependent! If not, I'm afraid I
don't understand your suggestion.

-- 
Steven

From tjreedy at udel.edu  Tue Mar 20 03:37:54 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 19 Mar 2012 22:37:54 -0400
Subject: [Python-ideas] Allow function __globals__ to be arbitrary mapping not just dict
In-Reply-To: <4F67C666.6090307@pearwood.info>
References: <4F65D49D.5070809@pearwood.info> <4F67C666.6090307@pearwood.info>
Message-ID: 

On 3/19/2012 7:51 PM, Steven D'Aprano wrote:
> Terry Reedy wrote:
>> On 3/18/2012 8:27 AM, Steven D'Aprano wrote:
>>> Currently, if you try to construct a function from parts, the mapping
>>> that becomes func.__globals__ must be an actual dict:
> [...]
>> The API of internal classes is intentionally not documented in the
>> types module doc. I take this to mean that the api, if not the
>> existence, of such classes, is implementation specific. Hence any such
>> call is a 'cpython' rather than generic python call.
>
> Are you sure that they are *intentionally* not documented, or merely
> haven't been due to lack of time and interest?

100% absolutely sure, no. 90% fairly sure, yes.

Less than half of the internal types even *are* Python-level callables
with Python-object APIs: FunctionType, CodeType, MethodType, ModuleType.
The rest are not. I think there once may have been a discussion of how
much to expose them. While the last two are trivial, the help for
CodeType says "Not for the faint of heart." I suspect that some of these
accessible APIs are new since 1.3, or may have changed. It is hard to
know since they are not documented ;-).

> The types themselves are documented as public, they aren't given _single
> underscore private names,

The module says, "Typical use is for isinstance() or issubclass()
checks."

Exposing the internal types was done to do once, and do correctly,
things that people did (and sometimes still do) in their code, such as
isinstance(f, type(lambda: None)). The module has similar code to get
the types:

def _f(): pass
FunctionType = type(_f)
LambdaType = type(lambda: None)  # Same as FunctionType
CodeType = type(_f.__code__)

So there is no way for the objects themselves to not be public.
The names bound to them in types are not their .__name__ attributes. It
is certainly intentional that they are not bound to their names in
builtins.

Getting traceback and frame types is trickier to get right:

try:
    raise TypeError
except TypeError:
    tb = sys.exc_info()[2]
    TracebackType = type(tb)
    FrameType = type(tb.tb_frame)
    tb = None; del tb

> and FunctionType has a minimal but useful doc
> string with nothing about it being implementation specific.

Marking *anything* other than ref-counting and gc behavior as
implementation specific in the docs is fairly recent. The same goes for
cpython-only tests in the test suite. Some of this splitting has been
prompted by questions from IronPython and PyPy developers.

-- 
Terry Jan Reedy

From steve at pearwood.info  Tue Mar 20 04:01:50 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 20 Mar 2012 14:01:50 +1100
Subject: [Python-ideas] Allow function __globals__ to be arbitrary mapping not just dict
In-Reply-To: <4F65D49D.5070809@pearwood.info>
References: <4F65D49D.5070809@pearwood.info>
Message-ID: <20120320030149.GB28460@ando>

On Sun, Mar 18, 2012 at 11:27:09PM +1100, Steven D'Aprano wrote:

> I propose to allow function.__globals__ to accept any mapping type.

Jython already supports this behaviour.

steve at runes:~$ jython
[...]
Jython 2.5.1+ (Release_2_5_1, Aug 4 2010, 07:18:19)
[OpenJDK Client VM (Sun Microsystems Inc.)] on java1.6.0_18
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> class Mapping:
...     def __getitem__(self, key):
...         if key == 'a': return 1
...         else: raise KeyError(key)
...
>>> from types import FunctionType
>>> f = lambda x: x + a
>>> g = FunctionType(f.func_code, Mapping(), 'g')
>>> g(2)
3

-- 
Steven

From ericsnowcurrently at gmail.com  Tue Mar 20 05:41:17 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Mon, 19 Mar 2012 21:41:17 -0700
Subject: [Python-ideas] sys.implementation
Message-ID: 

In October 2009 there was a short flurry of interest in adding
"sys.implementation" as an object to encapsulate some
implementation-specific information [1]. Does anyone recollect where
this proposal went? Would anyone object to reviving it (or a variant)?

-eric

[1] http://mail.python.org/pipermail/python-dev/2009-October/092893.html

From mark at hotpy.org  Tue Mar 20 10:34:42 2012
From: mark at hotpy.org (Mark Shannon)
Date: Tue, 20 Mar 2012 09:34:42 +0000
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com>
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com>
Message-ID: <4F684F32.5080006@hotpy.org>

Yury Selivanov wrote:
> I've created an issue: http://bugs.python.org/issue14369
>

I think that creating an issue may be premature, given that you have had
no positive feedback on the idea.

I still think making __closure__ mutable is unnecessary. If you insist
that it is, then please provide an example which would work with your
proposed change, but cannot be made to work using types.FunctionType()
to create a new closure.

Cheers,
Mark.
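(A runnable sketch of the rebuild-rather-than-mutate approach Mark
describes; the make_cell helper is illustrative, not an existing API --
it works only because CPython creates a real cell object whenever a
lambda captures a local:)

    import types

    def make_cell(value):
        # capture `value` so the interpreter materializes a cell object
        return (lambda: value).__closure__[0]

    def replace_closure(f, *values):
        # the new closure must supply one cell per name in co_freevars
        cells = tuple(make_cell(v) for v in values)
        return types.FunctionType(f.__code__, f.__globals__,
                                  f.__name__, f.__defaults__, cells)

    def adder(n):
        def add(x):
            return x + n
        return add

    add1 = adder(1)
    add10 = replace_closure(add1, 10)
    print(add1(5), add10(5))   # -> 6 15

The catch, as the next message argues, is that this always produces a
*new* function object; any wrapper already holding a reference to the
old function keeps calling the old one.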
From yselivanov.ml at gmail.com  Tue Mar 20 14:06:18 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Tue, 20 Mar 2012 09:06:18 -0400
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <4F684F32.5080006@hotpy.org>
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com> <4F684F32.5080006@hotpy.org>
Message-ID: <8406714B-876E-435E-84AB-716804C92387@gmail.com>

I did provide such an example earlier in this thread. I'm copying and
pasting it into this mail. Please read the example carefully, as it
explains why returning a new types.FunctionType() is not enough.

----

Yes, your approach will work if your decorator is the only one applied.
But, as I said, if you have many of them (see below), you can't just
return a new function out of your decorator; you need to change the
underlying function "in-place". Consider the following:

def modifier(func):
    orig_func = func

    while func.__wrapped__:
        func = func.__wrapped__

    # patch func.__code__ and func.__closure__
    return orig_func  # no need to wrap anything

def some_decorator(func):
    def wrapper(*args, **kwargs):
        # some code
        return func(*args, **kwargs)
    functools.update_wrapper(wrapper, func)
    return wrapper

@modifier
@some_decorator
def foo():
    # this code needs to be verified/augmented/etc

So, in the above snippet, if you don't want to discard the
@some_decorator by returning a new function object, you need to modify
the 'foo' from the @modifier.

In a complex framework, where you can't guarantee that your magic
decorator will always be called first, rewriting the __closure__
attribute is the only way.

Again, since the __code__ attribute is modifiable, and __closure__
works in tight conjunction with it, I see no point in protecting it.

On 2012-03-20, at 5:34 AM, Mark Shannon wrote:

> Yury Selivanov wrote:
>> I've created an issue: http://bugs.python.org/issue14369
>
> I think that creating an issue may be premature, given that you have
> had no positive feedback on the idea.
>
> I still think making __closure__ mutable is unnecessary.
> If you insist that it is, then please provide an example which would
> work with your proposed change, but cannot be made to work using
> types.FunctionType() to create a new closure.
>
> Cheers,
> Mark.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

From ncoghlan at gmail.com  Tue Mar 20 14:15:48 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 20 Mar 2012 23:15:48 +1000
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <8406714B-876E-435E-84AB-716804C92387@gmail.com>
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com> <4F684F32.5080006@hotpy.org> <8406714B-876E-435E-84AB-716804C92387@gmail.com>
Message-ID: 

On Tue, Mar 20, 2012 at 11:06 PM, Yury Selivanov wrote:
> Again, since the __code__ attribute is modifiable, and __closure__
> works in tight conjunction with it, I see no point in protecting it.
FWIW, I'm currently +0 on the idea (based on Yury's reasoning here about
updating already wrapped functions), but it's going to be a while before
I can dig into the code and see if there are any *good* reasons we
protect __closure__ (rather than that protection merely being an
implementation artifact). I can't recall any off the top of my head, but
that part of the code is complex enough that I don't completely trust my
memory on that point.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From yselivanov.ml at gmail.com  Tue Mar 20 15:31:11 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Tue, 20 Mar 2012 10:31:11 -0400
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: 
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com> <4F684F32.5080006@hotpy.org> <8406714B-876E-435E-84AB-716804C92387@gmail.com>
Message-ID: 

On 2012-03-20, at 9:15 AM, Nick Coghlan wrote:
> On Tue, Mar 20, 2012 at 11:06 PM, Yury Selivanov
> wrote:
>> Again, since the __code__ attribute is modifiable, and __closure__
>> works in tight conjunction with it, I see no point in protecting it.
>
> FWIW, I'm currently +0 on the idea (based on Yury's reasoning here
> about updating already wrapped functions), but it's going to be a while
> before I can dig into the code and see if there are any *good* reasons
> we protect __closure__ (rather than that protection merely being an
> implementation artifact). I can't recall any off the top of my head,
> but that part of the code is complex enough that I don't completely
> trust my memory on that point.

Well, it seems that it is an implementation artifact. I've gone through
the CPython code and got the impression that making __closure__
writeable won't break anything. Writing malicious values to it may
segfault Python, but that's no different from doing so with __code__.
In some way, __closure__ is already writeable, since you can pass it to
types.FunctionType() and create a broken function; hence I don't see a
reason to make it read-only.

The patch attached to the issue in the tracker passes all unittests, and
adds one more - specifically for testing __closure__ modification on a
live function object.

- Yury

From mikegraham at gmail.com  Tue Mar 20 15:40:24 2012
From: mikegraham at gmail.com (Mike Graham)
Date: Tue, 20 Mar 2012 10:40:24 -0400
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com>
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com>
Message-ID: 

On Fri, Mar 16, 2012 at 1:57 PM, Yury Selivanov wrote:
> Can we make the __closure__ attribute writeable? Since __code__ already
> is, and it is possible to tinker with the opcodes, having no way of
> rewriting __closure__ (except creating a completely new function) is
> annoying.
>
> I don't think it will somehow harm python, as those who want to break
> it can do it already in multiple ways, easier than playing with
> __closure__.
>
> -
> Yury

+1000, this would make code like
https://github.com/magcius/toenail/blob/master/ailment.py#L20 much
cleaner and more readable.
Mike

From mark at hotpy.org  Tue Mar 20 15:43:06 2012
From: mark at hotpy.org (Mark Shannon)
Date: Tue, 20 Mar 2012 14:43:06 +0000
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <8406714B-876E-435E-84AB-716804C92387@gmail.com>
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com> <4F684F32.5080006@hotpy.org> <8406714B-876E-435E-84AB-716804C92387@gmail.com>
Message-ID: <4F68977A.1000506@hotpy.org>

Yury Selivanov wrote:
> I did provide such an example earlier in this thread. I'm copying and
> pasting it into this mail. Please read the example carefully, as it
> explains why returning a new types.FunctionType() is not enough.
>
> ----
>
> Yes, your approach will work if your decorator is the only one applied.
> But, as I said, if you have many of them (see below), you can't just
> return a new function out of your decorator; you need to change the
> underlying function "in-place". Consider the following:
>
> def modifier(func):
>     orig_func = func
>
>     while func.__wrapped__:
>         func = func.__wrapped__
>
>     # patch func.__code__ and func.__closure__
>     return orig_func  # no need to wrap anything
>
> def some_decorator(func):
>     def wrapper(*args, **kwargs):
>         # some code
>         return func(*args, **kwargs)
>     functools.update_wrapper(wrapper, func)
>     return wrapper
>
> @modifier
> @some_decorator
> def foo():
>     # this code needs to be verified/augmented/etc
>
> So, in the above snippet, if you don't want to discard the
> @some_decorator by returning a new function object, you need to modify
> the 'foo' from the @modifier.
>
> In a complex framework, where you can't guarantee that your magic
> decorator will always be called first, rewriting the __closure__
> attribute is the only way.

So why won't this work?

def f_with_new_closure(f, closure):
    return types.FunctionType(f.__code__,
                              f.__globals__,
                              f.__name__,
                              f.__defaults__,
                              closure)

def modifier(func, closure):
    if func.__wrapped__:
        while func.__wrapped__.__wrapped__:
            func = func.__wrapped__
        func.__wrapped__ = f_with_new_closure(func.__wrapped__,
                                              closure)
    else:
        return f_with_new_closure(func, closure)
    if func.__wrapped__:
        return f_with_new_closure(func,
                                  f_with_new_closure(func.__wrapped__))
    else:
        return f_with_new_closure(func, closure)

Cheers,
Mark.

From mark at hotpy.org  Tue Mar 20 15:47:03 2012
From: mark at hotpy.org (Mark Shannon)
Date: Tue, 20 Mar 2012 14:47:03 +0000
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <4F68977A.1000506@hotpy.org>
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com> <4F684F32.5080006@hotpy.org> <8406714B-876E-435E-84AB-716804C92387@gmail.com> <4F68977A.1000506@hotpy.org>
Message-ID: <4F689867.2000705@hotpy.org>

Whoops, cut and paste error. The code should have read:

> def f_with_new_closure(f, closure):
>     return types.FunctionType(f.__code__,
>                               f.__globals__,
>                               f.__name__,
>                               f.__defaults__,
>                               closure)
>
> def modifier(func, closure):
>     if func.__wrapped__:
>         while func.__wrapped__.__wrapped__:
>             func = func.__wrapped__
>         func.__wrapped__ = f_with_new_closure(func.__wrapped__,
>                                               closure)
>         return func
>     else:
>         return f_with_new_closure(func, closure)

Cheers,
Mark.
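(One detail the sketches in this subthread gloss over: a plain function
has no __wrapped__ attribute at all -- functools.wraps only adds it to
the wrappers it creates, in 3.2+ -- so bare accesses like
`while func.__wrapped__:` raise AttributeError once the chain runs out.
A guarded walk, as a minimal sketch:)

    def innermost(func):
        # follow the chain of wrappers left by functools.wraps /
        # functools.update_wrapper; getattr with a default stops
        # cleanly at the original, unwrapped function
        while True:
            wrapped = getattr(func, '__wrapped__', None)
            if wrapped is None:
                return func
            func = wrapped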
From yselivanov.ml at gmail.com  Tue Mar 20 15:56:10 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Tue, 20 Mar 2012 10:56:10 -0400
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <4F68977A.1000506@hotpy.org>
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com> <4F684F32.5080006@hotpy.org> <8406714B-876E-435E-84AB-716804C92387@gmail.com> <4F68977A.1000506@hotpy.org>
Message-ID: 

Because usually you write decorators as functions, not classes. And when
you use the former style, you usually do it in the following way:

def decorator(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    functools.update_wrapper(wrapper, func)
    return wrapper

Now, let's use it:

@decorator
def some_func(): pass

OK. At this point, the 'some_func' object has a '__wrapped__' attribute
that points to the original 'some_func' function. But whatever you write
to 'some_func.__wrapped__' won't change anything, as the 'wrapper' will
continue to call the old 'some_func'. Instead of assigning something to
__wrapped__, we need to change it in-place, by doing
'__wrapped__.__closure__ = new_closure'.

On 2012-03-20, at 10:43 AM, Mark Shannon wrote:

> Yury Selivanov wrote:
>> I did provide such an example earlier in this thread. I'm copying and
>> pasting it into this mail. Please read the example carefully, as it
>> explains why returning a new types.FunctionType() is not enough.
>>
>> ----
>>
>> Yes, your approach will work if your decorator is the only one
>> applied. But, as I said, if you have many of them (see below), you
>> can't just return a new function out of your decorator; you need to
>> change the underlying function "in-place". Consider the following:
>>
>> def modifier(func):
>>     orig_func = func
>>     while func.__wrapped__:
>>         func = func.__wrapped__
>>     # patch func.__code__ and func.__closure__
>>     return orig_func  # no need to wrap anything
>>
>> def some_decorator(func):
>>     def wrapper(*args, **kwargs):
>>         # some code
>>         return func(*args, **kwargs)
>>     functools.update_wrapper(wrapper, func)
>>     return wrapper
>>
>> @modifier
>> @some_decorator
>> def foo():
>>     # this code needs to be verified/augmented/etc
>>
>> So, in the above snippet, if you don't want to discard the
>> @some_decorator by returning a new function object, you need to modify
>> the 'foo' from the @modifier.
>>
>> In a complex framework, where you can't guarantee that your magic
>> decorator will always be called first, rewriting the __closure__
>> attribute is the only way.
>
> So why won't this work?
>
> def f_with_new_closure(f, closure):
>     return types.FunctionType(f.__code__,
>                               f.__globals__,
>                               f.__name__,
>                               f.__defaults__,
>                               closure)
>
> def modifier(func, closure):
>     if func.__wrapped__:
>         while func.__wrapped__.__wrapped__:
>             func = func.__wrapped__
>         func.__wrapped__ = f_with_new_closure(func.__wrapped__,
>                                               closure)
>     else:
>         return f_with_new_closure(func, closure)
>     if func.__wrapped__:
>         return f_with_new_closure(func,
>                                   f_with_new_closure(func.__wrapped__))
>     else:
>         return f_with_new_closure(func, closure)
>
> Cheers,
> Mark.
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

From yselivanov.ml at gmail.com  Tue Mar 20 15:59:21 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Tue, 20 Mar 2012 10:59:21 -0400
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: 
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com> <4F684F32.5080006@hotpy.org> <8406714B-876E-435E-84AB-716804C92387@gmail.com> <4F68977A.1000506@hotpy.org> 
Message-ID: <7C67D7CD-0D70-4FD7-8591-D89A525830DB@gmail.com>

On 2012-03-20, at 10:56 AM, Yury Selivanov wrote:
> Because usually you write decorators as functions, not classes. And
> when you use the former style, you usually do it in the following way:
>
> def decorator(func):
>     def wrapper(*args, **kwargs):
>         return func(*args, **kwargs)
>
>     functools.update_wrapper(wrapper, func)
>     return wrapper
>
> Now, let's use it:
>
> @decorator
> def some_func(): pass
>
> OK. At this point, the 'some_func' object has a '__wrapped__' attribute
> that points to the original 'some_func' function. But whatever you
> write to 'some_func.__wrapped__' won't change anything, as the
> 'wrapper' will continue to call the old 'some_func'. Instead of
> assigning something to __wrapped__, we need to change it in-place, by
> doing '__wrapped__.__closure__ = new_closure'.

And as I told you in the first example: there is no problem when you
have only one decorator. You can surely just return a new
FunctionType(). But when you have many of them, such as:

@decorator3
@your_magic_decorator_that_modifies_the_closure
@decorator2
@decorator1
def some_func(): pass

the only way to modify the __closure__ is to write to the __wrapped__
attribute of the 'decorator1' wrapper.

- Yury

From brett at python.org  Tue Mar 20 16:01:02 2012
From: brett at python.org (Brett Cannon)
Date: Tue, 20 Mar 2012 11:01:02 -0400
Subject: [Python-ideas] sys.implementation
In-Reply-To: 
References: 
Message-ID: 

On Tue, Mar 20, 2012 at 00:41, Eric Snow wrote:

> In October 2009 there was a short flurry of interest in adding
> "sys.implementation" as an object to encapsulate some
> implementation-specific information [1]. Does anyone recollect where
> this proposal went? Would anyone object to reviving it (or a variant)?

I was wondering that myself when looking at imp.source_to_cache() at
PyCon. I would still like to see it happen.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ronaldoussoren at mac.com  Tue Mar 20 16:49:14 2012
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Tue, 20 Mar 2012 08:49:14 -0700
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To: 
References: 
Message-ID: <276F7BB8-E4B3-46DF-A3F5-62953C7B4227@mac.com>

On 13 Mar, 2012, at 9:15, Guido van Rossum wrote:
>
> I think it comes down to this: I really, really, really hate
> directories with a suffix.
> I'd like to point out that the suffix is
> also introducing a backwards incompatibility: everybody will have to
> teach their tools, IDEs, and brains about .pyp directories,

Directories with a suffix have the advantage that you could teach GUIs
to treat them differently; file managers could for example show a ".pyp"
directory as a folder with a Python logo, just like ".py" files are
shown as documents with a Python logo.

With the implicit approach it is much harder to recognize Python
packages as such without detailed knowledge of the import algorithm and
the Python search path.

Ronald
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 4788 bytes
Desc: not available
URL: 

From yselivanov.ml at gmail.com  Tue Mar 20 17:01:10 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Tue, 20 Mar 2012 12:01:10 -0400
Subject: [Python-ideas] Allow function __globals__ to be arbitrary mapping not just dict
In-Reply-To: <4F65D49D.5070809@pearwood.info>
References: <4F65D49D.5070809@pearwood.info>
Message-ID: 

+1 on the idea. It would be interesting to play with frozendict (if it's
accepted someday).

On 2012-03-18, at 8:27 AM, Steven D'Aprano wrote:
> Currently, if you try to construct a function from parts, the mapping
> that becomes func.__globals__ must be an actual dict:
>
> py> class Mapping:
> ...     def __getitem__(self, key):
> ...         if key == 'y':
> ...             return 42
> ...         raise KeyError(key)
> ...
> py> from types import FunctionType
> py> f = lambda x: x + y
> py> g = FunctionType(f.__code__, Mapping(), 'g')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: function() argument 2 must be dict, not Mapping
>
> I propose to allow function.__globals__ to accept any mapping type.
>
> That, plus the new collections.ChainMap class in Python 3.3, would allow
> some interesting experiments with namespaces and scoping rules.
>
> E.g. if I want to write a function with a custom namespace, I have to do
> something like this:
>
> ns = ChainMap( ... )  # set up a namespace
> def func(a, ns=ns):
>     x = a + ns['b']
>     y = ns['some_func'](ns['c'])
>     z = ns['another_func'](x, y)
>     ns['d'] = (x, y, z)
>     return ns['one_last_thing'](d)
>
> which is not a very natural way of writing code. But if we could use
> non-dict mappings as __globals__, I could write that function like this:
>
> ns = ChainMap( ... )  # set up a namespace
> def func(a):
>     global d
>     x = a + b
>     y = some_func(c)
>     z = another_func(x, y)
>     d = (x, y, z)
>     return one_last_thing(d)
>
> # This could be a decorator.
> func = FunctionType(func.__code__, ns, func.__name__)
>
> (By the way, ChainMap is only one possible example namespace.)
>
> --
> Steven
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

From guido at python.org  Tue Mar 20 18:29:28 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 20 Mar 2012 10:29:28 -0700
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To: <276F7BB8-E4B3-46DF-A3F5-62953C7B4227@mac.com>
References: <276F7BB8-E4B3-46DF-A3F5-62953C7B4227@mac.com>
Message-ID: 

On Tue, Mar 20, 2012 at 8:49 AM, Ronald Oussoren wrote:
>
> On 13 Mar, 2012, at 9:15, Guido van Rossum wrote:
>
>> I think it comes down to this: I really, really, really hate
>> directories with a suffix.
>> I'd like to point out that the suffix is
>> also introducing a backwards incompatibility: everybody will have to
>> teach their tools, IDEs, and brains about .pyp directories,
>
> Directories with a suffix have the advantage that you could teach GUIs
> to treat them differently; file managers could for example show a
> ".pyp" directory as a folder with a Python logo, just like ".py" files
> are shown as documents with a Python logo.
>
> With the implicit approach it is much harder to recognize Python
> packages as such without detailed knowledge of the import algorithm
> and the Python search path.

How's that working out for Java?

-- 
--Guido van Rossum (python.org/~guido)

From tjreedy at udel.edu  Tue Mar 20 18:42:15 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 20 Mar 2012 13:42:15 -0400
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To: <276F7BB8-E4B3-46DF-A3F5-62953C7B4227@mac.com>
References: <276F7BB8-E4B3-46DF-A3F5-62953C7B4227@mac.com>
Message-ID: 

On 3/20/2012 11:49 AM, Ronald Oussoren wrote:
>
> On 13 Mar, 2012, at 9:15, Guido van Rossum wrote:
>> I think it comes down to this: I really, really, really hate
>> directories with a suffix. I'd like to point out that the suffix
>> is also introducing a backwards incompatibility: everybody will
>> have to teach their tools, IDEs, and brains about .pyp
>> directories,
>
> Directories with a suffix have the advantage that you could teach
> GUIs to treat them differently; file managers could for example show
> a ".pyp" directory as a folder with a Python logo, just like ".py"
> files are shown as documents with a Python logo.
>
> With the implicit approach it is much harder to recognize Python
> packages as such without detailed knowledge of the import algorithm
> and the Python search path.

Package directories are files and can be imported to make modules. I
think it would have been nice to use .pyp from the beginning. It would
make Python easier to learn. Also, 'import x' would simply mean "search
the sys.path directories for a file named 'x.py*'", with no need for
either the importer (or a human reader) to look within directories for
the magic __init__.py file. Sorting a directory listing by extension
would sort all packages together.

-- 
Terry Jan Reedy

From storchaka at gmail.com  Tue Mar 20 19:26:43 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 20 Mar 2012 20:26:43 +0200
Subject: [Python-ideas] Exact integral types in struct
Message-ID: 

The struct module works only with native platform-specific integers. As
a result, a lot of code in the standard library and in third-party
applications is forced either to rely on unreliable assumptions (that
short is always 2 bytes, or that long is always 4 bytes), which are not
always true, or to construct integers from bytes explicitly
(b[0]+(b[1]<<8)+(b[2]<<16)+...). I propose to introduce special format
notations for signed and unsigned integers of arbitrary exact size (the
size given as a number preceded by a prefix). After that, eliminate the
use of platform-specific formats when working with platform-independent
data (such as zip, for example).

Or maybe I'm behind, and the corresponding functions already exist, and
the use of the struct module is only a remnant?
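(A short, self-contained illustration of the native-versus-standard
distinction at issue here, using only documented struct behaviour; the
native sizes in the comments are platform-dependent examples, and
Guido's reply below makes the same point:)

    import struct

    # native mode: sizes and alignment follow the platform's C ABI
    print(struct.calcsize('l'))    # 8 on most 64-bit Unix, 4 elsewhere

    # a byte-order prefix switches to the fixed "standard" sizes
    print(struct.calcsize('<l'))   # always 4, on every platform
    print(struct.unpack('<I', b'\x78\x56\x34\x12'))
    # (305419896,) == (0x12345678,)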
From guido at python.org  Tue Mar 20 19:32:35 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 20 Mar 2012 11:32:35 -0700
Subject: [Python-ideas] Exact integral types in struct
In-Reply-To: 
References: 
Message-ID: 

Using the '<' and '>' prefixes to the format string you can force
standard size and alignment. Do you have a specific bit of code in the
stdlib in mind that is incorrectly using native alignment?

On Tue, Mar 20, 2012 at 11:26 AM, Serhiy Storchaka wrote:
> The struct module works only with native platform-specific integers.
> As a result, a lot of code in the standard library and in third-party
> applications is forced either to rely on unreliable assumptions (that
> short is always 2 bytes, or that long is always 4 bytes), which are
> not always true, or to construct integers from bytes explicitly
> (b[0]+(b[1]<<8)+(b[2]<<16)+...). I propose to introduce special format
> notations for signed and unsigned integers of arbitrary exact size
> (the size given as a number preceded by a prefix). After that,
> eliminate the use of platform-specific formats when working with
> platform-independent data (such as zip, for example).
>
> Or maybe I'm behind, and the corresponding functions already exist,
> and the use of the struct module is only a remnant?
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-- 
--Guido van Rossum (python.org/~guido)

From benjamin at python.org  Tue Mar 20 20:09:21 2012
From: benjamin at python.org (Benjamin Peterson)
Date: Tue, 20 Mar 2012 19:09:21 +0000 (UTC)
Subject: [Python-ideas] sys.implementation
References: 
Message-ID: 

Brett Cannon writes:
>
> I was wondering that myself when looking at imp.source_to_cache() at
> PyCon.

What's implementation specific about that?

From storchaka at gmail.com  Tue Mar 20 20:48:08 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 20 Mar 2012 21:48:08 +0200
Subject: [Python-ideas] Exact integral types in struct
In-Reply-To: 
References: 
Message-ID: 

20.03.12 20:32, Guido van Rossum wrote:
> Using the '<' and '>' prefixes to the format string you can force
> standard size and alignment. Do you have a specific bit of code in the
> stdlib in mind that is incorrectly using native alignment?

Hmm. It seems that I had a moment of temporary insanity. I just now
noticed that these prefixes indicate not only endianness but also size.
My fault. Excuse me for the undue disturbance.

However, the trick with struct.unpack('dd') in Lib/json/decoder.py
amazes me.

From andrew.svetlov at gmail.com  Tue Mar 20 20:54:43 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Tue, 20 Mar 2012 21:54:43 +0200
Subject: [Python-ideas] Exact integral types in struct
In-Reply-To: 
References: 
Message-ID: 

IEEE 754 floating-point values don't depend on the machine byte order,
and a C double is always coded in 8 bytes, as far as I know.

On Tue, Mar 20, 2012 at 9:48 PM, Serhiy Storchaka wrote:
> 20.03.12 20:32, Guido van Rossum wrote:
>> Using the '<' and '>' prefixes to the format string you can force
>> standard size and alignment. Do you have a specific bit of code in the
>> stdlib in mind that is incorrectly using native alignment?
>
> Hmm. It seems that I had a moment of temporary insanity. I just now
> noticed that these prefixes indicate not only endianness but also size.
> My fault. Excuse me for the undue disturbance.
>
> However, the trick with struct.unpack('dd') in Lib/json/decoder.py
> amazes me.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-- 
Thanks,
Andrew Svetlov

From storchaka at gmail.com  Tue Mar 20 21:27:24 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 20 Mar 2012 22:27:24 +0200
Subject: [Python-ideas] Exact integral types in struct
In-Reply-To: 
References: 
Message-ID: 

20.03.12 21:54, Andrew Svetlov wrote:
> IEEE 754 floating-point values don't depend on the machine byte order,
> and a C double is always coded in 8 bytes, as far as I know.

Full code:

def _floatconstants():
    _BYTES = binascii.unhexlify(b'7FF80000000000007FF0000000000000')
    if sys.byteorder != 'big':
        _BYTES = _BYTES[:8][::-1] + _BYTES[8:][::-1]
    nan, inf = struct.unpack('dd', _BYTES)
    return nan, inf, -inf

NaN, PosInf, NegInf = _floatconstants()

But in xdrlib.py:

    return struct.unpack('>d', data)[0]

And in pickle.py:

    self.append(unpack('>d', self.read(8))[0])

Test:

>>> import struct
>>> struct.pack('>d', 1)
b'?\xf0\x00\x00\x00\x00\x00\x00'
>>> struct.pack('<d', 1)
b'\x00\x00\x00\x00\x00\x00\xf0?'

From andrew.svetlov at gmail.com  Tue Mar 20 21:36 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Tue, 20 Mar 2012 22:36 +0200
Subject: [Python-ideas] Exact integral types in struct
In-Reply-To: 
References: 
Message-ID: 

Sorry, my fault. But as you can see, the json lib switches byte order
manually - so it's not an error. Obviously it would be cleaner to use
the direct '>d' form. Please make an issue in the bug tracker if you
want.

On Tue, Mar 20, 2012 at 10:27 PM, Serhiy Storchaka wrote:
> 20.03.12 21:54, Andrew Svetlov wrote:
>> IEEE 754 floating-point values don't depend on the machine byte order,
>> and a C double is always coded in 8 bytes, as far as I know.
>
> Full code:
>
> def _floatconstants():
>     _BYTES = binascii.unhexlify(b'7FF80000000000007FF0000000000000')
>     if sys.byteorder != 'big':
>         _BYTES = _BYTES[:8][::-1] + _BYTES[8:][::-1]
>     nan, inf = struct.unpack('dd', _BYTES)
>     return nan, inf, -inf
>
> NaN, PosInf, NegInf = _floatconstants()
>
> But in xdrlib.py:
>
>     return struct.unpack('>d', data)[0]
>
> And in pickle.py:
>
>     self.append(unpack('>d', self.read(8))[0])
>
> Test:
>
> >>> import struct
> >>> struct.pack('>d', 1)
> b'?\xf0\x00\x00\x00\x00\x00\x00'
> >>> struct.pack('<d', 1)
> b'\x00\x00\x00\x00\x00\x00\xf0?'
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-- 
Thanks,
Andrew Svetlov

From storchaka at gmail.com  Tue Mar 20 22:00:52 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Tue, 20 Mar 2012 23:00:52 +0200
Subject: [Python-ideas] Exact integral types in struct
In-Reply-To: 
References: 
Message-ID: 

20.03.12 22:36, Andrew Svetlov wrote:
> But as you can see, the json lib switches byte order manually - so it's
> not an error.

And it's funny. It is also strange that it is not using just
float('nan') and float('inf').

> Please make an issue in the bug tracker if you want.

It works.

From anacrolix at gmail.com  Wed Mar 21 00:46:10 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Wed, 21 Mar 2012 07:46:10 +0800
Subject: [Python-ideas] Exact integral types in struct
In-Reply-To: 
References: 
Message-ID: 

You can also use "3s", then int.from_bytes. Strangely, the need for
finer control of struct members has never come up; I guess this is the
legacy of C.
-------------- next part --------------
An HTML attachment was scrubbed...
From ericsnowcurrently at gmail.com Wed Mar 21 02:29:43 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 20 Mar 2012 18:29:43 -0700
Subject: [Python-ideas] sys.implementation
In-Reply-To: 
References: 
Message-ID: 

On Mar 20, 2012 1:11 PM, "Benjamin Peterson" wrote:
>
> Brett Cannon writes:
> >
> > I was wondering that myself when looking at imp.source_to_cache() at PyCon.
>
> What's implementation-specific about that?

File suffixes on the .pyc files there include implementation information.
To get that info in Python you can use the platform module (not an option
for bootstrapping importlib) or guess (essentially what the platform
module does).

-eric

>
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

From benjamin at python.org Wed Mar 21 02:41:36 2012
From: benjamin at python.org (Benjamin Peterson)
Date: Wed, 21 Mar 2012 01:41:36 +0000 (UTC)
Subject: [Python-ideas] sys.implementation
References: 
Message-ID: 

Eric Snow writes:
>
> On Mar 20, 2012 1:11 PM, "Benjamin Peterson" wrote:
> >
> > Brett Cannon ...> writes:
> > >
> > > I was wondering that myself when looking at imp.source_to_cache() at PyCon.
> >
> > What's implementation-specific about that?
>
> File suffixes on the .pyc files there include implementation information.
> To get that info in Python you can use the platform module (not an option
> for bootstrapping importlib) or guess (essentially what the platform
> module does).

I think source_to_cache() is a bad example, though, because the operation
would be basically identical in every Python implementation. The tag will
just change. It even changes in every CPython version.

From ncoghlan at gmail.com Wed Mar 21 02:50:23 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 21 Mar 2012 11:50:23 +1000
Subject: [Python-ideas] sys.implementation
In-Reply-To: 
References: 
Message-ID: 

On Wed, Mar 21, 2012 at 11:41 AM, Benjamin Peterson wrote:
> I think source_to_cache() is a bad example, though, because the operation
> would be basically identical in every Python implementation. The tag will
> just change. It even changes in every CPython version.

I believe that was Brett's point. Currently other implementations have
to replace imp.get_tag() to change the magic string, whereas that kind
of info could easily be consolidated into a "sys.implementation" struct
that standardised a few things so that impls just needed to populate the
struct correctly rather than making scattered changes to a variety of
different modules.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From benjamin at python.org Wed Mar 21 03:00:12 2012
From: benjamin at python.org (Benjamin Peterson)
Date: Wed, 21 Mar 2012 02:00:12 +0000 (UTC)
Subject: [Python-ideas] sys.implementation
References: 
Message-ID: 

Nick Coghlan writes:
>
> I believe that was Brett's point.

Ah, so you're not suggesting moving imp.source_to_cache() to some
sys.implementation module. Great.
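A rough sketch of the kind of "struct" being discussed here - the field names and values are purely illustrative, not a settled proposal:

from collections import namedtuple

# hypothetical shape for sys.implementation; every field below is invented
Implementation = namedtuple('Implementation', 'name version cache_tag')
implementation = Implementation(name='cpython',
                                version=(3, 3, 0),
                                cache_tag='cpython-33')

# imp.source_to_cache() could then consult implementation.cache_tag
# instead of calling imp.get_tag()

Under that assumption, each implementation would populate the same fields with its own values at startup.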
From pyideas at rebertia.com Wed Mar 21 04:13:07 2012
From: pyideas at rebertia.com (Chris Rebert)
Date: Tue, 20 Mar 2012 20:13:07 -0700
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To: 
References: <276F7BB8-E4B3-46DF-A3F5-62953C7B4227@mac.com> 
Message-ID: 

On Tue, Mar 20, 2012 at 10:42 AM, Terry Reedy wrote:
> On 3/20/2012 11:49 AM, Ronald Oussoren wrote:
>> On 13 Mar, 2012, at 9:15, Guido van Rossum wrote:
>>> I think it comes down to this: I really, really, really hate
>>> directories with a suffix. I'd like to point out that the suffix
>>> is also introducing a backwards incompatibility: everybody will
>>> have to teach their tools, IDEs, and brains about .pyp
>>> directories,
>>
>> Directories with a suffix have the advantage that you could teach
>> GUIs to treat them differently, filemanagers could for example show a
>> ".pyp" directory as a folder with a python logo just like ".py"
>> files are shown as documents with a python logo.
>>
>> With the implicit approach it is much harder to recognize python
>> packages as such without detailed knowledge about the import
>> algorithm and python search path.
>
> Package directories are files and can be imported to make modules. I think
> it would have been nice to use .pyp from the beginning. It would make Python
> easier to learn. Also, 'import x' would simply mean "Search sys.path
> directories for a file named 'x.py*'", with no need for either the importer
> (or human reader) to look within directories for the magic __init__.py file.
> Sorting a directory listing by extension would sort all packages together.

Your file manager views directories as having filename extensions? Mine
sure doesn't.

Cheers,
Chris

From storchaka at gmail.com Wed Mar 21 07:02:34 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 21 Mar 2012 08:02:34 +0200
Subject: [Python-ideas] Exact integral types in struct
In-Reply-To: 
References: 
Message-ID: 

21.03.12 01:46, Matt Joiner wrote:
> You can also use "3s", then int.from_bytes. Strangely, the need for
> finer control of struct members has never come up; I guess this is the
> legacy of C.

Thank you. I'm not very familiar with the latest API.

I think the documentation for the struct module should clarify what the
*standard* size and alignment are. The struct module is widely used in
the standard library for working with binary formats (aifc, base64,
binhex, compileall, dbm, gettext, gzip, idlelib, logging, modulefinder,
msilib, pickle, wave, xdrlib, zipfile).

From greg.ewing at canterbury.ac.nz Wed Mar 21 07:31:39 2012
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 21 Mar 2012 19:31:39 +1300
Subject: [Python-ideas] Exact integral types in struct
In-Reply-To: 
References: 
Message-ID: <4F6975CB.5070500@canterbury.ac.nz>

Serhiy Storchaka wrote:

> I think the documentation for the struct module should clarify what the
> *standard* size and alignment are.

Yes, it's a bit perplexing the way it casually throws in the word
"standard" without any elaboration at that point. It leaves the reader
wondering -- which standard? It sounds like it's referring to some
widely-recognised standard that the reader is assumed to already know
about, whereas it's actually something made up for the struct module.

Also I think it would help to point out that the standard is designed
to be platform-independent as well as compiler-independent. One can
infer this from what is said, but it wouldn't hurt to point it out
explicitly.
-- 
Greg

From simon.sapin at kozea.fr Wed Mar 21 11:14:11 2012
From: simon.sapin at kozea.fr (Simon Sapin)
Date: Wed, 21 Mar 2012 11:14:11 +0100
Subject: [Python-ideas] Exact integral types in struct
In-Reply-To: <4F6975CB.5070500@canterbury.ac.nz>
References: <4F6975CB.5070500@canterbury.ac.nz>
Message-ID: <4F69A9F3.4030202@kozea.fr>

Le 21/03/2012 07:31, Greg Ewing wrote:
> Serhiy Storchaka wrote:
>
>> > I think the documentation for the struct module should clarify what
>> > the *standard* size and alignment are.
> Yes, it's a bit perplexing the way it casually throws in the
> word "standard" without any elaboration at that point. It leaves
> the reader wondering -- which standard? It sounds like it's
> referring to some widely-recognised standard that the reader is
> assumed to already know about, whereas it's actually something
> made up for the struct module.

I don't see this problem when reading the documentation. The idea of
"standard" size is introduced in section 7.3.2.1:

> Standard size depends only on the format character; see the table
> in the Format Characters section.

The said table in the next section has a "Standard size" column. For
example, the size for "@i" (native size) is variable, but "=i" (standard
size) is always 4 bytes.

http://docs.python.org/library/struct.html#byte-order-size-and-alignment
http://docs.python.org/library/struct.html#format-characters

Maybe the docs should not use the word "standard". But it is
self-contained: it does not refer to an external standard.

As to alignment, the table in 7.3.2.1 is pretty clear that "standard
alignment" is no alignment at all.

-- 
Simon Sapin

From storchaka at gmail.com Wed Mar 21 13:36:49 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Wed, 21 Mar 2012 14:36:49 +0200
Subject: [Python-ideas] Exact integral types in struct
In-Reply-To: <4F69A9F3.4030202@kozea.fr>
References: <4F6975CB.5070500@canterbury.ac.nz> <4F69A9F3.4030202@kozea.fr>
Message-ID: 

21.03.12 12:14, Simon Sapin wrote:
> I don't see this problem when reading the documentation. The idea of
> "standard" size is introduced in section 7.3.2.1:

Again, it is all because of my carelessness. I looked at ``pydoc
struct``, not the library documentation.

From brett at python.org Wed Mar 21 15:19:55 2012
From: brett at python.org (Brett Cannon)
Date: Wed, 21 Mar 2012 10:19:55 -0400
Subject: [Python-ideas] sys.implementation
In-Reply-To: 
References: 
Message-ID: 

On Tue, Mar 20, 2012 at 22:00, Benjamin Peterson wrote:

> Nick Coghlan writes:
> >
> > I believe that was Brett's point.
>
> Ah, so you're not suggesting moving imp.source_to_cache() to some
> sys.implementation module. Great.

Just FYI, everyone guessed right about what I was thinking and Benjamin is
right about not suggesting moving imp.source_to_cache() but simply adding
sys.implementation so that imp.source_to_cache() can use that instead of
imp.get_tag().

From brett at python.org Wed Mar 21 15:22:51 2012
From: brett at python.org (Brett Cannon)
Date: Wed, 21 Mar 2012 10:22:51 -0400
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To: <276F7BB8-E4B3-46DF-A3F5-62953C7B4227@mac.com>
References: <276F7BB8-E4B3-46DF-A3F5-62953C7B4227@mac.com>
Message-ID: 

On Tue, Mar 20, 2012 at 11:49, Ronald Oussoren wrote:

>
> On 13 Mar, 2012, at 9:15, Guido van Rossum wrote:
>
> >
> > I think it comes down to this: I really, really, really hate
> > directories with a suffix.
> > I'd like to point out that the suffix is
> > also introducing a backwards incompatibility: everybody will have to
> > teach their tools, IDEs, and brains about .pyp directories,
>
> Directories with a suffix have the advantage that you could teach GUIs
> to treat them differently, filemanagers could for example show a ".pyp"
> directory as a folder with a python logo just like ".py" files are shown
> as documents with a python logo.
>

OS X has made me dislike that possibility. Some git tools cause
directories ending in .git to be treated as an opaque object in the file
system, forcing me to drop into a shell or right-click and choose to
inspect the directory in order to see its contents.

-Brett

> With the implicit approach it is much harder to recognize python
> packages as such without detailed knowledge about the import algorithm
> and python search path.
>
> Ronald
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

From ericsnowcurrently at gmail.com Wed Mar 21 17:02:15 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Wed, 21 Mar 2012 10:02:15 -0600
Subject: [Python-ideas] sys.implementation
In-Reply-To: 
References: 
Message-ID: 

On Mon, Mar 19, 2012 at 10:41 PM, Eric Snow wrote:
> In October 2009 there was a short flurry of interest in adding
> "sys.implementation" as an object to encapsulate some
> implementation-specific information [1]. Does anyone recollect where
> this proposal went? Would anyone object to reviving it (or a
> variant)?

FYI, there are several reasons why sys.implementation is a good idea
even though some of the same info is already in the platform module:

* The implementation in the platform module is essentially just
guessing [1]. With sys.implementation the various implementations would
explicitly set the values in their own version of the sys module.

* The platform module is part of the stdlib, which ideally would
minimize implementation details such as those that would be in
sys.implementation.

* Any module used in the importlib bootstrap must be built-in or frozen,
neither of which applies to the platform module. This is the point that
led me to finding the previous proposal.

I expect that any overlap between sys.implementation and the platform
module would simply defer to sys.implementation (with the same interface
in platform wrapping it).

I'd like to move this forward, so any objections or feedback at this
point would be helpful. If Christian is interested in taking this I'd
gladly step back. Regardless, feedback from the different Python
implementations will be especially important here. Preferably,
sys.implementation (the object bundling the various info) would be
available on all implementations sooner rather than later...

-eric

[1] http://hg.python.org/cpython/file/default/Lib/platform.py#l1247

>
> -eric
>
>
> [1] http://mail.python.org/pipermail/python-dev/2009-October/092893.html

From benjamin at python.org Wed Mar 21 19:37:47 2012
From: benjamin at python.org (Benjamin Peterson)
Date: Wed, 21 Mar 2012 18:37:47 +0000 (UTC)
Subject: [Python-ideas] sys.implementation
References: 
Message-ID: 

Eric Snow writes:
>
> I'd like to move this forward, so any objections or feedback at this
> point would be helpful.

I would like to see a concrete proposal of what would get put in there.
Regards,
Benjamin

From techtonik at gmail.com Wed Mar 21 22:03:47 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Thu, 22 Mar 2012 00:03:47 +0300
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
Message-ID: 

Seems like the only way to get the caller name in Python is to use the
inspect module [1]. It seems that this way is not recommended (I wonder
why?) and there is no other.

The stackoverflow question about how to get the caller's method name [1]
got 3k views over 1 year, so the recipe seems to be useful. I've
developed a similar snippet [2] that also extracts a module and class
name in addition to the method. To me this snippet is very handy to
insert in print statements to see who uses a certain function at
run-time.

The output from this function is:
  spyderlib.widgets.externalshell.introspection.NotificationThread.run

Which no function of the `inspect` and `traceback` modules can provide
(correct me if I'm wrong, but I am pretty confident). The stuff they
propose will look like:

  File "/mount/sdb5/projects/spyderlib/spyderlib/widgets/externalshell/introspection.py", line 183, in run
    write_packet(self.notify_socket, output)

As a 'software engineer' I am more interested in the logical structure
of the call stack rather than the physical location of pieces of code
text (which is important too, but secondary). Neither `traceback` nor
`inspect` seems to be designed with this use case in mind, and if you
look closely - the traceback above misses class info completely. To get
the name of the class from this traceback you need to know its location
within the lines of the file.

So, I'd propose to include a caller_name() function somewhere and name
it somehow. But there is more to that. Maybe too late for Python 3, but
still an idea - make the call stack a normal Python feature (optional
for Stackless, but still a feature) and describe a standard for call
stack names, to provide everybody a simple and safe way to see what's
going on with the execution stack (and an easy way to get caller names).
Making it a feature means that there will be almost no overhead when
looking up the name (id) of the caller or its callers. This will also be
handy for logging.

I wish I could shorten this proposal. The picture in my head is
something like:

__main__
  spyderlib.widgets.externalshell.introspection.NotificationThread.run
    spyderlib.utils.bsdsocket.write_packet

Every indented string is a unique id of a function or other closure that
makes up Python code. Would it rock? This will probably require
describing all possible ways to run Python code to see where a call to
caller_name() can appear, to cover all possible situations. Is there
already a list of these ways?

1. http://stackoverflow.com/questions/2654113/python-how-to-get-the-callers-method-name-in-the-called-method
2. https://gist.github.com/2151727
-- 
anatoly t.

From sven at marnach.net Thu Mar 22 01:17:25 2012
From: sven at marnach.net (Sven Marnach)
Date: Thu, 22 Mar 2012 00:17:25 +0000
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To: 
References: 
Message-ID: <20120322001725.GA10391@pantoffel-wg.de>

anatoly techtonik wrote on Thu, 22 Mar 2012, at 00:03:47 +0300:
> Seems like the only way to get the caller name in Python is to use the
> inspect module [1]. It seems that this way is not recommended (I wonder
> why?) and there is no other.

Using introspection in production code is not recommended in general,
with some exceptions, since the main purpose of introspection is
debugging.
*If* you want to do this at all, the recommended way is to use the
inspect module. (And there is another way -- at least in CPython you can
use 'sys._getframe().f_back.f_code.co_name'.)

> The stackoverflow question about how to get the caller's method name [1]
> got 3k views over 1 year, so the recipe seems to be useful. I've
> developed a similar snippet [2] that also extracts a module and class
> name in addition to the method. To me this snippet is very handy to
> insert in print statements to see who uses a certain function at
> run-time.
>
> The output from this function is:
>   spyderlib.widgets.externalshell.introspection.NotificationThread.run

The missing link here seems to be that code objects don't allow access
to PEP 3155 qualified names -- if they did, this would be rather easy
to do. How about adding a `co_qualname` attribute to code objects?
It could point to the same string the function object points to, so
the overhead would be minimal.

> So, I'd propose to include a caller_name() function somewhere and name
> it somehow.

I don't think this would be necessary if the qualified name could be
accessed from the frame object in some way.

> But there is more to that. Maybe too late for Python 3,
> but still an idea - make the call stack a normal Python feature

Not sure what you are talking about here.

Cheers,
Sven

From victor.stinner at gmail.com Thu Mar 22 03:06:55 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Thu, 22 Mar 2012 03:06:55 +0100
Subject: [Python-ideas] Allow function __globals__ to be arbitrary mapping not just dict
In-Reply-To: <4F65D49D.5070809@pearwood.info>
References: <4F65D49D.5070809@pearwood.info>
Message-ID: 

> I propose to allow function.__globals__ to accept any mapping type.

I proposed a different but related change: support types other than
dict for __builtins__ in the following issue.

http://bugs.python.org/issue14385

Victor

From techtonik at gmail.com Thu Mar 22 06:51:26 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Thu, 22 Mar 2012 08:51:26 +0300
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To: 
References: <87pqcgggsm.fsf@benfinney.id.au> 
Message-ID: 

On Wed, Mar 14, 2012 at 7:34 PM, Yuval Greenfield wrote:
> I've always had trouble understanding and explaining the complexities and
> intricacies of python packaging.

+1

> Is there a most basic but comprehensive list of use cases? IIUC they are:
>
> * Eg Standard library - import from a list of paths to be searched.
> * Eg This project - import from a relative path based on this file's
> current directory (which python has an odd syntax for).
> * Eg Distributed packages and virtual-env - import from a relative path
> based on an anchor directory.

I am a big proponent of the user-story/use-case-first approach, but
somebody needs to show everyone how to do this properly. I've created a
draft at http://wiki.python.org/moin/CodeDiscoveryUseCases - feel free
to improve it.

> If we were to start completely from scratch would this problem be an easy
> one?

With a list of user stories - yes.
-- 
anatoly t.

From g.brandl at gmx.net Thu Mar 22 08:07:09 2012
From: g.brandl at gmx.net (Georg Brandl)
Date: Thu, 22 Mar 2012 08:07:09 +0100
Subject: [Python-ideas] Exact integral types in struct
In-Reply-To: 
References: <4F6975CB.5070500@canterbury.ac.nz> <4F69A9F3.4030202@kozea.fr>
Message-ID: 

On 21.03.2012 13:36, Serhiy Storchaka wrote:
> 21.03.12 12:14, Simon Sapin wrote:
>> I don't see this problem when reading the documentation.
>> The idea of "standard" size is introduced in section 7.3.2.1:
>
> Again, it is all because of my carelessness. I looked at ``pydoc
> struct``, not the library documentation.

Well, if "pydoc struct" is not self-contained and mentions "standard
size" without defining it, that is still a bug. At the very least it
would have to refer to the library docs for what the standard size is.

Georg

From ncoghlan at gmail.com Thu Mar 22 08:23:27 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 22 Mar 2012 17:23:27 +1000
Subject: [Python-ideas] Exact integral types in struct
In-Reply-To: 
References: <4F6975CB.5070500@canterbury.ac.nz> <4F69A9F3.4030202@kozea.fr>
Message-ID: 

On Thu, Mar 22, 2012 at 5:07 PM, Georg Brandl wrote:
> On 21.03.2012 13:36, Serhiy Storchaka wrote:
>> 21.03.12 12:14, Simon Sapin wrote:
>>> I don't see this problem when reading the documentation. The idea of
>>> "standard" size is introduced in section 7.3.2.1:
>>
>> Again, it is all because of my carelessness. I looked at ``pydoc
>> struct``, not the library documentation.
>
> Well, if "pydoc struct" is not self-contained and mentions "standard
> size" without defining it, that is still a bug. At the very least it
> would have to refer to the library docs for what the standard size is.

The broader question of whether the docs might be better rephrased to
say "default size" rather than "standard size" still stands, though
(since 'default' is a more typical word for "we defined a value that
is used automatically if you don't explicitly specify an alternative"
than 'standard').

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From techtonik at gmail.com Thu Mar 22 08:29:05 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Thu, 22 Mar 2012 10:29:05 +0300
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To: <20120322001725.GA10391@pantoffel-wg.de>
References: <20120322001725.GA10391@pantoffel-wg.de>
Message-ID: 

On Thu, Mar 22, 2012 at 3:17 AM, Sven Marnach wrote:
> anatoly techtonik wrote on Thu, 22 Mar 2012, at 00:03:47 +0300:
>> Seems like the only way to get the caller name in Python is to use the
>> inspect module [1]. It seems that this way is not recommended (I wonder
>> why?) and there is no other.
>
> Using introspection in production code is not recommended in general,
> with some exceptions, since the main purpose of introspection is
> debugging. *If* you want to do this at all, the recommended way is to
> use the inspect module. (And there is another way -- at least in
> CPython you can use 'sys._getframe().f_back.f_code.co_name'.)

If sys._getframe() is not available in Jython, IronPython and PyPy,
then I won't use that for debug-mode output in libraries.

The mechanism with `inspect` is complicated and incomplete - it seems it
was not designed with object-oriented concepts in mind. Look at the
comments in the gist I posted before - after several iterations I still
don't know how to distinguish between a function name and a static
method call. I suspect there is no such info at all, because the API
level of `inspect` objects is too low. Further proof is that the gist is
impossible to understand without thoroughly reading the documentation.

The inspect module was added around the time new-style objects appeared,
and debugging was concentrated on bytecode compilation.
Nobody thought about big applications where all these frames, line
numbers, stack space and indexes of the last attempted instruction in
bytecode don't make any sense. Writing helpers for debugging call stacks
could be easy.

>> The stackoverflow question about how to get the caller's method name [1]
>> got 3k views over 1 year, so the recipe seems to be useful. I've
>> developed a similar snippet [2] that also extracts a module and class
>> name in addition to the method. To me this snippet is very handy to
>> insert in print statements to see who uses a certain function at
>> run-time.
>>
>> The output from this function is:
>>   spyderlib.widgets.externalshell.introspection.NotificationThread.run
>
> The missing link here seems to be that code objects don't allow access
> to PEP 3155 qualified names -- if they did, this would be rather easy
> to do. How about adding a `co_qualname` attribute to code objects?
> It could point to the same string the function object points to, so
> the overhead would be minimal.

Thanks for PEP 3155. I read it, but completely forgot the word
"qualified". Despite the discussion inside that "full name" is a bad
choice - this is the first thing that comes to mind at a search engine
prompt.

Adding co_qualname is an easy low-level solution, but looking back I
don't want anybody to have to go through the path of wading through all
the gory details about Python frames, filtering out all the confusing
and irrelevant pieces just to get to the call stack. So, the argument is
that this part of Python should be simplified, because beautiful is
better than ugly. More arguments at the end of the letter. There is also
a technical argument against using the existing technique of frame
access, which is explained in a big note here:
http://docs.python.org/library/inspect.html#the-interpreter-stack

>> So, I'd propose to include a caller_name() function somewhere and name
>> it somehow.
>
> I don't think this would be necessary if the qualified name could be
> accessed from the frame object in some way.
>
>> But there is more to that. Maybe too late for Python 3,
>> but still an idea - make the call stack a normal Python feature
>
> Not sure what you are talking about here.

For example, you need to explain to a Python beginner what Stackless is.
If you have an abstraction, you just do:

>>> for i, e in enumerate(call_stack()):
...     print " "*i, e
...
__main__
  module.something
    module2.we_are_here

Not an abstraction - people can actually play with it, and then you say -
"This is how CPython works. In Stackless there is no call stack - there
is a queue":

__main__
module.something
--------------------------------  ^^^ we were here
module2.we_are_here
module.something
__main__

So, the ability to operate on the "call stack" concept is what I call a
language feature. If people can see it, touch it and play with it - you
can expect to see a lot of cool things appear around it. I could talk
about the prospects for exploratory/incremental programming, and
run-time rollbacks, but I had better stop right here.
-- 
anatoly t.
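For reference, a sketch of what a caller_name() helper along these lines might look like, built on the inspect module (this is an illustration in the spirit of the gist, not its actual code; the class detection via 'self' is only a heuristic):

import inspect

def caller_name(skip=2):
    """Return a dotted module[.class].method name for a caller.

    skip=1 names the function that calls caller_name(); skip=2 names
    that function's own caller.
    """
    stack = inspect.stack()
    if len(stack) < skip + 1:
        return ''
    frame = stack[skip][0]
    parts = []
    module = inspect.getmodule(frame)
    if module:
        parts.append(module.__name__)
    # heuristic: a conventional 'self' argument signals a bound method
    if 'self' in frame.f_locals:
        parts.append(type(frame.f_locals['self']).__name__)
    name = frame.f_code.co_name
    if name != '<module>':
        parts.append(name)
    del frame  # avoid keeping a reference cycle alive
    return '.'.join(parts)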
From ncoghlan at gmail.com Thu Mar 22 09:01:31 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 22 Mar 2012 18:01:31 +1000
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To: 
References: <20120322001725.GA10391@pantoffel-wg.de>
Message-ID: 

Anatoly,

Trying to find out your calling function is hard because well-designed
decoupled code with properly separated concerns *should never care who
is calling it*, so improving the APIs for doing so is an *extremely low
priority task*.

There are some truly exceptional cases where it matters. For those, the
onus is on the developers *implementing those exceptional cases* to
figure out how to do this in a platform-independent way (which may
involve conditional code that does different things depending on the
platform). These legitimate use cases are simply too rare for it to be
worthwhile for the various implementations to agree on a standard way to
expose the functionality and promise never to change it.

If all you are trying to do is print some debugging info and you can use
Python 3.2, then use the "stack_info=True" argument to the logging APIs.
That's what it's for. If you can't use Python 3.2, then look at the way
logging does it in 3.2 and see if you can adapt that for your own needs.

There's no compelling reason that writing helpers for debugging call
stacks needs to be easy. It should definitely be *possible*, but 99.99%
of all Python programmers are just going to use the native debugging
facilities of the interpreter, or use those of an IDE that someone else
wrote.

Once again, you're painting with a broad sweeping brush "ah, it's
horrible, it's too hard to do anything worthwhile" (despite the obvious
contradiction that many projects already implement such things quite
effectively) instead of suggesting *small*, *incremental* changes that
might actually improve the status quo.

Regards,
Nick.

-- 
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From mark at hotpy.org Thu Mar 22 10:24:40 2012
From: mark at hotpy.org (Mark Shannon)
Date: Thu, 22 Mar 2012 09:24:40 +0000
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To: 
References: <20120322001725.GA10391@pantoffel-wg.de>
Message-ID: <4F6AEFD8.5030106@hotpy.org>

anatoly techtonik wrote:
> On Thu, Mar 22, 2012 at 3:17 AM, Sven Marnach wrote:
>> anatoly techtonik wrote on Thu, 22 Mar 2012, at 00:03:47 +0300:
>>> Seems like the only way to get the caller name in Python is to use the
>>> inspect module [1]. It seems that this way is not recommended (I wonder
>>> why?) and there is no other.
>> Using introspection in production code is not recommended in general,
>> with some exceptions, since the main purpose of introspection is
>> debugging. *If* you want to do this at all, the recommended way is to
>> use the inspect module. (And there is another way -- at least in
>> CPython you can use 'sys._getframe().f_back.f_code.co_name'.)
>
> If sys._getframe() is not available in Jython, IronPython and PyPy,
> then I won't use that for debug-mode output in libraries.

sys._getframe() is available in PyPy and Jython; I'm not sure whether
IronPython supports it, but I think it does.

A simpler and more concrete suggestion might be to simply rename
sys._getframe() as sys.getframe(), since all the main implementations
support it anyway.

Cheers,
Mark.
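For reference, minimal sketches of the two mechanisms mentioned above - the frame walk (CPython-flavoured, though PyPy and Jython provide it as well) and the Python 3.2 logging option; both are illustrations, not library code:

import sys
import logging

def debug_note(msg):
    # frame 1 is whoever called debug_note()
    caller = sys._getframe(1).f_code.co_name
    print('%s (from %s)' % (msg, caller))

# the logging alternative Nick mentions (Python 3.2+):
logging.basicConfig(level=logging.INFO)
logging.info('checkpoint reached', stack_info=True)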
From flub at devork.be Thu Mar 22 11:00:15 2012
From: flub at devork.be (Floris Bruynooghe)
Date: Thu, 22 Mar 2012 10:00:15 +0000
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To: <4F6AEFD8.5030106@hotpy.org>
References: <20120322001725.GA10391@pantoffel-wg.de> <4F6AEFD8.5030106@hotpy.org>
Message-ID: 

On 22 March 2012 09:24, Mark Shannon wrote:
> A simpler and more concrete suggestion might be to simply rename
> sys._getframe() as sys.getframe(), since all the main implementations
> support it anyway.

But at least now those of us using it know they're treading on thin ice.
E.g. for some reason sys._current_frames() is not available as widely as
sys._getframe().

Floris

-- 
Debian GNU/Linux -- The Power of Freedom
www.debian.org | www.gnu.org | www.kernel.org

From sven at marnach.net Thu Mar 22 13:55:55 2012
From: sven at marnach.net (Sven Marnach)
Date: Thu, 22 Mar 2012 12:55:55 +0000
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To: 
References: <20120322001725.GA10391@pantoffel-wg.de>
Message-ID: <20120322125555.GA3076@pantoffel-wg.de>

Nick Coghlan wrote on Thu, 22 Mar 2012, at 18:01:31 +1000:
> Once again, you're painting with a broad sweeping brush "ah, it's
> horrible, it's too hard to do anything worthwhile" (despite the
> obvious contradiction that many projects already implement such things
> quite effectively) instead of suggesting *small*, *incremental*
> changes that might actually improve the status quo.

To be fair, Anatoly *did* suggest an incremental change -- adding the
function caller_name(). And I think this raises a good point: It is
outright impossible now to access the qualified name of a calling
function. Adding a 'co_qualname' attribute to code objects would
facilitate this.

Another point Anatoly raised is that the mere function names mentioned
in tracebacks are often less useful than fully qualified names. So how
about including the qualified names in the traceback?

Cheers,
Sven

From guido at python.org Thu Mar 22 14:51:34 2012
From: guido at python.org (Guido van Rossum)
Date: Thu, 22 Mar 2012 06:51:34 -0700
Subject: [Python-ideas] Exact integral types in struct
In-Reply-To: 
References: <4F6975CB.5070500@canterbury.ac.nz> <4F69A9F3.4030202@kozea.fr>
Message-ID: 

Oh, but this is NOT a default! The default is system-local. Agreed on
clarifying the docstring. --Guido

On Thursday, March 22, 2012, Nick Coghlan wrote:
> On Thu, Mar 22, 2012 at 5:07 PM, Georg Brandl wrote:
>> On 21.03.2012 13:36, Serhiy Storchaka wrote:
>>> 21.03.12 12:14, Simon Sapin wrote:
>>>> I don't see this problem when reading the documentation. The idea of
>>>> "standard" size is introduced in section 7.3.2.1:
>>>
>>> Again, it is all because of my carelessness. I looked at ``pydoc
>>> struct``, not the library documentation.
>>
>> Well, if "pydoc struct" is not self-contained and mentions "standard
>> size" without defining it, that is still a bug. At the very least it
>> would have to refer to the library docs for what the standard size is.
>
> The broader question of whether the docs might be better rephrased to
> say "default size" rather than "standard size" still stands, though
> (since 'default' is a more typical word for "we defined a value that
> is used automatically if you don't explicitly specify an alternative"
> than 'standard')
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>

-- 
--Guido van Rossum (python.org/~guido)

From amcnabb at mcnabbs.org Thu Mar 22 18:37:12 2012
From: amcnabb at mcnabbs.org (Andrew McNabb)
Date: Thu, 22 Mar 2012 11:37:12 -0600
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To: <4F6AEFD8.5030106@hotpy.org>
References: <20120322001725.GA10391@pantoffel-wg.de> <4F6AEFD8.5030106@hotpy.org>
Message-ID: <20120322173712.GA23302@mcnabbs.org>

On Thu, Mar 22, 2012 at 09:24:40AM +0000, Mark Shannon wrote:
> >If sys._getframe() is not available in Jython, IronPython and PyPy,
> >then I won't use that for debug-mode output in libraries.
>
> sys._getframe() is available in PyPy and Jython; I'm not sure whether
> IronPython supports it, but I think it does.
>
> A simpler and more concrete suggestion might be to simply rename
> sys._getframe() as sys.getframe(), since all the main implementations
> support it anyway.

Apparently sys._getframe is only partially supported in PyPy. It works
but probably shouldn't be encouraged because it has a huge performance
hit:

"sys._getframe(), sys.exc_info() work, but they give performance penalty
that can be huge, by disabling the JIT. Use only for specialized use
cases (like a debugger) where performance does not matter.

"One unobvious usecase where this is used is the logging module. Don't
use logging module if you want things to be fast."

https://bitbucket.org/pypy/pypy/wiki/JitFriendliness

-- 
Andrew McNabb
http://www.mcnabbs.org/andrew/
PGP Fingerprint: 8A17 B57C 6879 1863 DE55 8012 AB4D 6098 8826 6868

From andrew.svetlov at gmail.com Thu Mar 22 21:56:30 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Thu, 22 Mar 2012 22:56:30 +0200
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <7C67D7CD-0D70-4FD7-8591-D89A525830DB@gmail.com>
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com> <4F684F32.5080006@hotpy.org> <8406714B-876E-435E-84AB-716804C92387@gmail.com> <4F68977A.1000506@hotpy.org> <7C67D7CD-0D70-4FD7-8591-D89A525830DB@gmail.com>
Message-ID: 

From my perspective, Yury's patch is simple and clean. He described very
well why decorators (which are often implemented as functions with their
own __closure__) cannot be easily patched. While overriding __closure__
is not a common case, it's hard to do when you really need it. If
__code__ is already writable, there is no reason to prevent writing to
__closure__ as well. I am +1 on pushing the patch into 3.3.

On Tue, Mar 20, 2012 at 4:59 PM, Yury Selivanov wrote:
> On 2012-03-20, at 10:56 AM, Yury Selivanov wrote:
>
>> Because usually you write decorators as functions, not classes. And
>> when you do the former style, you usually do it in the following way:
>>
>> def decorator(func):
>>     def wrapper(*args, **kwargs):
>>         return func(*args, **kwargs)
>>
>>     functools.update_wrapper(wrapper, func)
>>     return wrapper
>>
>> Now, let's use it:
>>
>> @decorator
>> def some_func(): pass
>>
>> OK. At this point, the 'some_func' object has a '__wrapped__' attribute
>> that points to the original 'some_func' function.
>> But whatever you write
>> to 'some_func.__wrapped__' won't change anything, as the 'wrapper'
>> will continue to call the old 'some_func'. Instead of assigning
>> something to __wrapped__, we need to change it in-place, by doing
>> '__wrapped__.__closure__ = new_closure'.
>
> And as I told you in the first example: there is no problem when you
> have only one decorator. You can surely just return a new
> FunctionType().
>
> But when you have many of them, such as:
>
> @decorator3
> @your_magic_decorator_that_modifies_the_closure
> @decorator2
> @decorator1
> def some_func(): pass
>
> The only way to modify the __closure__ is to write to the __wrapped__
> attribute of the 'decorator1' wrapper.
>
> -
> Yury
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-- 
Thanks,
Andrew Svetlov

From ethan at stoneleaf.us Thu Mar 22 22:20:51 2012
From: ethan at stoneleaf.us (Ethan Furman)
Date: Thu, 22 Mar 2012 14:20:51 -0700
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <7C67D7CD-0D70-4FD7-8591-D89A525830DB@gmail.com>
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com> <4F684F32.5080006@hotpy.org> <8406714B-876E-435E-84AB-716804C92387@gmail.com> <4F68977A.1000506@hotpy.org> <7C67D7CD-0D70-4FD7-8591-D89A525830DB@gmail.com>
Message-ID: <4F6B97B3.8070600@stoneleaf.us>

+1 to writable __closure__.

~Ethan~

From barry at python.org Fri Mar 23 01:34:32 2012
From: barry at python.org (Barry Warsaw)
Date: Thu, 22 Mar 2012 20:34:32 -0400
Subject: [Python-ideas] My objections to implicit package directories
References: 
Message-ID: <20120322203432.282ca205@limelight.wooz.org>

On Mar 13, 2012, at 09:15 AM, Guido van Rossum wrote:

>And at the end of the day I still really, really, really hate
>directories with a suffix.

I completely agree, for all the reasons you stated. Especially because
it would be extremely difficult to handle migrations from a
pre-namespace-packages world to a post-namespace-packages world with
directory suffixes.

For example, let's say Debian/Ubuntu supports Python 3.2 and 3.3. We can
continue to craft __init__.py files with the old-style namespace package
code at installation time and pretty much do what we're currently doing.
It's painful but the technology is there, so it doesn't change much for
us. But when we can drop support for < 3.3 (or we backport namespace
package support to 3.2) then we can simply drop the code that creates
these __init__.py files at installation time and we'll magically gain
support for new-style namespace packages.

With directory suffixes, I don't see how this is possible. I shudder to
think what OS vendors will have to do to rename all the directories of
*installed* packages, let alone have to rebuild all Python 3 packages to
support the renamed directories, when they make the switch to a
new-style world.

catching-up-ly y'rs,
-Barry
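For context, the old-style namespace package code Barry mentions is conventionally one of two well-known snippets dropped into the generated __init__.py; a sketch of the pkgutil variant:

# __init__.py crafted at installation time (the pkgutil variant;
# __import__('pkg_resources').declare_namespace(__name__) is the other
# common form)
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)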
From barry at python.org Fri Mar 23 01:40:14 2012
From: barry at python.org (Barry Warsaw)
Date: Thu, 22 Mar 2012 20:40:14 -0400
Subject: [Python-ideas] Save memory when forking with *really* immutable objects
References: <4F5E9074.70701@hastings.org>
Message-ID: <20120322204014.66b9ddc6@limelight.wooz.org>

On Mar 12, 2012, at 08:49 PM, Gregory P. Smith wrote:

>If reference counts were moved out of the PyObject structure into a region
>of memory allocated specifically for reference counts, only those pages
>would need copying rather than virtually every random page of memory
>containing a PyObject. My initial thought was to do this by turning the
>existing refcount field into a pointer to the object's count or an array
>reference that code managing the reference count array would use to
>manipulate the count. Obviously either of these would have some
>performance impact and break the ABI.

It's been *ages* since I really knew how any of this worked, but I think
some flavor of the Objective-C runtime did reference counting this way.
I think this afforded them other tricks, like the ability to not
increment the refcount for an object if it was exactly 1. I've no doubt
someone here will fill in all my faulty memory and gaps, but I do seem
to recall it being a pretty efficient system for memory management.

Cheers,
-Barry
Also from the world of "hazy memories of old discussions", my recollection is that the two main problems with the indirection are: - an extra pointer indirection for every refcounting operation (which are frequent enough that the micro-pessimisation has a measurable effect) - some loss of cache locality (since every Python object will need both its own memory and its refcount memory in the cache) Larry's suggestion for allowing eternal objects avoids the latter problem, but still suffers from (a variant of) the first. As the many GIL discussions can attest, we're generally very reluctant to accept a single-threaded (or, in this case, single process) performance hit to improve behaviour in the concurrent case. Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ncoghlan at gmail.com Fri Mar 23 02:22:13 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 23 Mar 2012 11:22:13 +1000 Subject: [Python-ideas] sys.implementation In-Reply-To: References: Message-ID: On Thu, Mar 22, 2012 at 4:37 AM, Benjamin Peterson wrote: > Eric Snow writes: >> >> I'd like to move this forward, so any objections or feedback at this >> point would be helpful. > > I would like to see a concrete proposal of what would get put in there. +1 A possible starting list: - impl name (with a view to further standardising the way we check for impl specific tests in the regression test suite) - impl version (official place to store the implementation version, potentially independent of the language version as it already is in PyPy) - cache tag (replacement for imp.get_tag()) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? Brisbane, Australia From ericsnowcurrently at gmail.com Fri Mar 23 04:44:57 2012 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Thu, 22 Mar 2012 20:44:57 -0700 Subject: [Python-ideas] make __closure__ writable In-Reply-To: References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com> <4F684F32.5080006@hotpy.org> <8406714B-876E-435E-84AB-716804C92387@gmail.com> Message-ID: On Tue, Mar 20, 2012 at 7:31 AM, Yury Selivanov wrote: > On 2012-03-20, at 9:15 AM, Nick Coghlan wrote: > >> On Tue, Mar 20, 2012 at 11:06 PM, Yury Selivanov >> wrote: >>> Again, since the __code__ attribute is modifiable, and __closure__ >>> works in tight conjunction with it, I see no point in protecting it. >> >> FWIW, I'm current +0 on the idea (based on Yury's reasoning here about >> updating already wrapped functions), but it's going to be a while >> before I can dig into the code and see if there are any *good* reasons >> we protect __closure__ (rather than that protection merely being an >> implementation artifact). I can't recall any off the top of my head, >> but that part of the code is complex enough that I don't completely >> trust my memory on that point. > > Well, it seems that it is an implementation artifact. ?I've run across > the CPython code, and got an impression that changing __closure__ to be > writeable won't break anything. ?Writing malicious values to it may > segfault python, but that's no different from doing so with __code__. > In some way, __closure__ is already writeable, since you can pass it > to the 'types.FunctionObject', and create a broken function, hence I > don't see a reason in making it readonly. 
> > The patch attached to the issue in the tracker passes all unittests, and > adds one more - specifically for testing __closure__ modification on a > live function object. -0 (for now :) "it won't break anything" isn't the only metric that should be considered here. Would this change make it too easy for people to write hard-to-debug code without realizing it? Would this create an attractive nuisance? I don't have enough Guido-fu to answer these questions. Though __code__ is already writable, I don't think it's a good comparison about which to reason. The object bound to __code__ is much more complex (and guarded) than the one bound to __closure__. I expect that it would be much easier to do the wrong thing with __closure__. As well, it seems to me that hacking __closure__ would be much more tempting. Will we see a "significantly" higher number of bugs about segfaults where we have to respond with "don't do that"? Probably not. But should any solution here guard (at some expense) against such mistakes that currently are much more difficult to make? Nick already alluded to double-checking the code somewhat to that effect. I'm not opposed in principle to making __closure__ writable, but worry that not all the angles are being considered here. -eric From ncoghlan at gmail.com Fri Mar 23 05:08:58 2012 From: ncoghlan at gmail.com (Nick Coghlan) Date: Fri, 23 Mar 2012 14:08:58 +1000 Subject: [Python-ideas] make __closure__ writable In-Reply-To: References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com> <4F684F32.5080006@hotpy.org> <8406714B-876E-435E-84AB-716804C92387@gmail.com> Message-ID: On Fri, Mar 23, 2012 at 1:44 PM, Eric Snow wrote: > Will we see a "significantly" higher number of bugs about segfaults > where we have to respond with "don't do that"? ?Probably not. ?But > should any solution here guard (at some expense) against such mistakes > that currently are much more difficult to make? ?Nick already alluded > to double-checking the code somewhat to that effect. Yes, while I'm in favour of the writable closure attribute idea in principle, the details of how we access the closure array are the kind of thing I'm worried about when I say I need to check the source code before commenting on the implementation details. Setting "f.__closure__ = []" is a lot easier than crafting the necessary bytecode to cause problems with the current setup, so "Can the new behaviour be abused to segfault CPython with pure Python code?" is exactly the right question to be asking. With Victor's recent work to close some longstanding segfault vulnerabilities, I really don't want us to be adding anything that goes in the other direction. However, I won't be doing that investigation myself until my broadband provider finally finishes setting up the connection at my new place, so if anyone wants to cast an appropriately paranoid eye over Yury's patch in the meantime, please go ahead :) Cheers, Nick. -- Nick Coghlan?? |?? ncoghlan at gmail.com?? |?? 
Brisbane, Australia

From ericsnowcurrently at gmail.com Fri Mar 23 07:19:29 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Thu, 22 Mar 2012 23:19:29 -0700
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <8406714B-876E-435E-84AB-716804C92387@gmail.com>
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com> <4F684F32.5080006@hotpy.org> <8406714B-876E-435E-84AB-716804C92387@gmail.com>
Message-ID: 

On Tue, Mar 20, 2012 at 6:06 AM, Yury Selivanov wrote:
> I did provide such an example earlier in this thread. I'm copying and
> pasting it into this mail. Please read the example carefully, as it
> explains why returning a new types.FunctionType() is not enough.
>
> ----
>
> Yes, your approach will work if your decorator is the only one applied.
> But, as I said, if you have many of them (see below), you can't just
> return a new function out of your decorator, you need to change the
> underlying function "in-place". Consider the following:
>
> def modifier(func):
>     orig_func = func
>
>     while hasattr(func, '__wrapped__'):
>         func = func.__wrapped__
>
>     # patch func.__code__ and func.__closure__
>     return orig_func  # no need to wrap anything
>
> def some_decorator(func):
>     def wrapper(*args, **kwargs):
>         # some code
>         return func(*args, **kwargs)
>     functools.update_wrapper(wrapper, func)
>     return wrapper
>
> @modifier
> @some_decorator
> def foo():
>     # this code needs to be verified/augmented/etc

Couldn't something like the following work?

def modifier(func):
    """Traverse the decorator "stack" and patch the bottom-most
    wrapped function."""
    # relies on __wrapped__ being set at each level of the decorator
    # stack and on the wrapped function being bound in func.__closure__.

    def make_cell(value):
        # build a new cell object holding 'value'
        return (lambda: value).__closure__[0]

    if not hasattr(func, "__wrapped__"):
        # patch func.__code__ and func.__closure__
        code = ...
        closure = ...
    else:
        code = func.__code__
        closure = list(func.__closure__)
        contents = [cell.cell_contents for cell in closure]
        closure[contents.index(func.__wrapped__)] = \
            make_cell(modifier(func.__wrapped__))

    return type(func)(code, func.__globals__, func.__name__,
                      func.__defaults__, tuple(closure))

Also, I'm guessing that your actual use-case looks more like the
following:

from some_third_party_module import foo
#assert foo.__wrapped__ == foo.__closure__[0].cell_contents
foo = modifier(foo)  # hacks up foo.__wrapped__

Hacking the innards of an existing function object is touchy stuff,
probably the riskiest kind of monkey-patching. You're basically taking
the chance of breaking (in ugly, unexpected ways) other code that uses
that function you just hacked. Still, there are certainly valid
use-cases (and we're all consenting adults here).

However, I'd hate for projects to start getting blamed for
difficult-to-debug problems that are the result of some other project
that did this sort of hack. It's nice when your expectations for a
function's behavior (or any code for that matter) can remain stable,
regardless of what libraries are installed along-side.

-eric

p.s. I want to reiterate my understanding that nearly everything
involving the internals of functions is pretty delicate (read: fragile),
in part due to being the focal point for optimization. Hacking it like
this is therefore a delicate undertaking and definitely skirts the edges
of creating implementation-specific code. Don't shy away. Just be extra
cautious.
> So, in the above snippet, if you don't want to discard
> @some_decorator by returning a new function object, you need to modify
> the 'foo' from the @modifier.
>
> In a complex framework, where you can't guarantee that your magic
> decorator will always be called first, rewriting the __closure__
> attribute is the only way.
>
> Again, since the __code__ attribute is modifiable, and __closure__
> works in tight conjunction with it, I see no point in protecting it.
>
> On 2012-03-20, at 5:34 AM, Mark Shannon wrote:
>
>> Yury Selivanov wrote:
>>> I've created an issue: http://bugs.python.org/issue14369
>>
>> I think that creating an issue may be premature, given that you have had
>> no positive feedback on the idea.
>>
>> I still think making __closure__ mutable is unnecessary.
>> If you insist that it is, then please provide an example which would
>> work with your proposed change, but cannot be made to work using
>> types.FunctionType() to create a new closure.
>>
>> Cheers,
>> Mark.
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> http://mail.python.org/mailman/listinfo/python-ideas
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

From techtonik at gmail.com Fri Mar 23 09:12:20 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Fri, 23 Mar 2012 11:12:20 +0300
Subject: [Python-ideas] Save memory when forking with *really* immutable objects
In-Reply-To: <4F5E9074.70701@hastings.org>
References: <4F5E9074.70701@hastings.org>
Message-ID: 

For those who have trouble understanding what all this memory pages
stuff is about - here is a good intro, "Python, Linkers, and Virtual
Memory" by Brandon Rhodes:
http://www.youtube.com/watch?v=twQKAoq2OPE
-- 
anatoly t.

From yselivanov.ml at gmail.com Fri Mar 23 16:33:42 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Fri, 23 Mar 2012 11:33:42 -0400
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: 
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F638871.1010603@hotpy.org> <0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com> <4F639370.1020609@hotpy.org> <93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com> <8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com> <4F684F32.5080006@hotpy.org> <8406714B-876E-435E-84AB-716804C92387@gmail.com>
Message-ID: <8FD7FAC6-2A32-4112-BDC1-9FB350C97D36@gmail.com>

On 2012-03-23, at 2:19 AM, Eric Snow wrote:
> Couldn't something like the following work?
>
> def modifier(func):
>     """Traverse the decorator "stack" and patch the bottom-most
>     wrapped function."""
>     # relies on __wrapped__ being set at each level of the decorator
>     # stack and on the wrapped function being bound in func.__closure__.
>
>     def make_cell(value):
>         # build a new cell object holding 'value'
>         return (lambda: value).__closure__[0]
>
>     if not hasattr(func, "__wrapped__"):
>         # patch func.__code__ and func.__closure__
>         code = ...
>         closure = ...
>     else:
>         code = func.__code__
>         closure = list(func.__closure__)
>         contents = [cell.cell_contents for cell in closure]
>         closure[contents.index(func.__wrapped__)] = \
>             make_cell(modifier(func.__wrapped__))
>
>     return type(func)(code, func.__globals__, func.__name__,
>                       func.__defaults__, tuple(closure))

Well, it's certainly possible to hack on this level too, but I wouldn't
do that ;) The only case I came up with after looking at your code was a
way of extracting the decorated function if the decorator didn't set the
__wrapped__ attribute. But even that is just an idea.
> Also, I'm guessing that your actual use-case looks more like the following:
>
>     from some_third_party_module import foo
>     #assert foo.__wrapped__ == foo.__closure__[0].cell_contents
>     foo = modifier(foo)  # hacks up foo.__wrapped__

I described our exact use case here:
http://mail.python.org/pipermail/python-ideas/2012-March/014550.html

> Hacking the innards of an existing function object is touchy stuff,
> probably the riskiest kind of monkey-patching.  You're basically
> taking the chance of breaking (in ugly, unexpected ways) other code
> that uses that function you just hacked.  Still, there are certainly
> valid use-cases (and we're all consenting adults here).

You're right.  Once you start to work on that level you are on your
own.  But if you're really forced to decide between the following three
options:

- introduce some really ugly syntax to support normal try..finally
  semantics in a coroutine
- patch CPython and restrict your framework from ever being used by
  anyone, and introduce additional hassles over deployment
- work with the Python __code__ object to get around the problem

I vote for the last option.  If you have good unittests you shouldn't
experience any problems, at least between major Python releases.

> However, I'd hate for projects to start getting blamed for
> difficult-to-debug problems that are the result of some other project
> that did this sort of hack.  It's nice when your expectations for a
> function's behavior (or any code for that matter) can remain stable,
> regardless of what libraries are installed alongside.

Well, that's a much more complicated problem.  I don't think that
making __closure__ writable will make it any worse.  Once again, you
can screw up the __code__ object even now ;)

- Yury

From yselivanov.ml at gmail.com  Fri Mar 23 16:59:43 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Fri, 23 Mar 2012 11:59:43 -0400
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: 
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com>
	<4F638871.1010603@hotpy.org>
	<0A6B28B2-C9FC-4835-9821-3F7079742AAE@gmail.com>
	<4F639370.1020609@hotpy.org>
	<93E337A9-1326-41DB-9C0D-281D78C611A4@gmail.com>
	<8FBB3A4C-E5B4-4EC1-B5B5-70E2DEA2A23B@gmail.com>
	<4F684F32.5080006@hotpy.org>
	<8406714B-876E-435E-84AB-716804C92387@gmail.com>
Message-ID: <7AC2D8B5-609F-4878-99BA-A55477727794@gmail.com>

On 2012-03-23, at 12:08 AM, Nick Coghlan wrote:
> On Fri, Mar 23, 2012 at 1:44 PM, Eric Snow wrote:
>> Will we see a "significantly" higher number of bugs about segfaults
>> where we have to respond with "don't do that"?  Probably not.  But
>> should any solution here guard (at some expense) against such mistakes
>> that currently are much more difficult to make?  Nick already alluded
>> to double-checking the code somewhat to that effect.
>
> Yes, while I'm in favour of the writable closure attribute idea in
> principle, the details of how we access the closure array are the kind
> of thing I'm worried about when I say I need to check the source code
> before commenting on the implementation details.  Setting
> "f.__closure__ = []" is a lot easier than crafting the necessary
> bytecode to cause problems with the current setup, so "Can the new
> behaviour be abused to segfault CPython with pure Python code?" is
> exactly the right question to be asking.

Well, it does look easier, but on the other hand you can substitute the
__code__ object with just two more lines, and it may be as fatal as
writing a wrong __closure__.
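(For illustration, a harmless variant of those "two more lines" -- swapping __code__ between two closure-free functions works today; the crashes come from mismatching a code object's free variables with the attached closure:)

    def f():
        return "from f"

    def g():
        return "from g"

    f.__code__ = g.__code__  # __code__ is already writable
    print(f())               # -> "from g"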
> With Victor's recent work to close some longstanding segfault
> vulnerabilities, I really don't want us to be adding anything that
> goes in the other direction. However, I won't be doing that
> investigation myself until my broadband provider finally finishes
> setting up the connection at my new place, so if anyone wants to cast
> an appropriately paranoid eye over Yury's patch in the meantime,
> please go ahead :)

I get the point of segfault vulnerabilities and don't want to introduce
any new ways either.  Will adding two checks:

*) in closure_setter, check that all items in the tuple are cells AND
that the length of the new __closure__ is greater than or equal to the
number of freevars in __code__;

*) in code_setter, check that the number of freevars in __code__ is
less than or equal to the length of __closure__;

be sufficient?  This way you won't be able to segfault Python, but will
have the freedom to manipulate __closure__ by introducing more vars to
it.

- Yury

From lukas.lueg at googlemail.com  Fri Mar 23 17:50:24 2012
From: lukas.lueg at googlemail.com (Lukas Lueg)
Date: Fri, 23 Mar 2012 17:50:24 +0100
Subject: [Python-ideas] EBNF-module in stdlib
Message-ID: 

Hi.  By all accounts, Python has many strong and flexible ways to deal
with parsing of semi-structured data.  Applications range from csv- and
configuration-files over log- and report-data to all kinds of
semi-structured formats.  As far as I can see, however, the standard
library either provides highly specialized modules (e.g. configparser,
csv, json) or drives you off into very generic approaches (e.g. re) for
you to implement the rest or use external modules.

I started looking into RPython some time ago and found the included
parser-module highly useful.  It provides an (again extended)
EBNF-parser with capable visitors to get parse-trees out of a simple
BNF syntax.

Is it worth the discussion to get our own EBNF-module into the standard
library, so people who otherwise rely on pure regular expressions can
build capable parsers with little effort?  External modules (like
simpleparse) and even internal modules like the ones mentioned above
may even benefit from that.

Regards
Lukas

From stefan_ml at behnel.de  Fri Mar 23 18:35:12 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 23 Mar 2012 18:35:12 +0100
Subject: [Python-ideas] EBNF-module in stdlib
In-Reply-To: 
References: 
Message-ID: 

Lukas Lueg, 23.03.2012 17:50:
> I started looking into RPython some time ago and found the included
> parser-module highly useful.  It provides an (again extended)
> EBNF-parser with capable visitors to get parse-trees out of a simple
> BNF syntax.

There are a number of capable parser packages for Python, including
pyparsing and some lex/yacc-like parsers.  I don't think the parser you
mentioned above beats them in terms of widespread application, but that
would usually be a good sign for something being worth adding to the
stdlib.

Also note that there are certain requirements for a new stdlib package,
including the donation by the author and the availability of an active
maintainer for it as part of the stdlib.

Stefan

From miki.tebeka at gmail.com  Fri Mar 23 18:56:09 2012
From: miki.tebeka at gmail.com (Miki Tebeka)
Date: Fri, 23 Mar 2012 10:56:09 -0700
Subject: [Python-ideas] Add "default" keyword to itemgetter and attrgetter
Message-ID: 

Greetings,

I apologize if this appears twice; Google Groups gave me an error on
the first one and I want to make sure this makes it through.

Repeating http://bugs.python.org/?issue14384 ....
This way they will behave more like getattr and the dictionary get.

If default is not specified and the item/attr is not found, an
exception will be raised, which is the current behavior.

However, if default is specified and the item/attr is not found, the
default value will be returned.

I wanted this when trying to get configuration from a list of objects.
I'd like to do
    get = attrgetter('foo', default=None)
    return get(args) or get(config) or get(env)

In the case of multiple items/attrs you return the default in their place:
    attrgetter('x', 'y', default=7)(None) => (7, 7)

In the case of a dotted attribute it'll again return the default value
if any of the attributes is not found:
    attrgetter('x.y', default=7)(None) => 7

BTW: This is inspired by Clojure's get-in (http://bit.ly/GGzqjh) function.

Thanks,
--
Miki

From stefan_ml at behnel.de  Fri Mar 23 19:08:00 2012
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Fri, 23 Mar 2012 19:08:00 +0100
Subject: [Python-ideas] Add "default" keyword to itemgetter and attrgetter
In-Reply-To: 
References: 
Message-ID: 

Miki Tebeka, 23.03.2012 18:56:
> I apologize if this appears twice; Google Groups gave me an error on
> the first one and I want to make sure this makes it through.
>
> Repeating http://bugs.python.org/?issue14384 ....
>
> This way they will behave more like getattr and the dictionary get.

Note that this has been discussed before.  Look for a thread titled
"defaultattrgetter".

Stefan

From raymond.hettinger at gmail.com  Fri Mar 23 20:41:52 2012
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Fri, 23 Mar 2012 12:41:52 -0700
Subject: [Python-ideas] Add "default" keyword to itemgetter and attrgetter
In-Reply-To: 
References: 
Message-ID: <4A405C5F-4713-4D4B-ABE2-C28C8951537A@gmail.com>

On Mar 23, 2012, at 10:56 AM, Miki Tebeka wrote:
> Repeating http://bugs.python.org/?issue14384 ....
>
> This way they will behave more like getattr and the dictionary get.
>
> If default is not specified and the item/attr is not found, an
> exception will be raised, which is the current behavior.
>
> However, if default is specified and the item/attr is not found, the
> default value will be returned.
>
> I wanted this when trying to get configuration from a list of objects.
> I'd like to do
>     get = attrgetter('foo', default=None)
>     return get(args) or get(config) or get(env)
>
> In the case of multiple items/attrs you return the default in their place:
>     attrgetter('x', 'y', default=7)(None) => (7, 7)

Miki, I'm -1 on this proposal.  It would have been a reasonable
suggestion if itemgetter() and attrgetter() accepted only a single
argument and invoked only a single-step lookup.  However, the
suggestion makes much less sense given that the current API allows
calls like itemgetter(5,1,2) and attrgetter('a.b.c', 'd.e').  In that
context, the proposals for a default argument make no sense at all.
It is entirely unclear what the default should mean in those cases.

The itemgetter() and attrgetter() factories are specializations
designed to speedily handle some of the most common use cases without
using a lambda or a def.  But when more exotic logic is needed, a
programmer is better off writing a plain function which provides
greater clarity and more flexibility than a specialized function
factory.  Overgeneralizing itemgetter() and attrgetter() would also
have the negative effect of making them harder to learn, harder to
read, and harder to debug.
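(As a minimal illustration of the plain-function alternative recommended above -- the helper name attr_or_default is made up, not a stdlib API:)

    def attr_or_default(obj, name, default=None):
        # Follow a dotted attribute path, returning default on any miss.
        for part in name.split('.'):
            try:
                obj = getattr(obj, part)
            except AttributeError:
                return default
        return obj

    class Config:
        class db:
            host = 'localhost'

    print(attr_or_default(Config, 'db.host'))        # -> localhost
    print(attr_or_default(Config, 'db.port', 5432))  # -> 5432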
In the case of multistep lookups (x.y) or multiple arguments (5,1,2),
it isn't clear what the default argument should do and which exceptions
should be suppressed.  This is a case where explicit really is better
than implicit.

Raymond
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From guido at python.org  Sat Mar 24 00:29:53 2012
From: guido at python.org (Guido van Rossum)
Date: Fri, 23 Mar 2012 16:29:53 -0700
Subject: [Python-ideas] EBNF-module in stdlib
In-Reply-To: 
References: 
Message-ID: 

On Fri, Mar 23, 2012 at 10:35 AM, Stefan Behnel wrote:
> Lukas Lueg, 23.03.2012 17:50:
>> I started looking into RPython some time ago and found the included
>> parser-module highly useful.  It provides an (again extended)
>> EBNF-parser with capable visitors to get parse-trees out of a simple
>> BNF syntax.
>
> There are a number of capable parser packages for Python, including
> pyparsing and some lex/yacc-like parsers.  I don't think the parser you
> mentioned above beats them in terms of widespread application, but that
> would usually be a good sign for something being worth adding to the stdlib.

Also check out Lib/lib2to3/pgen2 in the repo.  It comes with a Python
grammar but can be adapted to any grammar you want.

> Also note that there are certain requirements for a new stdlib package,
> including the donation by the author and the availability of an active
> maintainer for it as part of the stdlib.

--
--Guido van Rossum (python.org/~guido)

From sven at marnach.net  Sat Mar 24 14:14:55 2012
From: sven at marnach.net (Sven Marnach)
Date: Sat, 24 Mar 2012 13:14:55 +0000
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To: <20120322203432.282ca205@limelight.wooz.org>
References: <20120322203432.282ca205@limelight.wooz.org>
Message-ID: <20120324131455.GB3076@pantoffel-wg.de>

Barry Warsaw schrieb am Do, 22. Mär 2012, um 20:34:32 -0400:
> With directory suffixes, I don't see how this is possible. I shudder to think
> what OS vendors will have to do to rename all the directories of *installed*
> packages, let alone have to rebuild all Python 3 packages to support the
> renamed directories, when they make the switch to a new-style world.

You would just add symlinks with extension to the directories without
extension, wouldn't you?  These symlinks could be placed in
version-specific directories to avoid any further confusion.

--
Sven

From ironfroggy at gmail.com  Sun Mar 25 16:15:05 2012
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Sun, 25 Mar 2012 10:15:05 -0400
Subject: [Python-ideas] Implied try blocks
Message-ID: 

Given that for some time the try/except/else form is standard and
encouraged, I've found the following pattern really common in my code:

try:
    r = some_single_statement()
except TypeError:
    print "oh no!"
    raise OhNoException()
else:
    p = prepare(r)
    print "I got", p

A try block that ever does more than one thing feels dangerous or sloppy,
because I want to make sure I know exactly where the exception comes from.
The else block becomes the long tail and the try is just the head.  This
makes the try block itself seem heavy.

What if we allowed this to be implied and except/else blocks bound to the
previous statement?  A try block would be an optional form, and mostly
left for multi-block try's

r = some_single_statement()
except TypeError:
    print "oh no!"
    raise OhNoException()
else:
    p = prepare(r)
    print "I got", p

I think it reads acceptably.  With a try: block your eye leads up right
to that one statement.
There is no ambiguity to deal with, that I can tell.  I'm not sure if
this is a great idea, but I don't dislike it.

--
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy

From steve at pearwood.info  Sun Mar 25 17:20:25 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 26 Mar 2012 02:20:25 +1100
Subject: [Python-ideas] Implied try blocks
In-Reply-To: 
References: 
Message-ID: <4F6F37B9.7060906@pearwood.info>

Calvin Spealman wrote:
> Given that for some time the try/except/else form is standard and
> encouraged, I've found the following pattern really common in my code:
>
> try:
>     r = some_single_statement()
> except TypeError:
>     print "oh no!"
>     raise OhNoException()
> else:
>     p = prepare(r)
>     print "I got", p

In this case, the "else" is not required.  This could be written more
simply as:

try:
    r = some_single_statement()
except TypeError:
    print "oh no!"
    raise OhNoException()
p = prepare(r)
print "I got", p

The else is only required when the except blocks don't exit via a
return or raise.  Otherwise, you can just continue outside of the
try...except block.

> A try block that ever does more than one thing feels dangerous or sloppy,
> because I want to make sure I know exactly where the exception comes from.
> The else block becomes the long tail and the try is just the head.  This
> makes the try block itself seem heavy.

Huh?  The try block is two lines, one if you don't count the "try"
itself.  How is that heavy?

> What if we allowed this to be implied and except/else blocks bound to
> the previous statement?  A try block would be an optional form, and
> mostly left for multi-block try's
>
> r = some_single_statement()
> except TypeError:
>     print "oh no!"
>     raise OhNoException()
> else:
>     p = prepare(r)
>     print "I got", p
>
> I think it reads acceptably.  With a try: block your eye leads up right
> to that one statement.

-1

With the current syntax, when reading code, you know when you enter a
try block because you read "try:" just before the block begins.  With
your suggestion, the try is a booby-trap for the unwary.  You're
reading code, you read the "r = ..." line which looks like a normal
line outside of a try block, and then *blam* you trip over the "except"
and have to mentally backtrack and re-interpret the previous line "oh,
it's actually inside a try block -- an invisible try block".

In my opinion, this sort of implicit change of semantics is exactly the
kind of thing that the Zen of Python warns us against.

Your suggestion will also mask errors:

try:
    do_stuff()
except NameError:
    first_except_block()
except ValueError:
    second_except_block()
and_another_line()
and_a_third_line()  # oops, I forgot to indent
except TypeError:
    ...

Currently, this will be a SyntaxError.  With your suggestion, it will
silently do the wrong thing.

--
Steven

From solipsis at pitrou.net  Sun Mar 25 17:19:05 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 25 Mar 2012 17:19:05 +0200
Subject: [Python-ideas] Implied try blocks
References: 
Message-ID: <20120325171905.291c0519@pitrou.net>

On Sun, 25 Mar 2012 10:15:05 -0400
Calvin Spealman wrote:
>
> What if we allowed this to be implied and except/else blocks bound to
> the previous statement?  A try block would be an optional form, and
> mostly left for multi-block try's
>
> r = some_single_statement()
> except TypeError:
>     print "oh no!"
>     raise OhNoException()
> else:
>     p = prepare(r)
>     print "I got", p

Saving one line and four non-whitespace characters?
-1 from me.

cheers

Antoine.

From masklinn at masklinn.net  Sun Mar 25 17:32:22 2012
From: masklinn at masklinn.net (Masklinn)
Date: Sun, 25 Mar 2012 17:32:22 +0200
Subject: [Python-ideas] Implied try blocks
In-Reply-To: <20120325171905.291c0519@pitrou.net>
References: 
	<20120325171905.291c0519@pitrou.net>
Message-ID: <22679E21-C942-47E0-AFAC-9DA6D9ECE7CC@masklinn.net>

On 2012-03-25, at 17:19 , Antoine Pitrou wrote:
> On Sun, 25 Mar 2012 10:15:05 -0400
> Calvin Spealman
> wrote:
>>
>> What if we allowed this to be implied and except/else blocks bound to
>> the previous statement?  A try block would be an optional form, and
>> mostly left for multi-block try's
>>
>> r = some_single_statement()
>> except TypeError:
>>     print "oh no!"
>>     raise OhNoException()
>> else:
>>     p = prepare(r)
>>     print "I got", p
>
> Saving one line and four non-whitespace characters?
> -1 from me.

Just the characters, since the single statement could be put on the
same line as the try:

try: r = some_single_statement()
except TypeError:
    print "oh no!"
    raise OhNoException()
p = prepare(r)
print "I got", p

From victor.stinner at gmail.com  Mon Mar 26 02:32:15 2012
From: victor.stinner at gmail.com (Victor Stinner)
Date: Mon, 26 Mar 2012 02:32:15 +0200
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com>
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com>
Message-ID: 

2012/3/16 Yury Selivanov :
> Can we make the __closure__ attribute writeable? Since __code__ already is, ...

I never understood why __code__ is writable.  What is the use case of
modifying the code of an existing function?

Victor

From yselivanov.ml at gmail.com  Mon Mar 26 05:04:02 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Sun, 25 Mar 2012 23:04:02 -0400
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: 
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com>
Message-ID: 

On 2012-03-25, at 8:32 PM, Victor Stinner wrote:
> 2012/3/16 Yury Selivanov :
>> Can we make the __closure__ attribute writeable? Since __code__ already is, ...
>
> I never understood why __code__ is writable.  What is the use case of
> modifying the code of an existing function?

Metaprogramming in its various forms.  Read this thread and you'll find
one use case.

- Yury

From ronaldoussoren at mac.com  Mon Mar 26 10:49:18 2012
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Mon, 26 Mar 2012 10:49:18 +0200
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To: 
References: 
	<276F7BB8-E4B3-46DF-A3F5-62953C7B4227@mac.com>
Message-ID: 

On 21 Mar, 2012, at 15:22, Brett Cannon wrote:
>
> On Tue, Mar 20, 2012 at 11:49, Ronald Oussoren wrote:
>
> On 13 Mar, 2012, at 9:15, Guido van Rossum wrote:
>
>> I think it comes down to this: I really, really, really hate
>> directories with a suffix. I'd like to point out that the suffix is
>> also introducing a backwards incompatibility: everybody will have to
>> teach their tools, IDEs, and brains about .pyp directories,
>
> Directories with a suffix have the advantage that you could teach GUIs to treat
> them differently, filemanagers could for example show a ".pyp" directory as
> a folder with a python logo just like ".py" files are shown as documents with
> a python logo.
>
> OS X has made me dislike that possibility.
Some git tools will make directories ending in .git be considered an
opaque object in the file system, forcing me to drop into a shell or
right-click and choose to inspect the directory in order to see its
contents.

That's probably because those tools define ".git" directories as a
package in their metadata and the Finder won't show package contents by
default (you can use the Finder's context menu to inspect the contents
of packages, but that won't work in the file open/save panels).

I'd have to experiment to be sure, but IIRC it is possible to assign
icons to a suffix without making directories into packages.

Ronald

>
> -Brett
>
> With the implicit approach it is much harder to recognize python packages as
> such without detailed knowledge about the import algorithm and python search
> path.
>
> Ronald
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ronaldoussoren at mac.com  Mon Mar 26 10:45:50 2012
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Mon, 26 Mar 2012 10:45:50 +0200
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To: 
References: 
	<276F7BB8-E4B3-46DF-A3F5-62953C7B4227@mac.com>
Message-ID: <6D42492D-D14A-400A-A85A-BCE4781F1BB8@mac.com>

On 21 Mar, 2012, at 4:13, Chris Rebert wrote:
> On Tue, Mar 20, 2012 at 10:42 AM, Terry Reedy wrote:
>> On 3/20/2012 11:49 AM, Ronald Oussoren wrote:
>>> On 13 Mar, 2012, at 9:15, Guido van Rossum wrote:
>>>> I think it comes down to this: I really, really, really hate
>>>> directories with a suffix. I'd like to point out that the suffix
>>>> is also introducing a backwards incompatibility: everybody will
>>>> have to teach their tools, IDEs, and brains about .pyp
>>>> directories,
>>>
>>> Directories with a suffix have the advantage that you could teach
>>> GUIs to treat them differently, filemanagers could for example show a
>>> ".pyp" directory as a folder with a python logo just like ".py"
>>> files are shown as documents with a python logo.
>>>
>>> With the implicit approach it is much harder to recognize python
>>> packages as such without detailed knowledge about the import
>>> algorithm and python search path.
>>
>> Package directories are files and can be imported to make modules. I think
>> it would have been nice to use .pyp from the beginning. It would make Python
>> easier to learn. Also, 'import x' would simply mean "Search sys.path
>> directories for a file named 'x.py*'", with no need for either the importer
>> (or human reader) to look within directories for the magic __init__.py file.
>> Sorting a directory listing by extension would sort all packages together.
>
> Your file manager views directories as having filename extensions?
> Mine sure doesn't.

Yes. On what platform are you?  On unixy platforms filename extensions
are just a naming convention that can just as easily be used with
directories.

Ronald

From techtonik at gmail.com  Mon Mar 26 12:59:47 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 26 Mar 2012 13:59:47 +0300
Subject: [Python-ideas] Implied try blocks
In-Reply-To: 
References: 
Message-ID: 

On Sun, Mar 25, 2012 at 5:15 PM, Calvin Spealman wrote:
>
> r = some_single_statement()
> except TypeError:
>     print "oh no!"
>     raise OhNoException()
> else:
>     p = prepare(r)
>     print "I got", p

-1, because..
When I'm reading this and encounter "except TypeError", I have to
update the code that I've already read (the stuff in my head) to place
it into an exception-handling block.  That's a good anti-pattern for
readability.
--
anatoly t.

From ironfroggy at gmail.com  Mon Mar 26 14:38:09 2012
From: ironfroggy at gmail.com (Calvin Spealman)
Date: Mon, 26 Mar 2012 08:38:09 -0400
Subject: [Python-ideas] Implied try blocks
In-Reply-To: 
References: 
Message-ID: 

On Mon, Mar 26, 2012 at 6:59 AM, anatoly techtonik wrote:
> On Sun, Mar 25, 2012 at 5:15 PM, Calvin Spealman wrote:
>>
>> r = some_single_statement()
>> except TypeError:
>>     print "oh no!"
>>     raise OhNoException()
>> else:
>>     p = prepare(r)
>>     print "I got", p
>
> -1, because..  When I'm reading this and encounter "except TypeError", I
> have to update the code that I've already read (the stuff in my head)
> to place it into an exception-handling block.  That's a good anti-pattern
> for readability.

The idea I was hoping to pull off here is that every statement was,
implicitly, a try block and any exception raised by it could have
handlers associated, and a block that would only follow if it did not
raise an exception.

However, I never thought it was a great idea, but this is python-ideas
not python-good-ideas ;-)  Steven has already convinced me it was a bad
idea from a maintenance standpoint.

> --
> anatoly t.

--
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://techblog.ironfroggy.com/
Follow me if you're into that sort of thing: http://www.twitter.com/ironfroggy

From guido at python.org  Mon Mar 26 16:57:37 2012
From: guido at python.org (Guido van Rossum)
Date: Mon, 26 Mar 2012 07:57:37 -0700
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To: <6D42492D-D14A-400A-A85A-BCE4781F1BB8@mac.com>
References: 
	<276F7BB8-E4B3-46DF-A3F5-62953C7B4227@mac.com>
	<6D42492D-D14A-400A-A85A-BCE4781F1BB8@mac.com>
Message-ID: 

On Mon, Mar 26, 2012 at 1:45 AM, Ronald Oussoren wrote:
> Yes. On what platform are you?  On unixy platforms filename extensions are just a naming convention that can just as easily be used with directories.

IIUC that's how almost all filesystems treat them.  However desktop
software often assigns specific meanings to them -- the user can
configure these, but there's a large set of predefined bindings too,
and many key applications also play this game (since there is,
frankly, not much else to go by -- some important file types are not
easily guessable by reading their content, either because it's some
esoteric binary format, or because it's something too universal, like
XML).  I know that's how it works on Windows and Mac, but I believe
the Linux desktop things (the things I kill off or at least ignore as
soon as I log in :-) have the same idea.

--
--Guido van Rossum (python.org/~guido)

From techtonik at gmail.com  Tue Mar 27 13:02:05 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Tue, 27 Mar 2012 14:02:05 +0300
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To: 
References: <20120322001725.GA10391@pantoffel-wg.de>
Message-ID: 

On Thu, Mar 22, 2012 at 11:01 AM, Nick Coghlan wrote:
>
> Trying to find out your calling function is hard because well-designed
> decoupled code with properly separated concerns *should never care who
> is calling it*, so improving the APIs for doing so is an *extremely
> low priority task*.
You're saying that somebody specifically designed this part to be hard,
because those who program properly should never request the name of the
calling function.  Is that what you wanted to say?

Correct me, but I think you were trying to say that nobody is
interested in improving the way to get the calling function, because
people should not write code that requests it.  There are two counter
arguments:
1. The de facto existence of code in the standard library that is used
a lot and that makes Python programs slow - logging
2. Practicality that beats purity.  Debugging and introspection
capabilities of the language are as important for development and
maintenance as writing well-designed decoupled code with properly
separated concerns (which mostly exists in an ideal world without time,
readability and performance constraints).

For people who study Python application programming that goes beyond
the function reference, most of the time is spent in reverse coding -
debugging and troubleshooting - and accessing the call stack without
unnecessary complications makes a big difference in understanding how
an application (and Python) works.

> There are some truly exceptional cases where it matters. For those,
> the onus is on the developers *implementing those exceptional cases*
> to figure out how to do this in a platform independent way (which may
> involve conditional code that does different things depending on the
> platform). These legitimate use cases are simply too rare for it to be
> worthwhile for the various implementations to agree on a standard way
> to expose the functionality and promise never to change it.

Are there objective statistics on how much a typical application
accesses call stack information, directly or indirectly?  I have a
Python checkout with its test suite - is there a way to patch the test
suite to count call stack accesses?

As I am not a C expert, I can't immediately see any obvious problems
with caching call stack information with qualified names on any
platform.  A straightforward idea is to annotate bytecode in a separate
memory page during compilation.

> If all you are trying to do is print some debugging info and you can
> use Python 3.2, then use the "stack_info=True" argument to the logging
> APIs. That's what it's for. If you can't use Python 3.2, then look at
> the way logging does it in 3.2 and see if you can adapt that for your
> own needs.

I could respond with "In the face of ambiguity, refuse the temptation
to guess.", but you're right - my original intention was to get the
name of the caller for writing debug info into a file (without the
logging module, though, as it seemed to be the reason for a deadlock I
was trying to debug).  But when I saw how hard it is to understand the
Python call stack from the code, how complicated it is to get to it
(and how much time I spent to get what I needed), and that in the end
information about the Python caller is _still incomplete_ - I thought
about other implications that this has on Python development as a
language in a broader sense.

There is the clojure-py project (a Lisp written in Python), RPython and
many other things, but understanding the differences between these
languages is obscured by the fact that the basic level in Python (on
top of which all these things are written) is too low-level.  Python
became a platform, and it would be nice for a platform to be
transparent and to offer convenient debugging capabilities (I tend to
do Python development on Fedora, because it provides nice Python
debugging commands in gdb).
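(For concreteness, a CPython-only sketch of the kind of caller_name() helper being discussed in this thread, built on the underscored sys._getframe; it is illustrative, not a proposed stdlib API:)

    import sys

    def caller_name(depth=1):
        # CPython-specific: sys._getframe is an underscored,
        # implementation-dependent API, as this thread discusses.
        # depth=1 -> the caller of the function invoking caller_name().
        return sys._getframe(depth + 1).f_code.co_name

    def inner():
        print(caller_name())  # -> 'outer'

    def outer():
        inner()

    outer()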
> There's no compelling reason that writing helpers for debugging call
> stacks needs to be easy. It should definitely be *possible*, but
> 99.99% of all Python programmers are just going to use the native
> debugging facilities of the interpreter, or use those of an IDE that
> someone else wrote.

Debugging facilities of the interpreter.. Huh.
inspect.stack()[2][0].f_locals['self'].__class__.__name__ - is that the
thing you keep in your head when breaking into the Python console to
get the name of a class for a calling method?  And what should be
invoked if the caller is not a method?

I find it convenient to quickly insert print statements (or probes) at
the points of interest to explore run-time behavior at these points
instead of inserting awkward logging imports and formatting calls all
over the place.  I also don't know which IDE can better help with
debugging a multi-threaded application in Python with Qt bindings.  So
I just want the native debugging capabilities of the interpreter to be
convenient and actually useful.

> Once again, you're painting with a broad sweeping brush "ah, it's
> horrible, it's too hard to do anything worthwhile" (despite the
> obvious contradiction that many projects already implement such things
> quite effectively) instead of suggesting *small*, *incremental*
> changes that might actually improve the status quo.

Well, I proposed to have a caller_name() method.  If it is not too
universal, then a sys.call_stack() could be a better try:

  sys.call_stack() - Get current call stack. Return the list with a
                     list of qualified names starting with the oldest.

Rationale: make logging calls and module based filters zero cost.
--
anatoly t.

From ncoghlan at gmail.com  Tue Mar 27 15:00:49 2012
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 27 Mar 2012 23:00:49 +1000
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To: 
References: <20120322001725.GA10391@pantoffel-wg.de>
Message-ID: 

On Tue, Mar 27, 2012 at 9:02 PM, anatoly techtonik wrote:
> On Thu, Mar 22, 2012 at 11:01 AM, Nick Coghlan wrote:
>>
>> Trying to find out your calling function is hard because well-designed
>> decoupled code with properly separated concerns *should never care who
>> is calling it*, so improving the APIs for doing so is an *extremely
>> low priority task*.
>
> You're saying that somebody specifically designed this part to be
> hard, because those who program properly should never request the name
> of the calling function. Is that what you wanted to say?

No, I'm saying the existence of frame objects and a call stack in the
first place is inherently an implementation detail and hence it is
impossible to provide full access to it in an implementation
independent fashion.  We don't want to make it part of the language
spec, because we want to give implementors freedom to break those APIs
and still claim compliance with the Python language spec (many
implementors try to provide those APIs *anyway*, but they're well
within their rights not to do so).

> Correct me, but I think you were trying to say that nobody is
> interested in improving the way to get the calling function, because
> people should not write code that requests it. There are two counter
> arguments:
> 1. The de facto existence of code in the standard library that is used
> a lot and that makes Python programs slow - logging

No, I am merely saying that such code is necessarily implementation
dependent.
As far as I am aware, the reason printing tracebacks (either directly
or via logging) slows PyPy down is *because it has to make sure the
stack exists to be printed*, thus forcing PyPy to turn off a bunch of
optimisations.  Similarly, IronPython by default doesn't create a full
frame stack - you have to specify additional options to request
creation of either full or partial frames during execution in order to
use modules that rely on those features.

> 2. Practicality that beats purity. Debugging and introspection
> capabilities of the language are as important for development and
> maintenance as writing well-designed decoupled code with properly
> separated concerns (which mostly exists in an ideal world without time,
> readability and performance constraints).
>
> For people who study Python application programming that goes beyond
> the function reference, most of the time is spent in reverse coding -
> debugging and troubleshooting - and accessing the call stack without
> unnecessary complications makes a big difference in understanding how
> an application (and Python) works.

The call stack is necessarily tightly coupled to the implementation,
and not all implementations will be easily able to introspect their
calling environment (since the ability to do so efficiently depends on
the underlying runtime).  Guido made a deliberate choice a long time
ago to exclude that feature from the language requirements by including
the leading underscore in sys._getframe().  That and the similar
underscore on sys._current_frames() are not an accident - they're there
to give implementation authors additional freedom in how they choose to
implement function calls (and that includes the authors of future
versions of CPython).

> As I am not a C expert, I can't immediately see any obvious problems
> with caching call stack information with qualified names on any
> platform. A straightforward idea is to annotate bytecode in a separate
> memory page during compilation.

You are thinking far too much about a single implementation here.  The
problem is not implementing call stack access for CPython (or any
implementation that can provide the sys._getframe() API without much
additional effort).  The problem is placing that constraint on all
current and future implementations of Python that claim conformance
with the language spec.

> Debugging facilities of the interpreter.. Huh.
> inspect.stack()[2][0].f_locals['self'].__class__.__name__ - is that the
> thing you keep in your head when breaking into the Python console to
> get the name of a class for a calling method?
> And what should be invoked if the caller is not a method?

No, I expect people to use pdb or just read the tracebacks printed when
an exception is thrown.  Runtime introspection of local variables
without the aid of a debugger or other tool is a seriously advanced
programming technique (and will *always* be highly implementation
dependent).
The following two options can be very useful for exploring problematic
parts of a program:

import pdb; pdb.set_trace()  # Break into pdb in a running program
import pdb; pdb.pm()         # Start pdb after an exception was thrown

(very useful in conjunction with the use of "python -i" to drop into
the interactive prompt instead of terminating)

>> Once again, you're painting with a broad sweeping brush "ah, it's
>> horrible, it's too hard to do anything worthwhile" (despite the
>> obvious contradiction that many projects already implement such things
>> quite effectively) instead of suggesting *small*, *incremental*
>> changes that might actually improve the status quo.
>
> Well, I proposed to have a caller_name() method. If it is not too
> universal, then a sys.call_stack() could be a better try:
>
>   sys.call_stack() - Get current call stack. Return the list with a
>                      list of qualified names starting with the oldest.
>
> Rationale: make logging calls and module based filters zero cost.

As others pointed out, I did indeed miss the concrete suggestions you
made.  The idea of making qualname an attribute of the code object
rather than just the function certainly has merit, although it would be
quite an extensive patch to achieve it (as it would involve changes to
several parts of the code generation and execution chain, from the AST
evaluation step through to the main eval loop and the function and
class object constructors).  Now, while 3.3 is still in alpha, is
definitely the right time for such a patch to be put forward (it would
be awkward, although possibly still feasible, to make such a change in
a backwards compatible way for 3.4 if the current implementation is
released as is for 3.3).  There are other potentially beneficial use
cases for the new qualname attribute as well (e.g. the suggestion of
using it in tracebacks instead of __name__ is a good idea).

As far as the call_stack() or caller_name() idea goes, it would
definitely be significantly less restrictive than requiring that
implementations expose the full frame machinery that CPython uses.
However, the other implementations (especially PyPy) would need to be
consulted regarding the possible performance implications.  Use cases
would also need to be investigated to ensure that just the qualified
name is sufficient.  Tracebacks and warnings, for example, require at
least the filename and line number, and other use cases may require the
module name.

Any such discussion really needs to be driven by the implications for
Python runtimes that don't natively use CPython-style execution frames,
to ensure providing a rarely used introspection API doesn't end up
slowing down *all* applications on those platforms.  That particular
performance concern doesn't affect CPython, since the de facto API for
stack introspection is the one that CPython uses internally anyway and
exposing it as is to users (albeit with the "here be dragons"
implementation dependence warning) is a relatively cheap exercise.

Regards,
Nick.

--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia

From steve at pearwood.info  Tue Mar 27 15:33:51 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 28 Mar 2012 00:33:51 +1100
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To: 
References: <20120322001725.GA10391@pantoffel-wg.de>
Message-ID: <4F71C1BF.8090209@pearwood.info>

anatoly techtonik wrote:
[...]
> Well, I proposed to have a caller_name() method.
> If it is not too universal, then a sys.call_stack() could be a better try:
>
>   sys.call_stack() - Get current call stack. Return the list with a
>                      list of qualified names starting with the oldest.
>
> Rationale: make logging calls and module based filters zero cost.

I don't have an opinion on whether this is a good idea or a bad idea,
but if people are going to make radical changes with respect to
inspecting the call stack, I think it would be much better to get a
list of the actual calling functions, not their names.

Reasons:

* If you have the function, you can do anything with it, including
getting the name, the docstring, any annotations it may have, etc., but
if all you have is the name, it isn't so easy, and it may not actually
be possible to unambiguously find the function object given only its
name.

* Not all functions have meaningful names (think lambdas); or their
name may be ambiguous (think functions produced by factory functions).

--
Steven

From g.rodola at gmail.com  Tue Mar 27 16:13:05 2012
From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=)
Date: Tue, 27 Mar 2012 16:13:05 +0200
Subject: [Python-ideas] Add socketserver running property
Message-ID: 

Hi all,
I would like to attract some attention to:
http://bugs.python.org/issue14375
Would this be acceptable?

Regards,

---
Giampaolo
http://code.google.com/p/pyftpdlib/
http://code.google.com/p/psutil/
http://code.google.com/p/pysendfile/

From robert.kern at gmail.com  Tue Mar 27 17:14:43 2012
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 27 Mar 2012 16:14:43 +0100
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To: <4F71C1BF.8090209@pearwood.info>
References: <20120322001725.GA10391@pantoffel-wg.de>
	<4F71C1BF.8090209@pearwood.info>
Message-ID: 

On 3/27/12 2:33 PM, Steven D'Aprano wrote:
> anatoly techtonik wrote:
> [...]
>> Well, I proposed to have a caller_name() method. If it is not too
>> universal, then a sys.call_stack() could be a better try:
>>
>>   sys.call_stack() - Get current call stack. Return the list with a
>>                      list of qualified names starting with the oldest.
>>
>> Rationale: make logging calls and module based filters zero cost.
>
> I don't have an opinion on whether this is a good idea or a bad idea, but if
> people are going to make radical changes with respect to inspecting the call
> stack, I think it would be much better to get a list of the actual calling
> functions, not their names.
>
> Reasons:
>
> * If you have the function, you can do anything with it, including getting the
> name, the docstring, any annotations it may have, etc., but if all you have is
> the name, it isn't so easy, and it may not actually be possible to unambiguously
> find the function object given only its name.
>
> * Not all functions have meaningful names (think lambdas); or their name may be
> ambiguous (think functions produced by factory functions).

One problem with this is that some of the intermediate frames (at the
very least, the root frame) won't be functions but rather executed code
in the form of module-level code or exec'ed strings.  Furthermore, in
CPython at least, it is difficult to go from the code object (which is
what you have available) to its owning function object (if there is
one).  You can get the (well, a) name from the code object, though.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From tjreedy at udel.edu  Tue Mar 27 18:01:03 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 27 Mar 2012 12:01:03 -0400
Subject: [Python-ideas] Add socketserver running property
In-Reply-To: 
References: 
Message-ID: 

On 3/27/2012 10:13 AM, Giampaolo Rodolà wrote:
> I would like to attract some attention to:
> http://bugs.python.org/issue14375
> Would this be acceptable?

Attention given. (Issue is about socketserver.)  I think it should be
at least partly rejected, as changing method APIs needs a stronger
reason than philosophical aesthetics.  Perhaps someone else can take a
look.

--
Terry Jan Reedy

From sven at marnach.net  Tue Mar 27 18:32:00 2012
From: sven at marnach.net (Sven Marnach)
Date: Tue, 27 Mar 2012 17:32:00 +0100
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To: 
References: <20120322001725.GA10391@pantoffel-wg.de>
Message-ID: <20120327163200.GA21082@bagheera>

Nick Coghlan schrieb am Tue, 27. Mar 2012, um 23:00:49 +1000:
> There are other potentially beneficial use cases for the new
> qualname attribute as well (e.g. the suggestion of using it in
> tracebacks instead of __name__ is a good idea)

I think that including the qualified name in tracebacks would indeed be
a worthwhile improvement, and it would be nice if this could be
included in 3.3.  It would put __qualname__ to an excellent use.

There are some open questions to adding a 'co_qualname' attribute.

1. What should be done for code objects that don't correspond to
   something having a __qualname__, like list comprehensions?  There
   are two options: Using "function_name.<listcomp>", similar to lambda
   functions, or simply using 'None' or the same string as 'co_name' to
   avoid the overhead of computing a qualified name for every code
   object.

2. What about module names and PEP 395 qualified module names?  One
   option would of course be to add both of them to the code object as
   well.  Steven D'Aprano suggested giving full access to the function
   objects instead.  The function object cannot be referenced in the
   code object, though, because this would create reference cycles in
   CPython.  It *can* be referenced in the frame object (and this way,
   the change would only affect implementations having stack frames in
   the first place).  This would only partially solve the use case of
   including qualified names in the traceback, since it only covers
   functions, not modules and classes.  (For classes, we can't do
   something like this anyway, since the class object does not yet
   exist while the class body code executes.)

What would have to be done to push this proposal?

Cheers,
   Sven

From anacrolix at gmail.com  Tue Mar 27 20:24:41 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Wed, 28 Mar 2012 02:24:41 +0800
Subject: [Python-ideas] Add socketserver running property
In-Reply-To: 
References: 
Message-ID: 

I'm not sure why socketserver isn't left to rot, or used as an example
of how not to do things.  Most Python APIs are idempotent with regard
to methods that would otherwise set an object's state to whatever it
already is.  I would expose the running property but not make
successive starts throw an exception.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jimjjewett at gmail.com  Tue Mar 27 22:06:51 2012
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 27 Mar 2012 16:06:51 -0400
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To: <20120327163200.GA21082@bagheera>
References: <20120322001725.GA10391@pantoffel-wg.de>
	<20120327163200.GA21082@bagheera>
Message-ID: 

On Tue, Mar 27, 2012 at 12:32 PM, Sven Marnach wrote:
> ...  Steven D'Aprano suggested giving full access to the function
>  objects instead.  The function object cannot be referenced in the
>  code object, though, because this would create reference cycles in
>  CPython.

Not if you use weakrefs.

And frankly, even without weakrefs, so what?  Functions are typically
immortal; waiting until gc to collect the oddballs isn't such a
terrible cost.

>  It *can* be referenced in the frame object (and this way,
>  the change would only affect implementations having stack frames in
>  the first place).

If the name needs to be available, that affects all implementations,
perhaps even to the point of requiring them to create pseudo-frames, or
causing them to add the equivalent of an always-active trace function.
So yes, there is a cost.

>  This would only partially solve the use case of
>  including qualified names in the traceback, since it only covers
>  functions, not modules and classes.  (For classes, we can't do
>  something like this anyway, since the class object does not yet
>  exist while the class body code executes.)

Both the name and the namespace do; these may serve as proxies for most
uses.  Or repurpose the super() magic.

-jJ

From guido at python.org  Tue Mar 27 22:25:15 2012
From: guido at python.org (Guido van Rossum)
Date: Tue, 27 Mar 2012 13:25:15 -0700
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To: <4F72211F.3020502@TZoNE.ORG>
References: 
	<276F7BB8-E4B3-46DF-A3F5-62953C7B4227@mac.com>
	<6D42492D-D14A-400A-A85A-BCE4781F1BB8@mac.com>
	<4F72211F.3020502@TZoNE.ORG>
Message-ID: 

On Tue, Mar 27, 2012 at 1:20 PM, Phil Vandry wrote:
> On 2012-03-26 10:57 , Guido van Rossum wrote:
>> On Mon, Mar 26, 2012 at 1:45 AM, Ronald Oussoren wrote:
>>>
>>> Yes. On what platform are you? On unixy platforms filename extensions are
>>> just a naming convention that can just as easily be used with directories.
>>
>> IIUC that's how almost all filesystems treat them. However desktop
>> software often assigns specific meanings to them -- the user can
>> configure these, but there's a large set of predefined bindings too,
>> and many key applications also play this game (since there is,
>> frankly, not much else to go by -- some important file types are not
>
> On the Mac, at least, there is much more to go by: a 4-character file type
> and a 4-character creator type associated with every file. The IANA
> registration form for MIME types even lets registrants specify the mapping
> between these 8 characters and a MIME type.
>
> However, these do seem to have fallen into disuse in recent versions of
> MacOS. I never knew if this was an intentional downgrade or just a lack of
> upkeep.

Oh, I remember those from hacking MacOS decades ago... I suspect they
have fallen by the wayside because only Mac-specific tools keep track
of these as files are copied, moved, backed up, archived, restored,
uploaded and downloaded, etc.
--
--Guido van Rossum (python.org/~guido)

From barry at python.org  Tue Mar 27 22:29:08 2012
From: barry at python.org (Barry Warsaw)
Date: Tue, 27 Mar 2012 16:29:08 -0400
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
References: <20120322001725.GA10391@pantoffel-wg.de>
Message-ID: <20120327162908.1d6d3a4d@resist.wooz.org>

On Mar 22, 2012, at 12:17 AM, Sven Marnach wrote:
>(And there is another way -- at least in CPython you can use
>'sys._getframe().f_back.f_code.co_name'.)

Note that back in the day, the motivating factor for sys._getframe()
was an observation I made about internationalizing Python programs.
This boils down to a DRY argument.  For example, you could do this:

def here_kitty(person, pet):
    print(_('$person has a $pet').safe_substitute(person=person, pet=pet))

So much error-prone DRY.  With sys._getframe(), a package like
flufl.i18n can allow you to write this like so:

def here_kitty(person, pet):
    print(_('$person has a $pet'))

because _() can walk up to the caller's frame and pull out the required
variables.  Now imagine writing hundreds of translatable strings in
your application, which would you prefer?

It would be impossible to port flufl.i18n to implementations that don't
provide an equivalent of sys._getframe(), so it would be nice if there
were perhaps a standard way of spelling this.  OTOH, it was recognized
at the time that this was pretty specialized, and thus it was hidden in
an underscore function.

Cheers,
-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 

From barry at python.org  Tue Mar 27 22:33:20 2012
From: barry at python.org (Barry Warsaw)
Date: Tue, 27 Mar 2012 16:33:20 -0400
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
References: <20120322001725.GA10391@pantoffel-wg.de>
	<4F6AEFD8.5030106@hotpy.org>
	<20120322173712.GA23302@mcnabbs.org>
Message-ID: <20120327163320.3820c7c2@resist.wooz.org>

On Mar 22, 2012, at 11:37 AM, Andrew McNabb wrote:
>"One unobvious usecase where this is used is the logging module. Don't
>use logging module if you want things to be fast."

Similarly, the i18n use case that motivated sys._getframe() in the
first place shouldn't be considered performance critical.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 

From sven at marnach.net  Wed Mar 28 01:29:07 2012
From: sven at marnach.net (Sven Marnach)
Date: Wed, 28 Mar 2012 00:29:07 +0100
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To: 
References: <20120322001725.GA10391@pantoffel-wg.de>
	<20120327163200.GA21082@bagheera>
Message-ID: <20120327232907.GB21082@bagheera>

Jim Jewett schrieb am Tue, 27. Mar 2012, um 16:06:51 -0400:
> On Tue, Mar 27, 2012 at 12:32 PM, Sven Marnach wrote:
> > ...  Steven D'Aprano suggested giving full access to the function
> >  objects instead.  The function object cannot be referenced in the
> >  code object, though, because this would create reference cycles in
> >  CPython.
>
> Not if you use weakrefs.
>
> And frankly, even without weakrefs, so what?  Functions are typically
> immortal; waiting until gc to collect the oddballs isn't such a terrible cost.

I didn't carefully think about this one.
It's impossible to reference the function object in the code object for
a different reason: there can be many function objects corresponding to
the same code object.  So the discussion whether the reference cycle is
tolerable is moot -- it's impossible anyway.

If we want access to the function objects from the stack trace, the
only way to go is to add a reference to frame objects.

Cheers,
   Sven

From ericsnowcurrently at gmail.com  Wed Mar 28 03:24:37 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 27 Mar 2012 19:24:37 -0600
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To: <20120327163200.GA21082@bagheera>
References: <20120322001725.GA10391@pantoffel-wg.de>
	<20120327163200.GA21082@bagheera>
Message-ID: 

On Tue, Mar 27, 2012 at 10:32 AM, Sven Marnach wrote:
> Nick Coghlan schrieb am Tue, 27. Mar 2012, um 23:00:49 +1000:
>> There are other potentially beneficial use cases for the new
>> qualname attribute as well (e.g. the suggestion of using it in
>> tracebacks instead of __name__ is a good idea)
>
> I think that including the qualified name in tracebacks would indeed be
> a worthwhile improvement, and it would be nice if this could be
> included in 3.3.  It would put __qualname__ to an excellent use.

There is already an outstanding issue on co_qualname [1].

>
> There are some open questions to adding a 'co_qualname' attribute.

The same questions stand for adding a co_func (for the function whose
definition resulted in the code object) [2].  I'd rather have a co_func
than a co_qualname.  Then you can do code.co_func.__qualname__.

>
> 1. What should be done for code objects that don't correspond to
>    something having a __qualname__, like list comprehensions?  There
>    are two options: Using "function_name.<listcomp>", similar to lambda
>    functions, or simply using 'None' or the same string as 'co_name' to
>    avoid the overhead of computing a qualified name for every code
>    object.
>
> 2. What about module names and PEP 395 qualified module names?  One
>    option would of course be to add both of them to the code object as
>    well.  Steven D'Aprano suggested giving full access to the function
>    objects instead.  The function object cannot be referenced in the
>    code object, though, because this would create reference cycles in
>    CPython.  It *can* be referenced in the frame object (and this way,
>    the change would only affect implementations having stack frames in
>    the first place).  This would only partially solve the use case of
>    including qualified names in the traceback, since it only covers
>    functions, not modules and classes.  (For classes, we can't do
>    something like this anyway, since the class object does not yet
>    exist while the class body code executes.)
>
> What would have to be done to push this proposal?

I'm all for introspection, but I think it needs to be exposed
carefully.  Otherwise those APIs become attractive nuisances.  An
implicit self-referencing name (like __function__) in a function's
locals is one such example where it would be nice, but would likely get
abused (a concern that Guido had)[3].

-eric

[1] http://bugs.python.org/issue13672
[2] discussion starting here:
http://mail.python.org/pipermail/python-ideas/2011-August/011053.html
[3] http://mail.python.org/pipermail/python-ideas/2011-August/011062.html

>
> Cheers,
From ericsnowcurrently at gmail.com  Wed Mar 28 03:30:08 2012
From: ericsnowcurrently at gmail.com (Eric Snow)
Date: Tue, 27 Mar 2012 19:30:08 -0600
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To:
References: <20120322001725.GA10391@pantoffel-wg.de> <4F71C1BF.8090209@pearwood.info>
Message-ID:

On Tue, Mar 27, 2012 at 9:14 AM, Robert Kern wrote:
> On 3/27/12 2:33 PM, Steven D'Aprano wrote:
>> * Not all functions have meaningful names (think lambdas); or their name
>> may be ambiguous (think functions produced by factory functions).
>
> One problem with this is that some of the intermediate frames (at the very
> least, the root frame) won't be functions but rather executed code in the
> form of module-level code or exec'ed strings. Furthermore, in CPython at
> least, it is difficult to go from the code object (which is what you have
> available) to its owning function object (if there is one). You can get the
> (well, a) name from the code object, though.

Here's a solution for the "this frame is the result of calling which function?" question:

    http://bugs.python.org/issue12857

I think it strikes the right balance between providing something new/useful and not making it too accessible.

-eric

> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless enigma
> that is made terrible by our own mad attempt to interpret it as though it had
> an underlying truth."
>   -- Umberto Eco

From ronaldoussoren at mac.com  Wed Mar 28 08:20:53 2012
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Wed, 28 Mar 2012 08:20:53 +0200
Subject: [Python-ideas] My objections to implicit package directories
In-Reply-To:
References: <276F7BB8-E4B3-46DF-A3F5-62953C7B4227@mac.com>
	<6D42492D-D14A-400A-A85A-BCE4781F1BB8@mac.com> <4F72211F.3020502@TZoNE.ORG>
Message-ID: <87F86751-298B-4BAF-A611-335A9C93F761@mac.com>

On 27 Mar, 2012, at 22:25, Guido van Rossum wrote:

> On Tue, Mar 27, 2012 at 1:20 PM, Phil Vandry wrote:
>> On 2012-03-26 10:57 , Guido van Rossum wrote:
>>> On Mon, Mar 26, 2012 at 1:45 AM, Ronald Oussoren wrote:
>>>> Yes. On what platform are you? On unixy platforms filename extensions are
>>>> just a naming convention that can just as easily be used with directories.
>>>
>>> IIUC that's how almost all filesystems treat them. However desktop
>>> software often assigns specific meanings to them -- the user can
>>> configure these, but there's a large set of predefined bindings too,
>>> and many key applications also play this game (since there is,
>>> frankly, not much else to go by -- some important file types are not
>>
>> On the Mac, at least, there is much more to go by: a 4-character file type
>> and a 4-character creator type associated with every file. The IANA
>> registration form for MIME types even lets registrants specify the mapping
>> between these 8 characters and a MIME type.
>>
>> However, these do seem to have fallen into disuse in recent versions of
>> MacOS. I never knew if this was an intentional downgrade or just a lack of
>> upkeep.
>
> Oh, I remember those from hacking MacOS decades ago...
> I suspect they have fallen by the wayside because only Mac-specific tools
> keep track of these as files are copied, moved, backed up, archived,
> restored, uploaded and downloaded, etc.

That, and probably also because the foundations of the OS X architecture are based on NeXTSTEP, which also didn't use the creator and file type.

The reason I didn't claim extensions are always a convention (and which I should have mentioned before) is that the FAT filesystem has explicit support for extensions (the old 8+3 filename length restriction). That's totally off-topic though.

To get back on topic: I don't particularly dislike directory extensions and would prefer the ".pyp" extension for Python packages without an __init__.py file, because that is more explicit and makes it clear which directories are intended to be a Python package. With the other proposal it is not clear which directories are Python packages and which are directories that just happen to be on sys.path (for example because they are in the same directory as the Python script you're running).

Ronald

From sven at marnach.net  Wed Mar 28 13:06:34 2012
From: sven at marnach.net (Sven Marnach)
Date: Wed, 28 Mar 2012 12:06:34 +0100
Subject: [Python-ideas] Standard way to get caller name and easy call stack access
In-Reply-To:
References: <20120322001725.GA10391@pantoffel-wg.de> <20120327163200.GA21082@bagheera>
Message-ID: <20120328110634.GC21082@bagheera>

Eric Snow wrote on Tue, 27 Mar 2012, at 19:24:37 -0600:
> The same questions stand for adding a co_func (for the function whose
> definition resulted in the code object) [2]. I'd rather have a
> co_func than a co_qualname. Then you can do
> code.co_func.__qualname__.

How is this supposed to work? The code object is created at compilation time, while the function object is only created when the function definition is executed. For functions nested inside another function, a new function object is created every time the outer function runs, reusing the same code object that was created at compilation time.

Cheers,
    Sven

From masklinn at masklinn.net  Wed Mar 28 13:37:56 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 28 Mar 2012 13:37:56 +0200
Subject: [Python-ideas] A more useful command-line wsgiref.simple_server?
Message-ID: <99C21799-4D96-4FA9-800C-D0C18D1C2257@masklinn.net>

Currently, calling wsgiref.simple_server simply mounts the (bundled) demo app.

I think that's a bit of a lost opportunity: as the community has mostly standardized on a *.wsgi/wsgi.py script providing an `application` name in its global namespace, it would be nice if wsgiref.simple_server could take such a file as a parameter and mount the application provided:

* This would allow testing that the script has no error without having
  to go through mounting it in e.g. mod_wsgi
* It would make trivial/test applications (e.g.
dynamic responders to
  local JS) simpler to bootstrap, as there would be no need for the
  half-dozen lines of wsgiref.simple_server bootstrapping and a "hard"
  dependency on wsgiref:

    from wsgiref.simple_server import make_server

    def application(environ, start_response):
        'code'

    if __name__ == '__main__':
        httpd = make_server('', 8000, application)
        httpd.serve_forever()

could become:

    def application(environ, start_response):
        'code'

Since wsgiref already supports `python -mwsgiref.simple_server`, the change would be pretty simple:

* the first positional argument is the wsgi script;
  if it is present, it is `exec`'d, the `application` key is
  extracted from the locals and is mounted through make_server;
  if it is absent, then demo_app is mounted as before
* the second positional argument is the host, defaulting to ''
* the third positional argument is the port, defaulting to 8000

This way the current sanity test/"PHPInfo" demo app works as it did before, but it becomes possible to very easily serve a WSGI script with almost no overhead in the script itself.

Thoughts?
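One possible shape of the proposed entry point, as a sketch of the semantics described above; the function name and the argument handling are only illustrative, not a committed design:

    import sys
    from wsgiref.simple_server import make_server, demo_app

    def run(argv):
        app = demo_app
        if len(argv) > 1:                     # first positional arg: wsgi script
            namespace = {}
            with open(argv[1]) as script:
                exec(script.read(), namespace)
            app = namespace['application']    # mount the script's application
        host = argv[2] if len(argv) > 2 else ''
        port = int(argv[3]) if len(argv) > 3 else 8000
        make_server(host, port, app).serve_forever()

    if __name__ == '__main__':
        run(sys.argv)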
From techtonik at gmail.com  Wed Mar 28 15:03:49 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 28 Mar 2012 16:03:49 +0300
Subject: [Python-ideas] A more useful command-line wsgiref.simple_server?
In-Reply-To: <99C21799-4D96-4FA9-800C-D0C18D1C2257@masklinn.net>
References: <99C21799-4D96-4FA9-800C-D0C18D1C2257@masklinn.net>
Message-ID:

On Wed, Mar 28, 2012 at 2:37 PM, Masklinn wrote:
> Currently, calling wsgiref.simple_server simply mounts the (bundled)
> demo app.
>
> I think that's a bit of a lost opportunity

I remember the pain of using HTTPServer, CGIHTTPServer and friends just to test a few HTML pages through a local browser, so I hear you. Unfortunately, the bug tracker is down for maintenance, so I can't search for similar reports for these libraries.

> it would be nice if wsgiref.simple_server could
> take such a file as a parameter and mount the application provided:

+2

> * This would allow testing that the script has no error without having
>   to go through mounting it in e.g. mod_wsgi
> * It would make trivial/test applications (e.g. dynamic responders to
>   local JS) simpler to bootstrap, as there would be no need for the
>   half-dozen lines of wsgiref.simple_server bootstrapping and a "hard"
>   dependency on wsgiref

All points valid. That was also my use case for the CGI/HTTP servers.

> Since wsgiref already supports `python -mwsgiref.simple_server`, the
> change would be pretty simple:

Is it possible to choose a more intuitive name if it is for Python 3.3 only anyway?

> * the first positional argument is the wsgi script;
>   if it is present, it is `exec`'d, the `application` key is
>   extracted from the locals and is mounted through make_server;
>   if it is absent, then demo_app is mounted as before
> * the second positional argument is the host, defaulting to ''
> * the third positional argument is the port, defaulting to 8000

To summarize:

    python -m wsgiref.simple_server [[wsgi_script.py] [[host] [port]]]

A better way:

    python -m wsgiref.simple_server [-h host] [-p port]

But for a good API it would be nice to see an overview of the command-line parameters of other WSGI servers, for some kind of convention.

> This way the current sanity test/"PHPInfo" demo app works as it did before,
> but it becomes possible to very easily serve a WSGI script with almost no
> overhead in the script itself.
>
> Thoughts?

1. Even more awesome if any WSGI application could be tested (bootstrapped) this way.
2. Let the test/"PHPInfo" app serve forever and add a few more tabs (pyrasite for the current Python?)
--
anatoly t.

From masklinn at masklinn.net  Wed Mar 28 15:30:21 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 28 Mar 2012 15:30:21 +0200
Subject: [Python-ideas] A more useful command-line wsgiref.simple_server?
In-Reply-To:
References: <99C21799-4D96-4FA9-800C-D0C18D1C2257@masklinn.net>
Message-ID:

On 2012-03-28, at 15:03 , anatoly techtonik wrote:
>> Since wsgiref already supports `python -mwsgiref.simple_server`, the
>> change would be pretty simple:
>
> Is it possible to choose a more intuitive name if it is for Python 3.3
> only anyway?

The wsgiref module has not changed on that point in Python 3 (wsgiref.simple_server is still wsgiref.simple_server), so the added behavior would go in the same place.

> To summarize:
> python -m wsgiref.simple_server [[wsgi_script.py] [[host] [port]]]

Actually,

    python -m wsgiref.simple_server [wsgi_script [host [port]]]

> A better way:
> python -m wsgiref.simple_server [-h host] [-p port]

I don't think that is a good idea:

1. WSGI scripts don't have to have any particular extension, or can have no extension at all (as far as I can tell, many scripts follow mod_wsgi by using a .wsgi extension)

2. Not sure why the application name would be editable; this adds needless complexity, and it does not seem to be supported by the most popular deployment method (mod_wsgi)

3. Finally, this changes the current behavior of mounting the demo app by default, which I'd rather not do unless stdlib maintainers assert it should be done.

So just [wsgi_script].

> [-h host] [-p port]

I'd rather not burn help's -h for host specification. Are there really so many situations where you'd want to specify a port and leave the default host?

>> This way the current sanity test/"PHPInfo" demo app works as it did before,
>> but it becomes possible to very easily serve a WSGI script with almost no
>> overhead in the script itself.
>>
>> Thoughts?
>
> 1. Even more awesome if any WSGI application could be tested
> (bootstrapped) this way.

Well, as far as I can tell, most WSGI applications which can work single-threaded would work with just that.

From tjreedy at udel.edu  Wed Mar 28 21:13:56 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 28 Mar 2012 15:13:56 -0400
Subject: [Python-ideas] A more useful command-line wsgiref.simple_server?
In-Reply-To: <99C21799-4D96-4FA9-800C-D0C18D1C2257@masklinn.net>
References: <99C21799-4D96-4FA9-800C-D0C18D1C2257@masklinn.net>
Message-ID:

On 3/28/2012 7:37 AM, Masklinn wrote:
> Currently, calling wsgiref.simple_server simply mounts the (bundled)
> demo app.
>
> I think that's a bit of a lost opportunity: as the community has mostly
> standardized on a *.wsgi/wsgi.py script providing an `application` name
> in its global namespace, it would be nice if wsgiref.simple_server could
> take such a file as a parameter and mount the application provided:

In my ignorance, this seems like a sensible request. If I understand correctly, you are not proposing to break any existing uses but to make this more useful without being overly specialized.
If you file an issue, Phillip Eby would probably be asked to review. The web-sig might be the best place to thrash out details and demonstrate 'community agreement'. A patch file, preferably with a test, along with a good text explanation, usually speeds disposition of an issue.

--
Terry Jan Reedy

From masklinn at masklinn.net  Wed Mar 28 22:23:46 2012
From: masklinn at masklinn.net (Masklinn)
Date: Wed, 28 Mar 2012 22:23:46 +0200
Subject: [Python-ideas] A more useful command-line wsgiref.simple_server?
In-Reply-To:
References: <99C21799-4D96-4FA9-800C-D0C18D1C2257@masklinn.net>
Message-ID: <4AD08BF6-0BB6-427F-B008-C4310299EB57@masklinn.net>

On 2012-03-28, at 21:13 , Terry Reedy wrote:
> On 3/28/2012 7:37 AM, Masklinn wrote:
>> Currently, calling wsgiref.simple_server simply mounts the (bundled)
>> demo app.
>>
>> I think that's a bit of a lost opportunity: as the community has mostly
>> standardized on a *.wsgi/wsgi.py script providing an `application` name
>> in its global namespace, it would be nice if wsgiref.simple_server could
>> take such a file as a parameter and mount the application provided:
>
> In my ignorance, this seems like a sensible request. If I understand
> correctly, you are not proposing to break any existing uses but to make
> this more useful without being overly specialized.

That would be the goal, yes.

> If you file an issue, Phillip Eby would probably be asked to review. The
> web-sig might be the best place to thrash out details and demonstrate
> 'community agreement'. A patch file, preferably with a test, along with a
> good text explanation, usually speeds disposition of an issue.

So you recommend I bring this over to web-sig, ideally alongside a patch file? Will do, thank you for the suggestion.

From techtonik at gmail.com  Thu Mar 29 00:38:15 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Thu, 29 Mar 2012 01:38:15 +0300
Subject: [Python-ideas] A more useful command-line wsgiref.simple_server?
In-Reply-To:
References: <99C21799-4D96-4FA9-800C-D0C18D1C2257@masklinn.net>
Message-ID:

On Wed, Mar 28, 2012 at 4:30 PM, Masklinn wrote:
> On 2012-03-28, at 15:03 , anatoly techtonik wrote:
>>> Since wsgiref already supports `python -mwsgiref.simple_server`, the
>>> change would be pretty simple:
>>
>> Is it possible to choose a more intuitive name if it is for Python 3.3
>> only anyway?
>
> The wsgiref module has not changed on that point in Python 3
> (wsgiref.simple_server is still wsgiref.simple_server), so the added
> behavior would go in the same place.

I can't see why it cannot be something like:

    python -m wsgiref ...

Making wsgiref provide a server is an intuitive guess. The non-intuitive part is that running `wsgiref` doesn't give a hint where to find this server.

>     python -m wsgiref.simple_server [wsgi_script [host [port]]]
>
>> A better way:
>> python -m wsgiref.simple_server [-h host] [-p port]
>
> I don't think that is a good idea:
>
> 1. WSGI scripts don't have to have any particular extension, or can have
> no extension at all (as far as I can tell, many scripts follow mod_wsgi
> by using a .wsgi extension)

All right. Let's drop [.py].

> 2. Not sure why the application name would be editable; this adds
> needless complexity, and it does not seem to be supported by the most
> popular deployment method (mod_wsgi)

I don't use `mod_wsgi` deployment, and can't confirm your popularity rating.
I use AppEngine, and it uses `app`, for example:
https://developers.google.com/appengine/docs/python/python27/using27#Configuring_WSGI_Script_Handlers
gunicorn uses `app` also: http://pypi.python.org/pypi/gunicorn
`wsgiref` doesn't place any restriction on the callable name, so why hardcode it to `application`? Let it do the discovery itself, or make the callable name explicit (you know -- it's better).

> 3. Finally, this changes the current behavior of mounting the demo app by
> default, which I'd rather not do unless stdlib maintainers assert it
> should be done.
>
> So just [wsgi_script].

Well, makes sense, but how do you know that server_example has parameters? It may be better to leave a --demo argument if somebody needs a demo.

>> [-h host] [-p port]
>
> I'd rather not burn help's -h for host specification. Are there really so
> many situations where you'd want to specify a port and leave the default
> host?

I forgot that -h is for --help. `-H host[:port]` may be a better notation, indeed, for consistency with fabric. And there really are many situations where I have AppEngine already running on port 8000. Ideally the port should be auto-chosen from the first that is free above 8000 if not specified explicitly.

>> 1. Even more awesome if any WSGI application could be tested
>> (bootstrapped) this way.
>
> Well, as far as I can tell, most WSGI applications which can work
> single-threaded would work with just that.

You still have to hardcode the callable name to `application`. It would still be nice to have some autodetection logic.
--
anatoly t.

From timothy.c.delaney at gmail.com  Thu Mar 29 01:58:19 2012
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Thu, 29 Mar 2012 10:58:19 +1100
Subject: [Python-ideas] make __closure__ writable
In-Reply-To:
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com>
Message-ID:

On 26 March 2012 11:32, Victor Stinner wrote:
> 2012/3/16 Yury Selivanov:
> > Can we make the __closure__ attribute writeable? Since __code__ already
> > is, ...
>
> I never understood why __code__ is writable. What is the use case for
> modifying the code of an existing function?

There are many things you can do with bytecode manipulation (whether you should is another question). Among other things, I've used it for optimisation (e.g. my optimised self.super recipe that probably isn't actually available online anymore). Instrumentation of code is another thing, although these days you're probably better off using a decorator.

There aren't a lot of real use cases, but if nothing else it can be a lot of fun :)

Tim Delaney
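A toy demonstration of what a writable __code__ allows -- swapping behaviour under every existing reference to a function (an experiment, not a recommendation):

    def slow():
        return "original"

    def fast():
        return "patched"

    handler = slow                  # some other code already holds this reference
    slow.__code__ = fast.__code__   # swap the behaviour in place
    print(handler())                # -> patched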
From mark at hotpy.org  Thu Mar 29 10:05:37 2012
From: mark at hotpy.org (Mark Shannon)
Date: Thu, 29 Mar 2012 09:05:37 +0100
Subject: [Python-ideas] make __closure__ writable
In-Reply-To:
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com>
Message-ID: <4F7417D1.7030800@hotpy.org>

Tim Delaney wrote:
> On 26 March 2012 11:32, Victor Stinner wrote:
>
>     2012/3/16 Yury Selivanov:
>     > Can we make the __closure__ attribute writeable? Since __code__
>     > already is, ...
>
>     I never understood why __code__ is writable. What is the use case for
>     modifying the code of an existing function?
>
> There are many things you can do with bytecode manipulation (whether you
> should is another question). Among other things, I've used it for
> optimisation (e.g. my optimised self.super recipe that probably isn't
> actually available online anymore). Instrumentation of code is another
> thing, although these days you're probably better off using a decorator.
>
> There aren't a lot of real use cases, but if nothing else it can be a
> lot of fun :)

You can do all of those things without changing the __code__ attribute. Just create a new function instead.

Cheers,
Mark

From timothy.c.delaney at gmail.com  Thu Mar 29 10:27:54 2012
From: timothy.c.delaney at gmail.com (Tim Delaney)
Date: Thu, 29 Mar 2012 19:27:54 +1100
Subject: [Python-ideas] make __closure__ writable
In-Reply-To: <4F7417D1.7030800@hotpy.org>
References: <724E3224-50E5-43EA-8F72-878D4B87D3F5@gmail.com> <4F7417D1.7030800@hotpy.org>
Message-ID:

On 29 March 2012 19:05, Mark Shannon wrote:
> Tim Delaney wrote:
>> There are many things you can do with bytecode manipulation (whether you
>> should is another question). Among other things, I've used it for
>> optimisation (e.g. my optimised self.super recipe that probably isn't
>> actually available online anymore). Instrumentation of code is another
>> thing, although these days you're probably better off using a decorator.
>>
>> There aren't a lot of real use cases, but if nothing else it can be a lot
>> of fun :)
>
> You can do all of those things without changing the __code__ attribute.
> Just create a new function instead.

Not if you want anything that holds an existing reference to the function to get the new behaviour. Sometimes you need to change things in-place.

Tim Delaney

From techtonik at gmail.com  Thu Mar 29 20:19:51 2012
From: techtonik at gmail.com (anatoly techtonik)
Date: Thu, 29 Mar 2012 21:19:51 +0300
Subject: [Python-ideas] PEP x: Static module/package inspection
In-Reply-To:
References: <29228470.233.1324982829840.JavaMail.geo-discussion-forums@yqbl25>
	<20544069.58.1325067324546.JavaMail.geo-discussion-forums@yqiz15>
Message-ID:

On Thu, Feb 2, 2012 at 4:35 PM, Nick Coghlan wrote:
> On Thu, Dec 29, 2011 at 1:28 AM, Michael Foord wrote:
>> On a simple level, all of this is already "obtainable" by using the ast
>> module that can parse Python code. I would love to see a "python-object"
>> layer on top of this that will take an ast for a module (or other object)
>> and return something that represents the same object as the ast.
>>
>> So all module-level objects will have corresponding objects - where they
>> are Python objects (builtin literals) then they will be represented
>> exactly. For classes and functions you'll get an object back that has the
>> same attributes plus some metadata (e.g. for functions / methods what
>> arguments they take etc).
>>
>> That is certainly doable and would make introspecting-without-executing a
>> lot simpler.
>
> The existing 'pyclbr' (class browser) module in the stdlib also attempts
> to play in this same space. I wouldn't say it does it particularly
> *well* (since it's easy to confuse with valid Python constructs), but
> it tries.

Unfortunately http://docs.python.org/library/pyclbr.html misses info about variables. In the meantime I've patched my `astdump` module even further:
- the function to query top-level variables changed from get_top_vars() to top_level_vars(), which now accepts a filename as a parameter.

Now it will be even more convenient to use it for generating `setup.py` for simple modules. A sample `setup.py` generator is included.
http://pypi.python.org/pypi/astdump/1.0
--
anatoly t.
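The import-free inspection being discussed can be sketched in a few lines with the stdlib ast module -- an illustration of the approach, not astdump's actual code:

    import ast

    def top_level_assignments(filename):
        # Collect simple top-level "NAME = literal" bindings without
        # importing (and thus executing) the module.
        with open(filename) as f:
            tree = ast.parse(f.read())
        found = {}
        for node in tree.body:
            if isinstance(node, ast.Assign):
                for target in node.targets:
                    if isinstance(target, ast.Name):
                        try:
                            found[target.id] = ast.literal_eval(node.value)
                        except ValueError:
                            pass  # not a plain literal; ignore it
        return found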
From andrew.svetlov at gmail.com  Thu Mar 29 21:48:28 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Thu, 29 Mar 2012 22:48:28 +0300
Subject: [Python-ideas] Thread stopping
Message-ID:

I propose to add a Thread.interrupt() function.

th.interrupt() will set a flag in the ThreadState structure.

When the interpreter switches to the next thread it will check that flag. If the flag is on, then ThreadInterruptionError will be raised in the thread's context. If the thread has blocked via threading locks (Lock, RLock, Condition, Semaphore etc.) -- the exception is raised also.

Of course we cannot interrupt a thread if it has been blocked by a C extension call or is just waiting for blocking IO. But, I think, a way to force stopping of some thread can be useful and has no incompatibility effect.

The standard way to stop a thread is sending some message which is the signal to the thread for termination -- pushing None or a sentinel into the thread's message queue, for example.

Other variants:
- check the 'interrupted' state explicitly by calling
  threading.current_thread().interrupted(), then do what you want.
- do the same as boost.threading does: check the state at direct
  interruption points and locks if interruption is enabled.

BTW, we can disable the interruption mechanic by default and use it only if switched on by threading.enable_interruption().

What do you think?

--
Thanks,
Andrew Svetlov

From yselivanov.ml at gmail.com  Thu Mar 29 22:08:40 2012
From: yselivanov.ml at gmail.com (Yury Selivanov)
Date: Thu, 29 Mar 2012 16:08:40 -0400
Subject: [Python-ideas] Thread stopping
In-Reply-To:
References:
Message-ID: <31AF9B20-CA3D-4C0F-B78E-8A4FE73FA075@gmail.com>

On 2012-03-29, at 3:48 PM, Andrew Svetlov wrote:
> I propose to add a Thread.interrupt() function.
>
> th.interrupt() will set a flag in the ThreadState structure.
>
> When the interpreter switches to the next thread it will check that flag.
> If the flag is on, then ThreadInterruptionError will be raised in the
> thread's context. If the thread has blocked via threading locks (Lock,
> RLock, Condition, Semaphore etc.) -- the exception is raised also.

+1. This feature would be nice for thread pools, for instance to be able to join a pool with a timeout.

- Yury

From fuzzyman at gmail.com  Fri Mar 30 00:56:03 2012
From: fuzzyman at gmail.com (Michael Foord)
Date: Thu, 29 Mar 2012 23:56:03 +0100
Subject: [Python-ideas] Thread stopping
In-Reply-To:
References:
Message-ID:

On 29 March 2012 20:48, Andrew Svetlov wrote:
> I propose to add a Thread.interrupt() function.
>
> th.interrupt() will set a flag in the ThreadState structure.
>
> When the interpreter switches to the next thread it will check that flag.
> If the flag is on, then ThreadInterruptionError will be raised in the
> thread's context. If the thread has blocked via threading locks (Lock,
> RLock, Condition, Semaphore etc.) -- the exception is raised also.

I've worked with .NET, where you can interrupt threads, and it was very useful. There is a complication though: if a thread is interrupted inside a finally block then vital resource cleanup can be interrupted. The way .NET solves this is to never raise the interrupt exception inside a finally block. Once a finally block has completed, a pending thread interrupt exception will be raised.

The normal response to requests like this is for people to suggest that the thread itself should check if it has been requested to stop -- this is fine for fine-grained tasks but not for very coarse-grained tasks.

Michael Foord
> But, I think, the way to force stopping of some thread can be useful > and has no incompatibility effect. > > The standard way to stop thread is the sending some message which is > the signal to thread for termination. > Pushing None or sentinel into thread message queue for example. > > Other variants: > ? check 'interrupted' state explicitly by call > threading.current_thread().interrupted() than do what you want. > ? do the same as boost.threading does: check state in direct > interruption point and locks if interruption is enabled. > > BTW, we can disable interruption mechanic by default and use it only > if switched on by threading.enable_interruption() > > What do you think? > > -- > Thanks, > Andrew Svetlov > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From yselivanov.ml at gmail.com Fri Mar 30 01:10:43 2012 From: yselivanov.ml at gmail.com (Yury Selivanov) Date: Thu, 29 Mar 2012 19:10:43 -0400 Subject: [Python-ideas] Thread stopping In-Reply-To: References: Message-ID: <32BE53BA-4101-4FED-B9ED-54306AEEEE43@gmail.com> On 2012-03-29, at 6:56 PM, Michael Foord wrote: > I've worked with .NET where you can interrupt threads and it was very useful. There is a complication though, if a thread is interrupted inside a finally block then vital resource cleanup can be interrupted. The way .NET solves this is to never raise the interrupt exception inside a finally block. Once a finally block is completed a pending thread interrupt exception will be raised. That's the same problem we had in our coroutine framework (based on generators + greenlets). We've decided to modify the __code__ object of generators and add a simple counter of finally statements, and when it reaches 0 - call a special callback that checks if the interruption exception should be raised. Much cleaner way would be to embed this functionality in the python interpreter itself, but I'm not sure about the performance impact. - Yury From paul at colomiets.name Fri Mar 30 01:20:44 2012 From: paul at colomiets.name (Paul Colomiets) Date: Fri, 30 Mar 2012 02:20:44 +0300 Subject: [Python-ideas] Thread stopping In-Reply-To: <32BE53BA-4101-4FED-B9ED-54306AEEEE43@gmail.com> References: <32BE53BA-4101-4FED-B9ED-54306AEEEE43@gmail.com> Message-ID: Hi Yury, On Fri, Mar 30, 2012 at 2:10 AM, Yury Selivanov wrote: > Much cleaner way would be to embed this functionality in the python > interpreter itself, but I'm not sure about the performance impact. > Do you have exact semantics in mind? Can you start a concrete proposal? We have same problem, and don't want to patch code object :) But I'm not sure I have clear idea of the implementation. Also I want something that works for greenlets too. 
From grosser.meister.morti at gmx.net  Fri Mar 30 01:57:28 2012
From: grosser.meister.morti at gmx.net (Mathias Panzenböck)
Date: Fri, 30 Mar 2012 01:57:28 +0200
Subject: [Python-ideas] Implied try blocks
In-Reply-To:
References:
Message-ID: <4F74F6E8.9060103@gmx.net>

On 03/26/2012 12:59 PM, anatoly techtonik wrote:
> On Sun, Mar 25, 2012 at 5:15 PM, Calvin Spealman wrote:
>>
>>     r = some_single_statement()
>>     except TypeError:
>>         print "oh no!"
>>         raise OhNoException()
>>     else:
>>         p = prepare(r)
>>         print "I got", p
>
> -1, because when I'm reading this and encounter "except TypeError", I
> have to update the code that I've already read (the stuff in my head)
> to place it into an exception-handling block. That's a good anti-pattern
> for readability.

Slightly off-topic: and that's why I stopped writing "f() if g()" in Ruby and use the more verbose "if g(); f(); end" there. At first it looks like a nice, small syntax, but then it turns out to severely hurt readability. I even found code like this once:

    begin
        # quite some lines of code
    rescue
        # more code
    end unless something.nil?

Yes, the whole block is actually conditional. I think syntax that requires you to update already-parsed code is bad.

From josh at bartletts.id.au  Fri Mar 30 02:00:27 2012
From: josh at bartletts.id.au (Joshua Bartlett)
Date: Fri, 30 Mar 2012 10:00:27 +1000
Subject: [Python-ideas] Yielding through context managers
Message-ID:

I'd like to propose adding the ability for context managers to catch and handle control passing into and out of them via yield and generator.send() / generator.next(). For instance:

    class cd(object):
        def __init__(self, path):
            self.inner_path = path

        def __enter__(self):
            self.outer_path = os.getcwd()
            os.chdir(self.inner_path)

        def __exit__(self, exc_type, exc_val, exc_tb):
            os.chdir(self.outer_path)

        def __yield__(self):
            self.inner_path = os.getcwd()
            os.chdir(self.outer_path)

        def __send__(self):
            self.outer_path = os.getcwd()
            os.chdir(self.inner_path)

Here __yield__() would be called when control is yielded through the with block and __send__() would be called when control is returned via .send() or .next(). To maintain compatibility, it would not be an error to leave either __yield__ or __send__ undefined.

The rationale for this is that it's sometimes useful for a context manager to set global or thread-global state as in the example above, but when the code is used in a generator, the author of the generator needs to make assumptions about what the calling code is doing. E.g.

    def my_generator(path):
        with cd(path):
            yield do_something()
            do_something_else()

Even if the author of this generator knows what effect do_something() and do_something_else() have on the current working directory, the author needs to assume that the caller of the generator isn't touching the working directory. For instance, if someone were to create two my_generator() generators with different paths and advance them alternately, the resulting behaviour could be most unexpected. With the proposed change, the context manager would be able to handle this so that the author of the generator doesn't need to make these assumptions.

Naturally, nested with blocks would be handled by calling __yield__ from innermost to outermost and __send__ from outermost to innermost.

I rather suspect that if this change were included, someone could come up with a variant of the contextlib.contextmanager decorator to simplify writing generators for this sort of situation.

Cheers,

J. D. Bartlett
From tjreedy at udel.edu  Fri Mar 30 03:05:42 2012
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 29 Mar 2012 21:05:42 -0400
Subject: [Python-ideas] Yielding through context managers
In-Reply-To:
References:
Message-ID:

On 3/29/2012 8:00 PM, Joshua Bartlett wrote:
> I'd like to propose adding the ability for context managers to catch and
> handle control passing into and out of them via yield and
> generator.send() / generator.next().
>
> For instance,
>
>     class cd(object):
>         def __init__(self, path):
>             self.inner_path = path
>
>         def __enter__(self):
>             self.outer_path = os.getcwd()
>             os.chdir(self.inner_path)
>
>         def __exit__(self, exc_type, exc_val, exc_tb):
>             os.chdir(self.outer_path)
>
>         def __yield__(self):
>             self.inner_path = os.getcwd()
>             os.chdir(self.outer_path)
>
>         def __send__(self):
>             self.outer_path = os.getcwd()
>             os.chdir(self.inner_path)
>
> Here __yield__() would be called when control is yielded through the
> with block and __send__() would be called when control is returned via
> .send() or .next(). To maintain compatibility, it would not be an error
> to leave either __yield__ or __send__ undefined.

This strikes me as the wrong solution to the fragility of dubious code. The context manager protocol is simple: two special methods. Ditto for the iterator protocol. The generator protocol has been complexified; not good, but there are benefits and the extra complexity can be ignored. But I would be reluctant to complexify the cm protocol. This is aside from technical difficulties.

> The rationale for this is that it's sometimes useful for a context
> manager to set global or thread-global state as in the example above,
> but when the code is used in a generator, the author of the generator
> needs to make assumptions about what the calling code is doing. e.g.
>
>     def my_generator(path):
>         with cd(path):
>             yield do_something()
>             do_something_else()

Pull the yield out of the with block:

    def my_gen(path):
        with cd(path):
            directory = yield
        do_something(directory)
        do_else(directory)

or

    def my_gen(p):
        with cd(p):
            res = do_something()
        yield res
        with cd(p):
            do_else()

Use the same 'result' trick if do_else also yields.

> Even if the author of this generator knows what effect do_something()
> and do_something_else() have on the current working directory, the
> author needs to assume that the caller of the generator isn't touching
> the working directory. For instance, if someone were to create two
> my_generator() generators with different paths and advance them
> alternately, the resulting behaviour could be most unexpected. With the
> proposed change, the context manager would be able to handle this so
> that the author of the generator doesn't need to make these assumptions.

Or make with-manipulation of global resources self-contained, as suggested above and as intended for with blocks.

--
Terry Jan Reedy

From anacrolix at gmail.com  Fri Mar 30 04:34:46 2012
From: anacrolix at gmail.com (Matt Joiner)
Date: Fri, 30 Mar 2012 10:34:46 +0800
Subject: [Python-ideas] Thread stopping
In-Reply-To:
References: <32BE53BA-4101-4FED-B9ED-54306AEEEE43@gmail.com>
Message-ID:

Wouldn't it be better to raise a SystemExit in a thread you want interrupted?
From eliben at gmail.com  Fri Mar 30 06:53:45 2012
From: eliben at gmail.com (Eli Bendersky)
Date: Fri, 30 Mar 2012 06:53:45 +0200
Subject: [Python-ideas] Thread stopping
In-Reply-To:
References:
Message-ID:

On Thu, Mar 29, 2012 at 21:48, Andrew Svetlov wrote:
> I propose to add a Thread.interrupt() function.

Could you specify some use cases where you believe this would be better than explicitly asking the thread to stop?

Eli

From storchaka at gmail.com  Fri Mar 30 12:09:12 2012
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Fri, 30 Mar 2012 13:09:12 +0300
Subject: [Python-ideas] Thread stopping
In-Reply-To:
References:
Message-ID:

On 29.03.12 22:48, Andrew Svetlov wrote:
> I propose to add a Thread.interrupt() function.
>
> th.interrupt() will set a flag in the ThreadState structure.
>
> When the interpreter switches to the next thread it will check that flag.
> If the flag is on, then ThreadInterruptionError will be raised in the
> thread's context. If the thread has blocked via threading locks (Lock,
> RLock, Condition, Semaphore etc.) -- the exception is raised also.

At first glance this is a very attractive suggestion. But what about alternative GIL-less implementations? The interpreter can execute several threads at the same time.

Java has a similar mechanism, Thread.interrupt(), but it works only if the thread has blocked via threading locks. There is a stronger Thread.stop(), but it is recognized as unsafe and is deprecated.

It would be wrong to say that this is the way to *force* stopping of some thread. ThreadInterruptionError can and should be caught in some cases.

> BTW, we can disable the interruption mechanic by default and use it only
> if switched on by threading.enable_interruption().

And we need a context manager for non-interruptible critical sections (or for interruptible non-critical sections?).

P.S. I've had a crazy idea. What if we allowed raising any exception, not only ThreadInterruptionError, in another thread?

From steve at pearwood.info  Fri Mar 30 12:14:43 2012
From: steve at pearwood.info (Steven D'Aprano)
Date: Fri, 30 Mar 2012 21:14:43 +1100
Subject: [Python-ideas] Thread stopping
In-Reply-To:
References: <32BE53BA-4101-4FED-B9ED-54306AEEEE43@gmail.com>
Message-ID: <4F758793.7000503@pearwood.info>

Matt Joiner wrote:
> Wouldn't it be better to raise a SystemExit in a thread you want
> interrupted?

I must admit I'm rather confused about this suggestion. Do you mean the thread itself should raise SystemExit? If so, why not just exit the normal way?

If you mean the main thread somehow injects a SystemExit in another thread, how would you do that? Perhaps I'm missing something obvious, but I don't know how to do that.

--
Steven

From tlesher at gmail.com  Fri Mar 30 14:02:23 2012
From: tlesher at gmail.com (Tim Lesher)
Date: Fri, 30 Mar 2012 08:02:23 -0400
Subject: [Python-ideas] Thread stopping
In-Reply-To:
References:
Message-ID:

On Fri, Mar 30, 2012 at 06:09, Serhiy Storchaka wrote:
> P.S. I've had a crazy idea. What if we allowed raising any exception, not
> only ThreadInterruptionError, in another thread?

Technically you can already do this, but only from C (i.e., from an extension or from the interpreter itself). See PyThreadState_SetAsyncExc().

We do this in our software (which embeds Python in a C program) to raise SystemExit in Python-launched threads before finalizing the interpreter.

--
Tim Lesher
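The same C-level API is reachable from pure Python via ctypes -- a well-known, CPython-only recipe, sketched here without the care a real implementation would need:

    import ctypes

    def async_raise(thread_ident, exc_type):
        # Ask CPython to raise exc_type in the thread with the given ident.
        # This cannot interrupt blocking system calls or C extension code.
        n = ctypes.pythonapi.PyThreadState_SetAsyncExc(
            ctypes.c_long(thread_ident), ctypes.py_object(exc_type))
        if n == 0:
            raise ValueError("invalid thread id")
        elif n > 1:
            # Affected more than one thread state somehow; undo the damage.
            ctypes.pythonapi.PyThreadState_SetAsyncExc(
                ctypes.c_long(thread_ident), None)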
From miki.tebeka at gmail.com  Fri Mar 30 19:31:26 2012
From: miki.tebeka at gmail.com (Miki Tebeka)
Date: Fri, 30 Mar 2012 10:31:26 -0700 (PDT)
Subject: [Python-ideas] Thread stopping
In-Reply-To:
References:
Message-ID: <17079424.99.1333128686555.JavaMail.geo-discussion-forums@ynuu20>

> I propose to add a Thread.interrupt() function.

Does the new http://docs.python.org/dev/library/signal.html#signal.pthread_kill help in any way?

From zuo at chopin.edu.pl  Fri Mar 30 21:39:38 2012
From: zuo at chopin.edu.pl (Jan Kaliszewski)
Date: Fri, 30 Mar 2012 21:39:38 +0200
Subject: [Python-ideas] Yielding through context managers
In-Reply-To:
References:
Message-ID: <20120330193938.GC1758@chopin.edu.pl>

> > class cd(object):
> >     def __init__(self, path):
> >         self.inner_path = path
> >
> >     def __enter__(self):
> >         self.outer_path = os.getcwd()
> >         os.chdir(self.inner_path)
> >
> >     def __exit__(self, exc_type, exc_val, exc_tb):
> >         os.chdir(self.outer_path)
> >
> >     def __yield__(self):
> >         self.inner_path = os.getcwd()
> >         os.chdir(self.outer_path)
> >
> >     def __send__(self):
> >         self.outer_path = os.getcwd()
> >         os.chdir(self.inner_path)

[snip]

> > def my_generator(path):
> >     with cd(path):
> >         yield do_something()
> >         do_something_else()

Interesting idea, though doing this with present Python does not seem to be very painful:

    class cd(object):

        def __init__(self, path):
            self.inner_path = path

        def __enter__(self):
            self.outer_path = os.getcwd()
            os.chdir(self.inner_path)
            return self

        def __exit__(self, exc_type, exc_val, exc_tb):
            os.chdir(self.outer_path)

    def my_generator(path):
        with cd(path) as context:
            output = do_something()
            with cd(context.outer_path):
                yield output
        ...

Cheers.
*j

From grosser.meister.morti at gmx.net  Sat Mar 31 00:39:22 2012
From: grosser.meister.morti at gmx.net (Mathias Panzenböck)
Date: Sat, 31 Mar 2012 00:39:22 +0200
Subject: [Python-ideas] Thread stopping
In-Reply-To:
References:
Message-ID: <4F76361A.8070402@gmx.net>

On 03/30/2012 12:09 PM, Serhiy Storchaka wrote:
>
> P.S. I've had a crazy idea. What if we allowed raising any exception, not only
> ThreadInterruptionError, in another thread?

From an OOP point of view this is insanity. But Python (like other dynamic languages) does not enforce strict OOP semantics anyway.

-panzi

From grosser.meister.morti at gmx.net  Sat Mar 31 00:41:38 2012
From: grosser.meister.morti at gmx.net (Mathias Panzenböck)
Date: Sat, 31 Mar 2012 00:41:38 +0200
Subject: [Python-ideas] Thread stopping
In-Reply-To: <17079424.99.1333128686555.JavaMail.geo-discussion-forums@ynuu20>
References: <17079424.99.1333128686555.JavaMail.geo-discussion-forums@ynuu20>
Message-ID: <4F7636A2.8060301@gmx.net>

On 03/30/2012 07:31 PM, Miki Tebeka wrote:
>> I propose to add a Thread.interrupt() function.
>
> Does the new
> http://docs.python.org/dev/library/signal.html#signal.pthread_kill
> help in any way?

This is only available on Unix. You can't send a signal to just one thread on Windows; it's always sent to the whole process.

From greg at krypto.org  Sat Mar 31 01:02:08 2012
From: greg at krypto.org (Gregory P. Smith)
Date: Fri, 30 Mar 2012 16:02:08 -0700
Subject: [Python-ideas] Thread stopping
In-Reply-To:
References:
Message-ID:

On Thu, Mar 29, 2012 at 9:53 PM, Eli Bendersky wrote:
> On Thu, Mar 29, 2012 at 21:48, Andrew Svetlov wrote:
>> I propose to add a Thread.interrupt() function.
>
> Could you specify some use cases where you believe this would be better
> than explicitly asking the thread to stop?
>
> Eli

The only case I can come up with is wanting to attempt to force a "clean" shutdown of threads spawned by library code outside of the main process's control that has not been instrumented with its own well-placed checks to see if it is time to shut down.

Given that such a mechanism can't interrupt blocking system calls, extension-module code, or other VM-native-language-level code, I'm not super excited about implementing this, given the limitations. It would be relatively easy to add; I just don't see a huge benefit.

As for using SystemExit as the exception or exception base class: I believe existing code, such as some well-known event loops, catches, logs and ignores SystemExit today... ;)

-gps

From solipsis at pitrou.net  Sat Mar 31 01:04:13 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 31 Mar 2012 01:04:13 +0200
Subject: [Python-ideas] Thread stopping
References: <32BE53BA-4101-4FED-B9ED-54306AEEEE43@gmail.com> <4F758793.7000503@pearwood.info>
Message-ID: <20120331010413.2cc42d54@pitrou.net>

On Fri, 30 Mar 2012 21:14:43 +1100 Steven D'Aprano wrote:
> Matt Joiner wrote:
> > Wouldn't it be better to raise a SystemExit in a thread you want
> > interrupted?
>
> I must admit I'm rather confused about this suggestion. Do you mean the thread
> itself should raise SystemExit? If so, why not just exit the normal way?
>
> If you mean the main thread somehow injects a SystemExit in another thread,
> how would you do that? Perhaps I'm missing something obvious, but I don't know
> how to do that.

http://docs.python.org/dev/c-api/init.html#PyThreadState_SetAsyncExc

Regards

Antoine.

From josh at bartletts.id.au  Sat Mar 31 03:45:14 2012
From: josh at bartletts.id.au (Joshua Bartlett)
Date: Sat, 31 Mar 2012 11:45:14 +1000
Subject: [Python-ideas] Yielding through context managers
In-Reply-To:
References:
Message-ID:

> Interesting idea, though doing this with present Python does not seem to
> be very painful:
>
>     class cd(object):
>
>         def __init__(self, path):
>             self.inner_path = path
>
>         def __enter__(self):
>             self.outer_path = os.getcwd()
>             os.chdir(self.inner_path)
>             return self
>
>         def __exit__(self, exc_type, exc_val, exc_tb):
>             os.chdir(self.outer_path)
>
>     def my_generator(path):
>         with cd(path) as context:
>             output = do_something()
>             with cd(context.outer_path):
>                 yield output
>         ...

Yes, that's possible, although as the context manager gets more complicated (e.g. modifying os.environ as well as the working directory), I'd currently start using something like this:

    def my_generator(arg):
        with context_manager(arg) as context:
            output = do_something()
            with context.undo():
                yield output
        ...

But nevertheless, adding __yield__ and __send__ (or equivalent) to context managers means that the author of the context manager can make sure that it's free of unintended side effects, rather than relying on the user to be careful as in the examples above.

Cheers,

J. D. Bartlett

From fuzzyman at gmail.com  Sat Mar 31 12:33:30 2012
From: fuzzyman at gmail.com (Michael Foord)
Date: Sat, 31 Mar 2012 11:33:30 +0100
Subject: [Python-ideas] Thread stopping
In-Reply-To:
References:
Message-ID:

On 30 March 2012 11:09, Serhiy Storchaka wrote:
> On 29.03.12 22:48, Andrew Svetlov wrote:
>> I propose to add a Thread.interrupt() function.
>>
>> th.interrupt() will set a flag in the ThreadState structure.
>>
>> When the interpreter switches to the next thread it will check that flag.
>> If the flag is on, then ThreadInterruptionError will be raised in the
>> thread's context. If the thread has blocked via threading locks (Lock,
>> RLock, Condition, Semaphore etc.) -- the exception is raised also.
>
> At first glance this is a very attractive suggestion. But what about
> alternative GIL-less implementations? The interpreter can execute several
> threads at the same time.
>
> Java has a similar mechanism, Thread.interrupt(), but it works only if
> the thread has blocked via threading locks. There is a stronger
> Thread.stop(), but it is recognized as unsafe and is deprecated.

It is "unsafe" because it can interrupt finally blocks -- so it is impossible to protect resource cleanup from thread interruptions. Java solved this problem by deprecating thread interrupting; .NET solved it by ensuring that a thread interrupt can't happen in a finally block (so it is "safe"). The .NET solution is better. :-)

> It would be wrong to say that this is the way to *force* stopping of some
> thread. ThreadInterruptionError can and should be caught in some cases.
>
>> BTW, we can disable the interruption mechanic by default and use it only
>> if switched on by threading.enable_interruption().
>
> And we need a context manager for non-interruptible critical sections (or
> for interruptible non-critical sections?).

"Critical section" has a particular meaning, and by raising the exception at the interpreter level (and not using OS-level thread killing) we can ensure critical sections can't be interrupted. If you just mean "important sections", then preventing thread interrupts in a finally block provides one mechanism for this. An "uninterruptible context manager" would be nice, but would probably need extra VM support and isn't essential.

Michael

> P.S. I've had a crazy idea. What if we allowed raising any exception, not
> only ThreadInterruptionError, in another thread?

--
http://www.voidspace.org.uk/

May you do good and not evil
May you find forgiveness for yourself and forgive others
May you share freely, never taking more than you give.
-- the sqlite blessing
http://www.sqlite.org/different.html

From andrew.svetlov at gmail.com  Sat Mar 31 20:06:39 2012
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Sat, 31 Mar 2012 21:06:39 +0300
Subject: [Python-ideas] Thread stopping
In-Reply-To:
References:
Message-ID:

After thinking about it, I agree with Michael Foord and others. Automatic thread interruption has good use cases, but it's unsafe. An uninterruptible context manager (or decorator) is a good idea, but without direct support from the VM a Python thread is 'interruptible' everywhere. For now there is no way to detect where we are -- in resource cleanup code or in a regular block which can be interrupted. Resource cleanup consists not only of finally blocks but also __exit__ methods, probably except blocks, user-defined callbacks etc. -- there is no spec for that.

As a result, I see no way to introduce the concept of a 'block for resource cleanup' without backward incompatibility and breaking existing code. I mean that the concept would have to mark existing finalizers as uninterruptible, but we don't know which code is a finalizer.
From another point of view, it may be useful to just add an .interrupt() method and an .interrupted read-only property to the Thread class. The purpose of that is to make a standard flag/method the conventional way to request interruption, and to leave the actual stopping to user code. That has not as big a value as (working) automatic safe interruption, but it can be considered a way to standardize the interruption procedure.

On Sat, Mar 31, 2012 at 1:33 PM, Michael Foord wrote:
> On 30 March 2012 11:09, Serhiy Storchaka wrote:
>> On 29.03.12 22:48, Andrew Svetlov wrote:
>>> I propose to add a Thread.interrupt() function.
>>>
>>> th.interrupt() will set a flag in the ThreadState structure.
>>>
>>> When the interpreter switches to the next thread it will check that flag.
>>> If the flag is on, then ThreadInterruptionError will be raised in the
>>> thread's context. If the thread has blocked via threading locks (Lock,
>>> RLock, Condition, Semaphore etc.) -- the exception is raised also.
>>
>> At first glance this is a very attractive suggestion. But what about
>> alternative GIL-less implementations? The interpreter can execute several
>> threads at the same time.
>>
>> Java has a similar mechanism, Thread.interrupt(), but it works only if
>> the thread has blocked via threading locks. There is a stronger
>> Thread.stop(), but it is recognized as unsafe and is deprecated.
>
> It is "unsafe" because it can interrupt finally blocks -- so it is
> impossible to protect resource cleanup from thread interruptions. Java
> solved this problem by deprecating thread interrupting; .NET solved it by
> ensuring that a thread interrupt can't happen in a finally block (so it is
> "safe"). The .NET solution is better. :-)
>
>> It would be wrong to say that this is the way to *force* stopping of some
>> thread. ThreadInterruptionError can and should be caught in some cases.
>>
>>> BTW, we can disable the interruption mechanic by default and use it only
>>> if switched on by threading.enable_interruption().
>>
>> And we need a context manager for non-interruptible critical sections (or
>> for interruptible non-critical sections?).
>
> "Critical section" has a particular meaning, and by raising the exception
> at the interpreter level (and not using OS-level thread killing) we can
> ensure critical sections can't be interrupted. If you just mean "important
> sections", then preventing thread interrupts in a finally block provides
> one mechanism for this. An "uninterruptible context manager" would be nice,
> but would probably need extra VM support and isn't essential.
>
> Michael
>
>> P.S. I've had a crazy idea. What if we allowed raising any exception, not
>> only ThreadInterruptionError, in another thread?
>
> --
> http://www.voidspace.org.uk/
>
> May you do good and not evil
> May you find forgiveness for yourself and forgive others
> May you share freely, never taking more than you give.
> -- the sqlite blessing
> http://www.sqlite.org/different.html

--
Thanks,
Andrew Svetlov

From solipsis at pitrou.net  Sat Mar 31 20:32:42 2012
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 31 Mar 2012 20:32:42 +0200
Subject: [Python-ideas] Thread stopping
References: <17079424.99.1333128686555.JavaMail.geo-discussion-forums@ynuu20>
Message-ID: <20120331203242.029359fd@pitrou.net>

On Fri, 30 Mar 2012 10:31:26 -0700 (PDT) Miki Tebeka wrote:
>> I propose to add a Thread.interrupt() function.
>
> Does the new
> http://docs.python.org/dev/library/signal.html#signal.pthread_kill help in
> any way?

Not in any way, no, because Python only executes its own signal handlers in the main thread, even when the signal was received in another thread. Therefore, with interpreter threads, pthread_kill()'s only point is to make a running system call return with EINTR.

Regards

Antoine.