From jason.orendorff at gmail.com  Thu Mar  1 23:46:55 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Thu, 1 Mar 2007 17:46:55 -0500
Subject: [Python-ideas] priorityqueue, sortedlist in collections?
Message-ID:

I wonder if anyone else finds the heapq and bisect APIs a little dated.

It seems like these things could be offered in a more intuitive and
featureful way by making them classes.  They could go in the
collections module:

    class priorityqueue:
        def __init__(self, elements=(), *,
                     cmp=None, key=None, reversed=False)
        def add(self, element)
        def pop(self)       --> remove and return the min element
        def __iter__(self)  --> (while self: yield self.pop())
        ...
        ... any other list methods that make sense
        ... for example, __len__ but not __getitem__
        ...

    class sortedlist:
        def __init__(self, elements=(), *,
                     cmp=None, key=None, reversed=False)
        def add(self, element)  # insort

        # Methods involving searching are O(log N).
        def __contains__(self, element)
        def index(self, element)
        def remove(self, element)
        ...
        ... plus all the other read-only list methods,
        ... and the modifying list methods that make sense
        ...

The point is, most of the API comes from list and sort.  I think
they'd be easier to use/remember than what we have.

Once upon a time Tim mentioned (maybe semi-seriously) possibly adding
a Fibonacci heap container.  I think a plain old binary heap would be
good enough for starters.  I could do this.  Any interest?

-j

From rhamph at gmail.com  Fri Mar  2 01:37:37 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Thu, 1 Mar 2007 17:37:37 -0700
Subject: [Python-ideas] priorityqueue, sortedlist in collections?
In-Reply-To:
References:
Message-ID:

On 3/1/07, Jason Orendorff wrote:
> I wonder if anyone else finds the heapq and bisect APIs a little dated.
>
> It seems like these things could be offered in a more intuitive and
> featureful way by making them classes.  They could go in the
> collections module:

I agree, at least for heapq.  The algorithm involves enough voodoo
that it's not obvious how to extend it, so it might as well be wrapped
in a class that hides as much as possible.

The same can't be said for bisect though.  All it does is replace
del a[bisect_left(a, x)] with a.remove(x).  A little cleaner, but no
more obvious.  And the cost is still O(n) since it involves moving all
the pointers after x.

>
> class priorityqueue:

Should inherit from object if this is going into 2.x.

>     def __init__(self, elements=(), *,
>                  cmp=None, key=None, reversed=False)
>     def add(self, element)
>     def pop(self)       --> remove and return the min element
>     def __iter__(self)  --> (while self: yield self.pop())

__iter__ shouldn't modify the container.

> ...
> ... any other list methods that make sense
> ... for example, __len__ but not __getitem__
> ...

--
Adam Olsen, aka Rhamphoryncus

From taleinat at gmail.com  Fri Mar  2 13:25:39 2007
From: taleinat at gmail.com (Tal Einat)
Date: Fri, 2 Mar 2007 14:25:39 +0200
Subject: [Python-ideas] priorityqueue, sortedlist in collections?
In-Reply-To:
References:
Message-ID: <7afdee2f0703020425m4aa2c1do7a44755bf8ae93a3@mail.gmail.com>

On 3/2/07, Adam Olsen wrote:
>
> On 3/1/07, Jason Orendorff wrote:
> > I wonder if anyone else finds the heapq and bisect APIs a little dated.

Ooh, me, me!

> > It seems like these things could be offered in a more intuitive and
> > featureful way by making them classes.  They could go in the
> > collections module:
>
> I agree, at least for heapq.
> The algorithm involves enough voodoo
> that it's not obvious how to extend it, so it might as well be wrapped
> in a class that hides as much as possible.

Sounds good to me. +1

> The same can't be said for bisect though.  All it does is replace del
> a[bisect_left(a, x)] with a.remove(x).  A little cleaner, but no more
> obvious.  And the cost is still O(n) since it involves moving all the
> pointers after x.

A sortedlist class seems like a useful abstraction to me, since while
using an instance of such a class, you could be sure that the list
remains sorted.  For example, you don't have to worry about it being
changed outside of your code and no longer being sorted.

Also, wouldn't a sortedlist class also have an insert() method to
replace bisect.insort(), as well as different implementations of
__contains__ and index?

From jimjjewett at gmail.com  Fri Mar  2 19:32:11 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 2 Mar 2007 13:32:11 -0500
Subject: [Python-ideas] priorityqueue, sortedlist in collections?
In-Reply-To:
References:
Message-ID:

On 3/1/07, Jason Orendorff wrote:
> I wonder if anyone else finds the heapq and bisect APIs a little dated.

I find them too much of a special case.

> ... They could go in the collections module:

This alone would be an improvement.  I don't disagree with the rest of
your proposal, but for me, this is the big one.

-jJ

From jimjjewett at gmail.com  Fri Mar  2 19:37:35 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 2 Mar 2007 13:37:35 -0500
Subject: [Python-ideas] priorityqueue, sortedlist in collections?
In-Reply-To:
References:
Message-ID:

On 3/1/07, Adam Olsen wrote:
> On 3/1/07, Jason Orendorff wrote:
> > class priorityqueue:
> Should inherit from object if this is going into 2.x.
> > def __init__(self, elements=(), *,
> >     cmp=None, key=None, reversed=False)
> > def add(self, element)
> > def pop(self) --> remove and return the min element
> > def __iter__(self) --> (while self: yield self.pop())
> __iter__ shouldn't modify the container.

generators do not need to be reiterable.

lists are reiterable, but I wouldn't expect a socket (treated as a
file) to be.

For a queue that is already sorted, the natural use case is to take
the task and do it, explicitly adding it back to the end if need be.
I would expect an operation that *didn't* remove things from the queue
to have 'view' in the name somewhere.

-jJ

From collinw at gmail.com  Fri Mar  2 19:45:57 2007
From: collinw at gmail.com (Collin Winter)
Date: Fri, 2 Mar 2007 12:45:57 -0600
Subject: [Python-ideas] priorityqueue, sortedlist in collections?
In-Reply-To:
References:
Message-ID: <43aa6ff70703021045y5e6a1d50re19deac6d4d88fce@mail.gmail.com>

On 3/2/07, Jim Jewett wrote:
> On 3/1/07, Adam Olsen wrote:
> > On 3/1/07, Jason Orendorff wrote:
> >
> > > class priorityqueue:
> > Should inherit from object if this is going into 2.x.
> >
> > > def __init__(self, elements=(), *,
> > >     cmp=None, key=None, reversed=False)
> > > def add(self, element)
> > > def pop(self) --> remove and return the min element
> > > def __iter__(self) --> (while self: yield self.pop())
> > __iter__ shouldn't modify the container.
>
> generators do not need to be reiterable.
>
> lists are reiterable, but I wouldn't expect a socket (treated as a file) to be.
>
> For a queue that is already sorted, the natural use case is to take
> the task and do it, explicitly adding it back to the end if need be.
> I would expect an operation that *didn't* remove things from the queue
> to have 'view' in the name somewhere.

I would be incredibly surprised if

    for x in queue:
        ....

destroyed the queue.  __iter__ should be implemented non-destructively.

Collin Winter

From phd at phd.pp.ru  Fri Mar  2 20:05:24 2007
From: phd at phd.pp.ru (Oleg Broytmann)
Date: Fri, 2 Mar 2007 22:05:24 +0300
Subject: [Python-ideas] priorityqueue, sortedlist in collections?
In-Reply-To: <43aa6ff70703021045y5e6a1d50re19deac6d4d88fce@mail.gmail.com>
References: <43aa6ff70703021045y5e6a1d50re19deac6d4d88fce@mail.gmail.com>
Message-ID: <20070302190524.GC21602@phd.pp.ru>

On Fri, Mar 02, 2007 at 12:45:57PM -0600, Collin Winter wrote:
> I would be incredibly surprised if
>
>     for x in queue:
>         ....
>
> destroyed the queue.  __iter__ should be implemented non-destructively.

Impossible for external streams such as pipes and sockets.

Oleg.
--
Oleg Broytmann     http://phd.pp.ru/     phd at phd.pp.ru
Programmers don't die, they just GOSUB without RETURN.

From rhamph at gmail.com  Fri Mar  2 20:40:32 2007
From: rhamph at gmail.com (Adam Olsen)
Date: Fri, 2 Mar 2007 12:40:32 -0700
Subject: [Python-ideas] priorityqueue, sortedlist in collections?
In-Reply-To:
References:
Message-ID:

On 3/2/07, Jim Jewett wrote:
> On 3/1/07, Adam Olsen wrote:
> > __iter__ shouldn't modify the container.
>
> generators do not need to be reiterable.
>
> lists are reiterable, but I wouldn't expect a socket (treated as a file) to be.
>
> For a queue that is already sorted, the natural use case is to take
> the task and do it, explicitly adding it back to the end if need be.
> I would expect an operation that *didn't* remove things from the queue
> to have 'view' in the name somewhere.

A priorityqueue seems far more like a list than a socket though.  I'd
be very surprised if __iter__ was destructive.  If destructive is the
only option I would prefer it be an explicit method.  Especially since
it's common to modify and reiterate over a priority queue, which I
believe never happens for files or sockets.

--
Adam Olsen, aka Rhamphoryncus

From rrr at ronadam.com  Fri Mar  2 20:57:21 2007
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 02 Mar 2007 13:57:21 -0600
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion wart?)
In-Reply-To: <45DF7A91.8020402@canterbury.ac.nz>
References: <3d2ce8cb0702221739r2fbca4ffy4825d02361748e13@mail.gmail.com>
	<45DEC783.8040601@canterbury.ac.nz> <45DF0C84.8020403@ronadam.com>
	<45DF7A91.8020402@canterbury.ac.nz>
Message-ID: <45E881A1.3020902@ronadam.com>

It seems this thread has run its course on python-dev, but being on the
road for several thousand miles and since I've given it some thought
during the trip, I'll go ahead and post my own conclusions for the
record.  This would be a Python 3.0 suggestion only, I think.

Greg Ewing wrote:
> Ron Adam wrote:
>
>> Ok, so what if... instead of bool being a type, it be a keyword that
>> is just a nicer way to spell 'not not'?
>
> The main benefit of having a bool type is so that its
> values display as "True" and "False" rather than whatever
> surrogate values you happen to have used to represent
> true and false.

This wouldn't change.  A bool keyword would still return True or False
"bool" objects just as 'not' currently does.  It would just complete
the set of keywords so we have...

    and, or    --> flow control boolean operators
    not, bool  --> boolean operators

Where 'bool' could generate more specific byte code like 'not' already
does in some situations.
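For illustration, a rough sketch of the proposed equivalence (the
'bool x' spelling is hypothetical; it does not exist in any Python):

    # Illustrative helper: what the proposed `bool x` keyword would do,
    # written in today's spelling.
    def bool_keyword(x):
        return not not x    # `bool x` would produce the same result

    assert bool_keyword([1]) is True
    assert bool_keyword("") is False
    assert bool_keyword(0) == bool(0)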
> The fact that bool() can be used to canonicalise a truth
> value is secondary -- the same thing could just as well
> be achieved by a plain function (as it was before bool
> became a type).  It's an extremely rare thing to need
> to do in any case.

It seems to me it is the main use, with testing for bool types as the
secondary use.  All other bool object uses return or compare already
created bool objects.

> If any change is to be made to bool in 3.0, my preference
> would be to keep it but make it a separate type.

I think the bool type should still continue to be a subtype of int.
The reason for making it its own type is to limit bool's functionality
to cases that are more specific in order to avoid some errors.  But I
feel those errors are not really that common, and having bool be a
subclass of int increases the situations where bool is useful in a
nice way.  I.e., there are more benefits than consequences.

> > Are 'and' and 'or' going to be changed in 3.0 to return bool types too?
>
> I very much hope not! Their current behaviour is very
> useful, and I would hate to lose it.

I agree.  I only asked to be sure.

A little review:

On python-dev a question of whether or not bool conversion was a wart
was brought up because of an apparent inconsistency of value returned
in some cases.  It was pointed out that there was no inconsistency,
and that the current behavior was correct and consistent, as well as
being desirable.

However, there are some people who feel there is room for improvement
where bool is concerned.  I think this unsettled/incomplete feeling is
due to a small inconsistency in how bool is used (rather than what it
does) in relation to the 'and', 'or' and 'not' keywords.

The 'and' and 'or' keywords return either the left or right values
depending on their bool interpretation by comparison methods.  This is
not a problem, and it is both the expected and desired behavior.

* There are both implicit (or equivalent) and explicit bool operations
in Python.

The 'not' keyword returns a bool by calling a number's __nonzero__
method or a container's __len__ method, which in turn is used to
return either a True or False bool object.  There is no bool keyword
equivalent except 'not not', which isn't the most readable way to do
it.  This is because any object can be used as an implicit bool,
though there are situations where returning an explicitly constructed
bool (True or False) expresses a clearer intent.

Interpretation:

The "small" inconsistency (if any) here is the 'not' keyword vs the
'bool()' constructor.  They pretty much do the same thing yet work in
modestly different ways.

I believe this is a matter of both how the bool objects True and False
came into Python and a matter of practicality.  Boolean objects are
used frequently in loop tests and it is very good that they work as
fast as possible.  (In the case of 'not', different byte code is
generated depending on how it is used.)

The 'not' operator preceded bool; otherwise we might have had a not()
function or type that returns a bool object in its place, or bool
would have been a keyword working in much the same way as not does
now.

It appears the use of bools is becoming integrated even more closely
with the interpreter's byte code, so in my opinion it makes more sense
to make bool a keyword that works in the same way as 'not', rather
than make 'not' a sub-classed constructor (of bool) that returns a
bool object.

The only difficulty I see with a bool keyword is the bool type would
need to be spelled 'Bool' rather than 'bool'.
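For reference, this is the int-compatible behaviour that a renamed
Bool/boolean type would presumably need to keep (easily checked in
Python 2.5; these are all facts about today's bool):

    assert isinstance(True, int)    # bool is a subtype of int
    assert True + True == 2         # bools behave as the ints 0 and 1
    assert sum(x > 2 for x in [1, 2, 3, 4]) == 2  # counting with bools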
In most cases this change looks like it would be backwards compatible
as well.

    bool(expression)  <==>  bool expression

True and False could then become protected built-in objects like None:

    True = Bool(1)
    False = Bool(0)

The documentation on None says the following.

    Changed in version 2.4: None became a constant and is now
    recognized by the compiler as a name for the built-in object None.
    Although it is not a keyword, you cannot assign a different object
    to it.

PEP 3100 only says...

    * None becomes a keyword [4]  (What about True, False?)

The reference [4] lists compatibility with Python 2.3 as a reason for
not protecting True and False, but it seems the general consensus from
various threads I've examined is to have them protected.

    http://mail.python.org/pipermail/python-dev/2004-July/046294.html

Making True and False protected objects along with making bool a
keyword may have some performance benefits since in many cases the
compiler could then directly return the already constructed True and
False object instead of creating new one(s).  [it may already do this
under the covers in some cases(?)]  This may or may not affect the
case where 'while 1:' is faster than 'while True:'.

It is also helpful to look at some byte code examples to compare how
'not', 'not not' and bool() work in different situations and see how
it may be changed if bool were a keyword and True and False become
protected objects.  It may be that in some (common?) cases the C code
for some byte codes and objects could return True and False references
directly.

In my opinion this really just finishes up a process of change that is
already started and ties up the loose ends.  It does not change what
bool(), True, False, and not do in any major way.  It may create
opportunities for further optimizations.

Cheers,
Ron

From jcarlson at uci.edu  Fri Mar  2 23:16:03 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 02 Mar 2007 14:16:03 -0800
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion wart?)
In-Reply-To: <45E881A1.3020902@ronadam.com>
References: <45DF7A91.8020402@canterbury.ac.nz> <45E881A1.3020902@ronadam.com>
Message-ID: <20070302141412.AEB0.JCARLSON@uci.edu>

Ron Adam wrote:
> Interpretation:
>
> The "small" inconsistency (if any) here is the 'not' keyword vs the
> 'bool()' constructor.  They pretty much do the same thing yet work in
> modestly different ways.

Maybe I'm missing something, but is there a place where the following
is true?

    (not not x) != bool(x)

I can't think of any that I've ever come across.

 - Josiah

From greg.ewing at canterbury.ac.nz  Fri Mar  2 22:37:41 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 03 Mar 2007 10:37:41 +1300
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion wart?)
In-Reply-To: <45E881A1.3020902@ronadam.com>
References: <3d2ce8cb0702221739r2fbca4ffy4825d02361748e13@mail.gmail.com>
	<45DEC783.8040601@canterbury.ac.nz> <45DF0C84.8020403@ronadam.com>
	<45DF7A91.8020402@canterbury.ac.nz> <45E881A1.3020902@ronadam.com>
Message-ID: <45E89925.5050500@canterbury.ac.nz>

Ron Adam wrote:
>
> Greg Ewing wrote:
>
>> The fact that bool() can be used to canonicalise a truth
>> value is secondary
>
> It seems to me it is the main use, with testing for bool types as the
> secondary use.

What I meant was that this usage is not reason enough on its own for
bool to be a type.  Given that bool is already a type, it's convenient
for its constructor to also be the canonicalising function.
It's secondary as a reason for making bool a type, not in the sense of
how it's normally used.

> having bool be a
> subclass of int increases the situations where bool is useful in a nice
> way.

I still think that most of those uses could be covered by giving bool
an __index__ method.

> Making True and False protected objects along with making bool a keyword
> may have some performance benefits since in many cases the compiler
> could then directly return the already constructed True and False object
> instead of creating new one(s).  [it may already do this under the
> covers in some cases(?)]

I believe that True and False are treated as singletons, so this
should always happen.

I don't see anywhere near enough benefit to be worth making a bool
keyword.  Optimisation of access to True and False is probably better
addressed through a general mechanism for optimising lookups of
globals, although I wouldn't object if they were treated as reserved
identifiers a la None.

Summary of my position on this:

    +0 making bool a separate type with __index__ in 3.0
    +0 treating True and False like None
    -1 bool keyword

--
Greg

From rrr at ronadam.com  Sat Mar  3 01:17:32 2007
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 02 Mar 2007 18:17:32 -0600
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion wart?)
In-Reply-To: <20070302141412.AEB0.JCARLSON@uci.edu>
References: <45DF7A91.8020402@canterbury.ac.nz> <45E881A1.3020902@ronadam.com>
	<20070302141412.AEB0.JCARLSON@uci.edu>
Message-ID: <45E8BE9C.9010600@ronadam.com>

Josiah Carlson wrote:
> Ron Adam wrote:
>> Interpretation:
>>
>> The "small" inconsistency (if any) here is the 'not' keyword vs the
>> 'bool()' constructor.  They pretty much do the same thing yet work in
>> modestly different ways.
>
> Maybe I'm missing something, but is there a place where the following is
> true?
>
>     (not not x) != bool(x)
>
> I can't think of any that I've ever come across.
>
>  - Josiah

I don't think you are missing anything.  I did say it was a *small*
inconsistency in how they are used in relation to 'and', 'or' and
'not'.  I.e., a constructor vs a keyword in a similar situation.

And I also pointed out it doesn't change what they do; it has more to
do with how they work underneath, and that there may be some benefits
in changing this.

    >>> def foo(x):
    ...     return (not not x) != bool(x)
    ...
    >>> dis.dis(foo)
      2           0 LOAD_FAST                0 (x)
                  3 UNARY_NOT
                  4 UNARY_NOT
                  5 LOAD_GLOBAL              0 (bool)
                  8 LOAD_FAST                0 (x)
                 11 CALL_FUNCTION            1
                 14 COMPARE_OP               3 (!=)
                 17 RETURN_VALUE

In this case the lines...

     5 LOAD_GLOBAL              0 (bool)
     8 LOAD_FAST                0 (n)
    11 CALL_FUNCTION            1

Would be replaced by...

     5 LOAD_FAST                0 (n)
     6 UNARY_BOOL

The compiler could also replace UNARY_NOT, UNARY_NOT pairs with a
single UNARY_BOOL.  There may be other situations where this could
result in different, more efficient byte code as well.

_Ron

From collinw at gmail.com  Sat Mar  3 01:27:23 2007
From: collinw at gmail.com (Collin Winter)
Date: Fri, 2 Mar 2007 18:27:23 -0600
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion wart?)
In-Reply-To: <45E8BE9C.9010600@ronadam.com>
References: <45DF7A91.8020402@canterbury.ac.nz> <45E881A1.3020902@ronadam.com>
	<20070302141412.AEB0.JCARLSON@uci.edu> <45E8BE9C.9010600@ronadam.com>
Message-ID: <43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>

On 3/2/07, Ron Adam wrote:
> Josiah Carlson wrote:
> > Ron Adam wrote:
> >> Interpretation:
> >>
> >> The "small" inconsistency (if any) here is the 'not' keyword vs the
> >> 'bool()' constructor.
> >> They pretty much do the same thing yet work in
> >> modestly different ways.
> >
> > Maybe I'm missing something, but is there a place where the following
> > is true?
> >
> >     (not not x) != bool(x)
> >
> > I can't think of any that I've ever come across.
> >
> >  - Josiah
>
> I don't think you are missing anything.  I did say it was a *small*
> inconsistency in how they are used in relation to 'and', 'or' and 'not'.
> I.e., a constructor vs a keyword in a similar situation.

'and', 'or' and 'not' are operators (and hence keywords) because
making them functions is incredibly ugly: and(or(a, b), not(c)).

Changing "bool(x)" to "bool x" introduces a much larger inconsistency
between bool and the other built-in types.

> And I also pointed out it doesn't change what they do, it has more to do
> with how they work underneath and that there may be some benefits in
> changing this.

So the main reason is for optimization?  Are you really calling bool()
frequently in inner-loop code?  Is a profiler telling you that bool()
is a bottleneck in your applications?

Collin Winter

From rrr at ronadam.com  Sat Mar  3 02:42:34 2007
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 02 Mar 2007 19:42:34 -0600
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion wart?)
In-Reply-To: <43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>
References: <45DF7A91.8020402@canterbury.ac.nz> <45E881A1.3020902@ronadam.com>
	<20070302141412.AEB0.JCARLSON@uci.edu> <45E8BE9C.9010600@ronadam.com>
	<43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>
Message-ID: <45E8D28A.1020009@ronadam.com>

Collin Winter wrote:
> On 3/2/07, Ron Adam wrote:
>> Josiah Carlson wrote:
>> > Ron Adam wrote:
>> >> Interpretation:
>> >>
>> >> The "small" inconsistency (if any) here is the 'not' keyword vs the
>> >> 'bool()' constructor.  They pretty much do the same thing yet work in
>> >> modestly different ways.
>> >
>> > Maybe I'm missing something, but is there a place where the following
>> > is true?
>> >
>> >     (not not x) != bool(x)
>> >
>> > I can't think of any that I've ever come across.
>> >
>> >  - Josiah
>>
>> I don't think you are missing anything.  I did say it was a *small*
>> inconsistency in how they are used in relation to 'and', 'or' and 'not'.
>> I.e., a constructor vs a keyword in a similar situation.
>
> 'and', 'or' and 'not' are operators (and hence keywords) because
> making them functions is incredibly ugly: and(or(a, b), not(c)).

I agree.

> Changing "bool(x)" to "bool x" introduces a much larger inconsistency
> between bool and the other built-in types.

A bool keyword would not be a built-in type but be an operator just
like 'and', 'or', and 'not'.  So it would be even more consistent.

Bool() with a capital 'B' would be the built-in type, and only the
very *tiny* naming inconsistency of the capital 'B' would exist in
relation to 'int' and the other built-in types.

So I think this adds more consistency than it does inconsistency.  The
performance benefits are a nice additional bonus.  And how much that
is (if it is a deciding factor) would need to be determined.

Note: The Bool() type as a constructor would hardly ever be needed
with the presence of a bool operator.  So the capital "B" probably
won't be a major problem for anyone.  If it is, it could be called
boolean() instead.

>> And I also pointed out it doesn't change what they do, it has more to do
>> with how they work underneath and that there may be some benefits in
>> changing this.

> So the main reason is for optimization?
> Are you really calling bool() frequently in inner-loop code?  Is a
> profiler telling you that bool() is a bottleneck in your applications?

I'm not using it that way myself, but you are correct that the idea
needs to be tested.

When 'not' is used with 'if' and 'while' it results in byte code
specific to that situation instead of UNARY_NOT.  A bool operator may
also work that way in other similar situations.  So it may be a
greater benefit than expected. (?)  It needs research, I admit.

And I do realize "if bool x:" and "while bool x:" would not be the
preferred spelling and would result in the same *exact* byte code as
"if x:" and "while x:", I believe.

This was a response to the expressed desire to change bool started in
python-dev.  I.e., a question and some follow-up messages suggesting
other possibly greater changes in semantics than this.  So it appears
there is some consensus that there is something that could be changed,
but just what wasn't precisely determined or agreed on.  After
thinking on it some, this is the best I could come up with. ;-)

Consider this as 1 part aesthetics, .5 part performance.  And yes,
these changes are *minor*.  I've already stated that in quite a few
places now.  It's the type of thing that could be changed in Python
3000 if desired, but doesn't need to be changed if it is felt it
shouldn't be changed.

So I have no problem if it's ruled out on the grounds that 'there is
not sufficient need', or on grounds that "it's incorrect" in some
other way. ;-)

_Ron

From collinw at gmail.com  Sat Mar  3 03:08:05 2007
From: collinw at gmail.com (Collin Winter)
Date: Fri, 2 Mar 2007 20:08:05 -0600
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion wart?)
In-Reply-To: <45E8D28A.1020009@ronadam.com>
References: <45DF7A91.8020402@canterbury.ac.nz> <45E881A1.3020902@ronadam.com>
	<20070302141412.AEB0.JCARLSON@uci.edu> <45E8BE9C.9010600@ronadam.com>
	<43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>
	<45E8D28A.1020009@ronadam.com>
Message-ID: <43aa6ff70703021808h7febf3fau400f0172f5edcc82@mail.gmail.com>

On 3/2/07, Ron Adam wrote:
> Collin Winter wrote:
> > On 3/2/07, Ron Adam wrote:
> >> Josiah Carlson wrote:
> >> > Ron Adam wrote:
> >> >> Interpretation:
> >> >>
> >> >> The "small" inconsistency (if any) here is the 'not' keyword vs the
> >> >> 'bool()' constructor.  They pretty much do the same thing yet work in
> >> >> modestly different ways.
> >> >
> >> > Maybe I'm missing something, but is there a place where the
> >> > following is true?
> >> >
> >> >     (not not x) != bool(x)
> >> >
> >> > I can't think of any that I've ever come across.
> >> >
> >> >  - Josiah
> >>
> >> I don't think you are missing anything.  I did say it was a *small*
> >> inconsistency in how they are used in relation to 'and', 'or' and 'not'.
> >> I.e., a constructor vs a keyword in a similar situation.
> >
> > 'and', 'or' and 'not' are operators (and hence keywords) because
> > making them functions is incredibly ugly: and(or(a, b), not(c)).
>
> I agree.
>
> > Changing "bool(x)" to "bool x" introduces a much larger inconsistency
> > between bool and the other built-in types.
>
> A bool keyword would not be a built-in type but be an operator just like
> 'and', 'or', and 'not'.  So it would be even more consistent.
[snip]
> Bool() with a capital 'B' would be the built-in type and only the very
> *tiny* naming inconsistency of the capital 'B' would exist in relation to
> 'int' and the other built-in types.
>
> So I think this adds more consistency than it does inconsistency.
Added consistency:
- Things related to booleans are operators (bool, not, and, or).

Added inconsistency:
- The Bool type does not follow the same naming convention as int,
  float, dict, list, tuple and set.
- There's now a keyword that has 99% of the same spelling, fulfills
  *fewer* of the same use-cases and has the *exact* same semantics as
  a built-in constructor/function.

That's a bizarre trade-off. -1000.

Collin Winter

From jcarlson at uci.edu  Sat Mar  3 03:18:25 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Fri, 02 Mar 2007 18:18:25 -0800
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion wart?)
In-Reply-To: <45E8D28A.1020009@ronadam.com>
References: <43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>
	<45E8D28A.1020009@ronadam.com>
Message-ID: <20070302181642.AEB6.JCARLSON@uci.edu>

Ron Adam wrote:
> So I have no problem if it's ruled out on the grounds that 'there is not
> sufficient need'

Ok.  YAGNI.  Seriously.  The performance advantage in real code, I
guarantee, isn't measurable.  Also, the inconsistency is subjective,
and I've never heard anyone complain before.

 - Josiah

From rrr at ronadam.com  Sat Mar  3 04:19:33 2007
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 02 Mar 2007 21:19:33 -0600
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion wart?)
In-Reply-To: <43aa6ff70703021808h7febf3fau400f0172f5edcc82@mail.gmail.com>
References: <45DF7A91.8020402@canterbury.ac.nz> <45E881A1.3020902@ronadam.com>
	<20070302141412.AEB0.JCARLSON@uci.edu> <45E8BE9C.9010600@ronadam.com>
	<43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>
	<45E8D28A.1020009@ronadam.com>
	<43aa6ff70703021808h7febf3fau400f0172f5edcc82@mail.gmail.com>
Message-ID: <45E8E945.1070107@ronadam.com>

Collin Winter wrote:
> On 3/2/07, Ron Adam wrote:
>> Collin Winter wrote:
>> > On 3/2/07, Ron Adam wrote:
>> >> Josiah Carlson wrote:
>> >> > Ron Adam wrote:
>> >> >> Interpretation:
>> >> >>
>> >> >> The "small" inconsistency (if any) here is the 'not' keyword vs the
>> >> >> 'bool()' constructor.  They pretty much do the same thing yet
>> >> >> work in modestly different ways.
>> >> >
>> >> > Maybe I'm missing something, but is there a place where the
>> >> > following is true?
>> >> >
>> >> >     (not not x) != bool(x)
>> >> >
>> >> > I can't think of any that I've ever come across.
>> >> >
>> >> >  - Josiah
>> >>
>> >> I don't think you are missing anything.  I did say it was a *small*
>> >> inconsistency in how they are used in relation to 'and', 'or' and
>> >> 'not'.  I.e., a constructor vs a keyword in a similar situation.
>> >
>> > 'and', 'or' and 'not' are operators (and hence keywords) because
>> > making them functions is incredibly ugly: and(or(a, b), not(c)).
>>
>> I agree.
>>
>> > Changing "bool(x)" to "bool x" introduces a much larger inconsistency
>> > between bool and the other built-in types.
>>
>> A bool keyword would not be a built-in type but be an operator just like
>> 'and', 'or', and 'not'.  So it would be even more consistent.
> [snip]
>> Bool() with a capital 'B' would be the built-in type and only the very
>> *tiny* naming inconsistency of the capital 'B' would exist in relation to
>> 'int' and the other built-in types.
>>
>> So I think this adds more consistency than it does inconsistency.
>
> Added consistency:
> - Things related to booleans are operators (bool, not, and, or).

Yes

> Added inconsistency:
> - The Bool type does not follow the same naming convention as int,
>   float, dict, list, tuple and set.
Ok, so name the type boolean instead.

> - There's now a keyword that has 99% of the same spelling,

It can be good that it has the same *exact* semantics and spelling.
That makes it easier to migrate code.

> fulfills *fewer* of the same use-cases

The only difference I can think of is in testing for a bool type.
boolean would need to be used instead of bool.

> and has the *exact* same semantics as a built-in constructor/function.

This is not an uncommon thing in Python.

    [1,2,3]  <==>  list((1,2,3))
    {'a':1}  <==>  dict(a=1)

Currently:

    not not n  <==>  bool(n)
    not n      <==>  bool(not n)

Alternative:

    bool n     <==>  boolean(n)
    not n      <==>  boolean(not n)

I suspect the biggest reason against this suggestion is it changes the
status quo.  As to whether or not it is a keyword, there is just as
strong an argument against 'not' not being a keyword as there is for
'bool' being a keyword.  But the addition of new keywords is
historically something to be avoided with Python.  In this case, I
feel boolean types have become such a fundamental object in Python
that this just may be doable.

So with the above changes you have...

    * bool becomes an operator
    * boolean() replaces bool() as a constructor.

Consistency:
    + 1   Things related to booleans are operators (bool, not, and, or).

Performance:
    + .5  Compiler is able to generate more specific byte code.
          (This is probably only a small benefit so it only gets +.5)

Keyword:
    - 1   it *is* a new keyword, which is something to be avoided.

Name:
    + 1   The bool operator is the same as the old constructor,
          enabling easy migration.
    - 1   A different name 'boolean' must be used to test for boolean
          types.

And the total is... +.5

And if you add in a -1 for anything that changes anything, for the
sake that people just don't like changes in general (i.e., changing
the status quo), it then becomes... -.5

Does that sound like a fair evaluation?  I tried. ;-)

_Ron

From rrr at ronadam.com  Sat Mar  3 04:38:54 2007
From: rrr at ronadam.com (Ron Adam)
Date: Fri, 02 Mar 2007 21:38:54 -0600
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion wart?)
In-Reply-To: <20070302181642.AEB6.JCARLSON@uci.edu>
References: <43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>
	<45E8D28A.1020009@ronadam.com> <20070302181642.AEB6.JCARLSON@uci.edu>
Message-ID: <45E8EDCE.7000000@ronadam.com>

Josiah Carlson wrote:
> Ron Adam wrote:
>> So I have no problem if it's ruled out on the grounds that 'there is not
>> sufficient need'
>
> Ok.  YAGNI.  Seriously.  The performance advantage in real code, I
> guarantee, isn't measurable.  Also, the inconsistency is subjective, and
> I've never heard anyone complain before.
>
> - Josiah

You are right.  It's more cosmetic than actual.  I basically said that
from the start.

Not as a direct complaint, no.  It turns up more as a misunderstanding
of how 'and' and 'or' work, or a misunderstanding of why bool returns
what it does instead of what someone thinks it should, as in the
thread that started this last week.  This addresses that in a subtle,
indirect way by redefining bool as an operator.  Whether or not it
would actually improve the ease of understanding how Python uses and
interacts with boolean types and operators, I'm really not sure.

As far as performance gain, I was thinking that in combination with
making True and False constants, it might be greater.  But it may very
well not be.  Maybe sometime in the near future I will try to test
just how much of a difference it could make.  In any case the making
of True and False into constants is still an open Python 3k issue.
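For reference, CPython already treats True and False as singletons
today; the constants change would mainly let the compiler resolve the
names at compile time.  A quick check:

    assert bool(1) is True      # bool() hands back the singleton
    assert bool([]) is False
    assert (1 == 1) is True     # comparisons return the singletons too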
This was more exploratory, and since nobody else jumped in and took it
up, it seems it would be unpopular as well.  But it's here for the
record in any case.

Cheers,
Ron

From rrr at ronadam.com  Sat Mar  3 08:14:42 2007
From: rrr at ronadam.com (Ron Adam)
Date: Sat, 03 Mar 2007 01:14:42 -0600
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion wart?)
In-Reply-To: <20070302181642.AEB6.JCARLSON@uci.edu>
References: <43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>
	<45E8D28A.1020009@ronadam.com> <20070302181642.AEB6.JCARLSON@uci.edu>
Message-ID: <45E92062.5010306@ronadam.com>

Josiah Carlson wrote:
> Ron Adam wrote:
>> So I have no problem if it's ruled out on the grounds that 'there is not
>> sufficient need'
>
> Ok.  YAGNI.  Seriously.  The performance advantage in real code, I
> guarantee, isn't measurable.

It is measurable, but the total average gain would not be substantial,
simply because it is a relatively rare operation.  The most common use
of bool() seems to be in returning a bool value.

    return bool(exp)

A few timeit tests...

The bool constructor:

    $ python2.5 -m timeit -n 10000000 'bool(1)'
    10000000 loops, best of 3: 0.465 usec per loop
    $ python2.5 -m timeit -n 10000000 'bool("a")'
    10000000 loops, best of 3: 0.479 usec per loop
    $ python2.5 -m timeit -n 10000000 'bool([1])'
    10000000 loops, best of 3: 0.697 usec per loop

A bool operator would have the same (or faster) speed as the 'not'
operator:

    $ python2.5 -m timeit -n 10000000 'not 1'
    10000000 loops, best of 3: 0.165 usec per loop
    $ python2.5 -m timeit -n 10000000 'not "a"'
    10000000 loops, best of 3: 0.164 usec per loop
    $ python2.5 -m timeit -n 10000000 'not [1]'
    10000000 loops, best of 3: 0.369 usec per loop

The real gain is dependent on how and where it is used, of course.  In
most cases 'not not x' would still be much faster than bool(x).  It's
nearly as fast as a single 'not', but really isn't the most readable
way to do it.

    $ python2.5 -m timeit -n 10000000 'not not 1'
    10000000 loops, best of 3: 0.18 usec per loop
    $ python2.5 -m timeit -n 10000000 'not not "a"'
    10000000 loops, best of 3: 0.185 usec per loop
    $ python2.5 -m timeit -n 10000000 'not not [1]'
    10000000 loops, best of 3: 0.381 usec per loop

The following may be an indication of how much speed difference may
still be gained in other related situations such as flow control or
comparisons where bool values are used implicitly.

    $ python2.5 -m timeit -n 10000000 'if "a": pass'
    10000000 loops, best of 3: 0.0729 usec per loop
    $ python2.5 -m timeit -n 10000000 'if not "a": pass'
    10000000 loops, best of 3: 0.17 usec per loop
    $ python2.5 -m timeit -n 10000000 'if bool("a"): pass'
    10000000 loops, best of 3: 0.486 usec per loop

In the case of a bool operator, it's not a major issue because of how
rarely it's actually used.

Cheers,
Ron

From armin.ronacher at active-4.com  Sat Mar  3 09:25:19 2007
From: armin.ronacher at active-4.com (Armin Ronacher)
Date: Sat, 3 Mar 2007 08:25:19 +0000 (UTC)
Subject: [Python-ideas] if with as
Message-ID:

Hi all,

Many languages allow assignment expressions in if conditions (Perl,
PHP, Ruby, C, etc.).  I know that this was dismissed by Guido because
it can lead to mistakes, and I can support that.  However, in some
situations it might be required because you have to nest if blocks and
regular expressions, for example::

    while pos < text_length:
        if match = name_re.match(text, pos):
            pos = match.end()
            do_something(match)
        elif match = digit_re.match(text, pos):
            pos = match.end()
            do_something(match)
        else:
            pos += 1

Well.
But that would require an assignment.  Why not use the "as" keyword
introduced in Python 2.5 with the future import::

    while pos < text_length:
        if name_re.match(text, pos) as match:
            pos = match.end()
            do_something(match)
        elif digit_re.match(text, pos) as match:
            pos = match.end()
            do_something(match)
        else:
            pos += 1

Advantages: Still no assignment expression, no additional keyword,
simple to understand.

Regards,
Armin

From jcarlson at uci.edu  Sat Mar  3 10:40:01 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sat, 03 Mar 2007 01:40:01 -0800
Subject: [Python-ideas] if with as
In-Reply-To:
References:
Message-ID: <20070303011935.AEBC.JCARLSON@uci.edu>

Armin Ronacher wrote:
>
> Hi all,
>
> Many languages allow assignment expressions in if conditions (Perl,
> PHP, Ruby, C, etc.).  I know that this was dismissed by Guido because
> it can lead to mistakes, and I can support that.  However, in some
> situations it might be required because you have to nest if blocks and
> regular expressions, for example::
[snip]

You could convert the code you have offered into the following:

    while pos < text_length:
        match = name_re.match(text, pos) or \
                digit_re.match(text, pos) or \
                None
        if match:
            pos = match.end()
            do_something(match)
        else:
            pos += 1

Not only does it work today in any Python with the re module, it has
the same number of lines as what you provided, and doesn't repeat
itself; it can be shortened with a simple helper function.  -1

Also, I think it would be confusing.  "as" is currently used in
imports and the with statement, but it was *necessary* to have some
assignment semantic in the with statement - there is no such necessity
in if/elif (or even while, which is the next step after if/elif).

 - Josiah

From greg.ewing at canterbury.ac.nz  Sat Mar  3 12:04:28 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 04 Mar 2007 00:04:28 +1300
Subject: [Python-ideas] if with as
In-Reply-To: <20070303011935.AEBC.JCARLSON@uci.edu>
References: <20070303011935.AEBC.JCARLSON@uci.edu>
Message-ID: <45E9563C.8060002@canterbury.ac.nz>

Josiah Carlson wrote:

> You could convert the code you have offered into the following:
>
>     while pos < text_length:
>         match = name_re.match(text, pos) or \
>                 digit_re.match(text, pos) or \
>                 None
>         if match:
>             pos = match.end()
>             do_something(match)
>         else:
>             pos += 1

I think the idea was that the do_something(match) could be a different
thing for each re.

--
Greg

From armin.ronacher at active-4.com  Sat Mar  3 12:13:14 2007
From: armin.ronacher at active-4.com (Armin Ronacher)
Date: Sat, 3 Mar 2007 11:13:14 +0000 (UTC)
Subject: [Python-ideas] if with as
In-Reply-To:
References: <20070303011935.AEBC.JCARLSON@uci.edu>
	<45E9563C.8060002@canterbury.ac.nz>
Message-ID:

Greg Ewing writes:
>
> I think the idea was that the do_something(match) could
> be a different thing for each re.
>
> --
> Greg
>

Yes, it is.

Regards,
Armin

From talin at acm.org  Sat Mar  3 19:50:46 2007
From: talin at acm.org (Talin)
Date: Sat, 03 Mar 2007 10:50:46 -0800
Subject: [Python-ideas] if with as
In-Reply-To:
References:
Message-ID: <45E9C386.6020303@acm.org>

Armin Ronacher wrote:
> Hi all,
>
> Many languages allow assignment expressions in if conditions (Perl,
> PHP, Ruby, C, etc.).  I know that this was dismissed by Guido because
> it can lead to mistakes, and I can support that.
> However, in some situations it might be required because you have to
> nest if blocks and regular expressions, for example::
>
>     while pos < text_length:
>         if match = name_re.match(text, pos):
>             pos = match.end()
>             do_something(match)
>         elif match = digit_re.match(text, pos):
>             pos = match.end()
>             do_something(match)
>         else:
>             pos += 1
>
> Well.  But that would require an assignment.  Why not use the "as"
> keyword introduced in Python 2.5 with the future import::
>
>     while pos < text_length:
>         if name_re.match(text, pos) as match:
>             pos = match.end()
>             do_something(match)
>         elif digit_re.match(text, pos) as match:
>             pos = match.end()
>             do_something(match)
>         else:
>             pos += 1
>
> Advantages: Still no assignment expression, no additional keyword,
> simple to understand.

Personally, I like it - it's an issue that I've brought up before, but
your syntax is better.  With the introduction of 2.5's "with A as B",
and the new exception-handling syntax in Py3K 'except E as v' (and the
already existing import syntax), it seems to me that we are, in fact,
establishing a general rule that:

    <expression> as <name>:

...is a common syntactical pattern in Python, meaning 'do something
special with expression, and then as a side effect, assign that
expression to the named variable for this block.'

-- Talin

From guido at python.org  Sun Mar  4 00:56:35 2007
From: guido at python.org (Guido van Rossum)
Date: Sat, 3 Mar 2007 15:56:35 -0800
Subject: [Python-ideas] bool keyword: (was [Python-Dev] bool conversion wart?)
In-Reply-To:
References: <43aa6ff70703021627y28c7bfa6gfc0d6df302526cd9@mail.gmail.com>
	<45E8D28A.1020009@ronadam.com> <20070302181642.AEB6.JCARLSON@uci.edu>
	<45E92062.5010306@ronadam.com>
Message-ID:

FWIW, there's zero chance that bool will change in Py3k.  Well, unless
I get hit by a bus before it's released.  It sounds like a typical
bikeshed color discussion, not worth the electrons killed to transmit
the posts.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)

From larry at hastings.org  Mon Mar  5 08:52:07 2007
From: larry at hastings.org (Larry Hastings)
Date: Sun, 04 Mar 2007 23:52:07 -0800
Subject: [Python-ideas] if with as
In-Reply-To: <45E9C386.6020303@acm.org>
References: <45E9C386.6020303@acm.org>
Message-ID: <45EBCC27.7090008@hastings.org>

Talin wrote:
> Armin Ronacher wrote:
>
>>     if name_re.match(text, pos) as match:
>
> Personally, I like it - [...] it seems to me that we are, in fact,
> establishing a general rule that:
>
>     <expression> as <name>:
>
> ...is a common syntactical pattern in Python, meaning 'do something
> special with expression, and then as a side effect, assign that
> expression to the named variable for this block.'

I like it too.  However: unlike "except x as e" and "with a as b", the
indented blocks under if and else don't establish a new scope.  So it
begs the question: what is the lifetime of the variable created by
"if x as y" here?  If it magically goes away, even as you create new
variables in the scope, that just seems a little too weird to me.  If
it outlives the nested block under the if, that's weird too.

(Personally I'd prefer it if the blocks under if and else *did*
establish a new scope, but I know that's never going to change.)
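A quick demonstration of that last point - a name bound inside an if
block simply lives in the enclosing function's scope (Python 2.5; the
names here are purely illustrative):

    def demo(flag):
        if flag:
            result = "bound inside the if block"
        else:
            result = "bound inside the else block"
        return result    # visible here: if/else created no new scope

    print demo(True)     # prints: bound inside the if block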
Cheers,

/larry/

From free.condiments at gmail.com  Mon Mar  5 20:37:06 2007
From: free.condiments at gmail.com (Sam)
Date: Mon, 5 Mar 2007 19:37:06 +0000
Subject: [Python-ideas] if with as
In-Reply-To: <45EBCC27.7090008@hastings.org>
References: <45E9C386.6020303@acm.org> <45EBCC27.7090008@hastings.org>
Message-ID:

(apologies to Larry - the GMail interface is still confusing and
arbitrary)

On 05/03/07, Larry Hastings wrote:
> I like it too.  However: unlike "except x as e" and "with a as b", the
> indented blocks under if and else don't establish a new scope.  So it
> begs the question: what is the lifetime of the variable created by "if x
> as y" here?  If it magically goes away, even as you create new variables
> in the scope, that just seems a little too weird to me.  If it outlives
> the nested block under the if, that's weird too.

I could be using a different definition of 'scope' than you, but:

    >>> try:
    ...     raise Exception("foo")
    ... except Exception, e:
    ...     pass
    ...
    >>> print e
    foo

seems to suggest that except blocks don't set up a new scope to me.

    >>> with file('C:/foo.txt') as f:
    ...     pass
    ...
    >>> print f

says the same thing about the with statement to me.

So the variable defined with an "if <expr> as <name>:" statement
outliving the end of the indented block would be absolutely no
surprise according to the current semantics of other statements.

--Sam

From larry at hastings.org  Mon Mar  5 23:19:40 2007
From: larry at hastings.org (Larry Hastings)
Date: Mon, 05 Mar 2007 14:19:40 -0800
Subject: [Python-ideas] if with as
In-Reply-To:
References: <45E9C386.6020303@acm.org> <45EBCC27.7090008@hastings.org>
Message-ID: <45EC977C.3020808@hastings.org>

Sam wrote:
> except blocks don't set up a new scope to me.
> [...second code example deleted...]
> says the same thing about the with statement to me.

(What version of Python are you using, a 2.6 build?  In 2.5 release, a
file() is not a context manager, so 'with file() as f:' doesn't work.
And in py3k the "except E, N" syntax is gone.)

First off, you are absolutely right: the "with" and "except"
statements do not open new scopes.  I was wrong about that.

But PEP 3110, "Catching Exceptions in Python", describes
"except E as N" this way:

    try:
        try_body
    except E as N:
        except_body
    ...

gets translated to (in Python 2.5 terms)

    try:
        try_body
    except E, N:
        try:
            except_body
        finally:
            N = None
            del N
    ...

    http://www.python.org/dev/peps/pep-3110/

So the intent for "except E as N" is that N does *not* outlive the
"except" block.  And indeed that's what happens in my month-old Py3k.

Also, in your example "with file(...) as f:", outside the "with" block
"f" evaluated to a closed file object.  The file was closed because
"with file(...) as f:" wasn't simply assigning the result of the
"with" expression to f; it assigns to f the result of calling
__enter__() on the with expression's result.  It seems that
post-Python 2.5 file.__enter__() returns the file.  This isn't a bad
thing, but it's a little misleading, as "with A as X" doesn't usually
wind up with X containing the "result" of A.

So I can state: in Py3k, in both cases of "except E as N" and
"with E as N", after the nested block has exited, N does not contain
the direct result of E.

I guess I'm starting to disagree with sprinkling "as" into the syntax
as the de facto inline assignment operator.  Quoting from PEP 343:

    So now the final hurdle was that the PEP 310 syntax:

        with VAR = EXPR:
            BLOCK1

    would be deceptive, since VAR does *not* receive the
    value of EXPR.
    Borrowing from PEP 340, it was an easy step to:

        with EXPR as VAR:
            BLOCK1

    http://www.python.org/dev/peps/pep-0343/

Clearly the original intent with "as" was that it was specifically
*not* direct assignment.  Python added a *new keyword*, specifically
because this was not direct inline assignment.  So I assert a Python
programmer should not see "as" and think "= with the two sides
swapped".

For my final trick, I will attempt to channel Guido.  (Which is
spooky, as he hasn't even Crossed Over To The Other Side.)  Python was
originally written in C, the poster child for inline assignment
operators, so he was obviously familiar with the syntactic construct.
But Python does *not* support inline assignment everywhere.  AFAIK it
specifically has support for it in one place: allowing multiple
assignments at once.  This works:

    a = b = file(...)

But this does not:

    if a = expression:
        # hooray, expression worked
    else:
        # boo, expression failed

Since nobody likes to type, I'm sure people have requested inline
assignment zillions of times before.  Since Python does not support
it, I'm guessing Guido doesn't want it for some reason.  Perhaps it's
to save programmers from the inevitable heartache of typing "if a = x"
when they meant "if a == x" (or vice-versa); if so then that's a
little surprising, considering the "consenting adults" design tenet,
but at least "if E as N" would not have that problem.  If there was
some other reason, then I bet "if E as N" doesn't address it.

Knock three times,

/larry/

From talin at acm.org  Mon Mar  5 23:40:55 2007
From: talin at acm.org (Talin)
Date: Mon, 05 Mar 2007 14:40:55 -0800
Subject: [Python-ideas] Javascript Destructuring Assignment
Message-ID: <45EC9C77.10603@acm.org>

As you probably know, the next version of Javascript (1.7) will have a
number of ideas that have been borrowed from Python.  In particular,
the "tuple assignment" syntax will now be supported in Javascript,
which will be a pleasant addition to the language.

However, I noticed that the Javascript version is, in some respects, a
superset of the Python functionality.

If you are interested, you might have a look at this page:

    http://developer.mozilla.org/en/docs/New_in_JavaScript_1.7

Go to the section called "Destructuring assignment" to check out how
the new syntax is going to work.

As an example of what I mean, the Javascript unpacking syntax will
allow variables to be skipped:

    [a,,b] = [1,2,3]

In other words, a is assigned the value 1, the value 2 is thrown away,
and b is assigned the value 3.  In today's Python, this requires a
dummy variable.

I admit that this is not a particularly important feature; however,
given that Javascript is being inspired by Python in this case, maybe
it would be appropriate to return the favor?

-- Talin

From greg.ewing at canterbury.ac.nz  Tue Mar  6 00:00:33 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 06 Mar 2007 12:00:33 +1300
Subject: [Python-ideas] if with as
In-Reply-To: <45EC977C.3020808@hastings.org>
References: <45E9C386.6020303@acm.org> <45EBCC27.7090008@hastings.org>
	<45EC977C.3020808@hastings.org>
Message-ID: <45ECA111.60708@canterbury.ac.nz>

Larry Hastings wrote:

> So the intent for "except E as N" is that N does *not* outlive the
> "except" block.  And indeed that's what happens in my month-old Py3k.

That's only because the traceback is being attached to the exception,
and there's a desire to avoid creating a cycle.
There are indications that this idea might be dropped, in which case
deleting the exception would no longer be necessary.

In any case, it's still not really introducing a new scope, since if
you use 'e' anywhere else in the function, it's the same variable.

> I'm sure people have requested inline
> assignment zillions of times before.  Since Python does not support it,
> I'm guessing Guido doesn't want it for some reason.

I believe the main objection is that it would reintroduce the
potential for accidentally writing

    if a = b:

instead of

    if a == b:

and having it go undetected.  Using an 'as' clause would be one way of
avoiding that.  Using a different operator, such as

    if a := b:

would be another way.

A further possible objection is that allowing in-line assignments
anywhere in any expression could lead to hard-to-follow code.  That
could be mitigated by only allowing them in certain places, such as
the conditions of if and while statements.

> Clearly the original intent with "as" was that it was
> specifically *not* direct assignment.

Each usage of "as" is unique.  It's whatever it needs to be in each
case.  In general it's not direct assignment, but that doesn't mean
that some of its uses couldn't be direct assignment if we wanted.

--
Greg

From brett at python.org  Tue Mar  6 00:12:58 2007
From: brett at python.org (Brett Cannon)
Date: Mon, 5 Mar 2007 15:12:58 -0800
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <45EC9C77.10603@acm.org>
References: <45EC9C77.10603@acm.org>
Message-ID:

On 3/5/07, Talin wrote:
> As you probably know, the next version of Javascript (1.7) will have a
> number of ideas that have been borrowed from Python.  In particular, the
> "tuple assignment" syntax will now be supported in Javascript, which
> will be a pleasant addition to the language.
>
> However, I noticed that the Javascript version is, in some respects, a
> superset of the Python functionality.
>
> If you are interested, you might have a look at this page:
>
>     http://developer.mozilla.org/en/docs/New_in_JavaScript_1.7
>
> Go to the section called "Destructuring assignment" to check out how the
> new syntax is going to work.
>
> As an example of what I mean, the Javascript unpacking syntax will allow
> variables to be skipped:
>
>     [a,,b] = [1,2,3]
>
> In other words, a is assigned the value 1, the value 2 is thrown away,
> and b is assigned the value 3.  In today's Python, this requires a dummy
> variable.

It skips in the middle?!?  Yuck.

>
> I admit that this is not a particularly important feature; however,
> given that Javascript is being inspired by Python in this case, maybe it
> would be appropriate to return the favor?
>

I personally am -1 on the idea.  Explicit is better than implicit.

-Brett

From greg.ewing at canterbury.ac.nz  Tue Mar  6 00:48:09 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 06 Mar 2007 12:48:09 +1300
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To:
References: <45EC9C77.10603@acm.org>
Message-ID: <45ECAC39.4090908@canterbury.ac.nz>

Brett Cannon wrote:

> I personally am -1 on the idea.  Explicit is better than implicit.

One thing I *would* like to see, that Javascript doesn't seem to have
either yet, is a *-arg on the end of the unpacking list:

    a, b, *c = [1, 2, 3, 4, 5]

giving a == 1, b == 2, c == [3, 4, 5].
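(For comparison, the closest spelling in today's Python is explicit
indexing and slicing:

    seq = [1, 2, 3, 4, 5]
    a, b, c = seq[0], seq[1], seq[2:]   # a == 1, b == 2, c == [3, 4, 5]

)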
--
Greg

From brett at python.org  Tue Mar  6 01:31:09 2007
From: brett at python.org (Brett Cannon)
Date: Mon, 5 Mar 2007 16:31:09 -0800
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <45ECAC39.4090908@canterbury.ac.nz>
References: <45EC9C77.10603@acm.org> <45ECAC39.4090908@canterbury.ac.nz>
Message-ID:

On 3/5/07, Greg Ewing wrote:
> Brett Cannon wrote:
>
> > I personally am -1 on the idea.  Explicit is better than implicit.
>
> One thing I *would* like to see, that Javascript doesn't
> seem to have either yet, is a *-arg on the end of the
> unpacking list:
>
>     a, b, *c = [1, 2, 3, 4, 5]
>
> giving a == 1, b == 2, c == [3, 4, 5].
>

Now I know that was discussed on python-dev once, but I don't remember
why it didn't end up happening.

-Brett

From jason.orendorff at gmail.com  Tue Mar  6 01:55:10 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Mon, 5 Mar 2007 19:55:10 -0500
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To: <45EC9C77.10603@acm.org>
References: <45EC9C77.10603@acm.org>
Message-ID:

On 3/5/07, Talin wrote:
> As you probably know, the next version of Javascript (1.7) will have a
> number of ideas that have been borrowed from Python. [...]

JavaScript 1.7 is the current version.  It shipped with Firefox 2.0.

-j

From jason.orendorff at gmail.com  Tue Mar  6 04:17:22 2007
From: jason.orendorff at gmail.com (Jason Orendorff)
Date: Mon, 5 Mar 2007 22:17:22 -0500
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To:
References: <45EC9C77.10603@acm.org>
Message-ID:

To me, the interesting development on the JavaScript front is Tamarin,
the new JavaScript VM contributed by Adobe.  Tamarin contains a
just-in-time compiler.

I don't know how many people here have seriously looked at the
JavaScript engine, but it's not *that* different from Python.  I know
compiling to machine code has been discussed here, and the consensus
is that it's not worth it.  I'm wondering: how can this be right for
JavaScript and wrong for Python?  Is Mozilla making a mistake?

-j

From jimjjewett at gmail.com  Tue Mar  6 18:47:05 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Tue, 6 Mar 2007 12:47:05 -0500
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To:
References: <45EC9C77.10603@acm.org>
Message-ID:

On 3/5/07, Brett Cannon wrote:
> On 3/5/07, Talin wrote:
> > http://developer.mozilla.org/en/docs/New_in_JavaScript_1.7
...
> > [a,,b] = [1,2,3]
> >
> > In other words, a is assigned the value 1, the value 2 is thrown away,
> > and b is assigned the value 3.  In today's Python, this requires a dummy
> > variable.

> It skips in the middle?!?  Yuck.

Are you assuming variable-length skips, so that

    [a,,b] = [1,2,3,4,5]  would mean  a=1; b=5 ?

I had read it as just not requiring a name for the unused dummy
variable.  It doesn't really seem worse than

    [a, _junk, b] = [1,2,3]

or less explicit than

    [a,_,b] = [1,2,3]

-jJ

From jcarlson at uci.edu  Tue Mar  6 19:25:30 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 06 Mar 2007 10:25:30 -0800
Subject: [Python-ideas] Javascript Destructuring Assignment
In-Reply-To:
References:
Message-ID: <20070306102331.74A3.JCARLSON@uci.edu>

"Jason Orendorff" wrote:
> To me, the interesting development on the JavaScript front is Tamarin,
> the new JavaScript VM contributed by Adobe.  Tamarin contains a
> just-in-time compiler.
>
> I don't know how many people here have seriously looked at the
> JavaScript engine, but it's not *that* different from Python.
I know > compiling to machine code has been discussed here, and the consensus > is that it's not worth it. I'm wondering: how can this be right for > JavaScript and wrong for Python? Is Mozilla making a mistake? For reference: psyco is a JIT compiler for Python, and I believe PyPy has a JIT compiler (or possibly could, as I believe it has a backend for LLVM, which is used to generate object code, as in GCC/G++). - Josiah From brett at python.org Tue Mar 6 20:26:20 2007 From: brett at python.org (Brett Cannon) Date: Tue, 6 Mar 2007 11:26:20 -0800 Subject: [Python-ideas] Javascript Destructuring Assignment In-Reply-To: References: <45EC9C77.10603@acm.org> Message-ID: On 3/6/07, Jim Jewett wrote: > On 3/5/07, Brett Cannon wrote: > > On 3/5/07, Talin wrote: > > > http://developer.mozilla.org/en/docs/New_in_JavaScript_1.7 > ... > > > > [a,,b] = [1,2,3] > > > > > > In other words, a is assigned the value 1, the value 2 is thrown away, > > > and b is assigned the value 3. In today's Python, this requires a dummy > > > variable. > > > It skips in the middle?!? Yuck. > > Are you assuming variable-length skips, so that > > [a,,b] = [1,2,3,4,5] would mean a=1;b=5 ? > Yep, that's how I read it. > I had read it is just not requiring a name for the unused dummy > variable. It doesn't really seem worse than > > [a, _junk, b] = [1,2,3] > > or less explicit than > > [a,_,b] = [1,2,3] Right, but why the ends? And what if that example had three variables to unpack to, e.g., a, b, and c? With a = 1 and c = 5, what does b get? 2, 4, 3? It can be interpreted in so many ways it's ambiguous without referencing the documentation. -Brett From tdelaney at avaya.com Tue Mar 6 23:10:55 2007 From: tdelaney at avaya.com (Delaney, Timothy (Tim)) Date: Wed, 7 Mar 2007 09:10:55 +1100 Subject: [Python-ideas] Javascript Destructuring Assignment Message-ID: <2773CAC687FD5F4689F526998C7E4E5FF1ECED@au3010avexu1.global.avaya.com> Brett Cannon wrote: >> Are you assuming variable-length skips, so that >> >> [a,,b] = [1,2,3,4,5] would mean a=1;b=5 ? >> > > Yep, that's how I read it. I read it differently. I think you need to have a comma for each skipped element: [a,,,b] = [1,2,3,4,5] Anyone with in-depth knowledge, or FireFox 2.0 available to test this? Tim Delaney From greg.ewing at canterbury.ac.nz Tue Mar 6 23:20:36 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 07 Mar 2007 11:20:36 +1300 Subject: [Python-ideas] Javascript Destructuring Assignment In-Reply-To: References: <45EC9C77.10603@acm.org> Message-ID: <45EDE934.9040100@canterbury.ac.nz> Jason Orendorff wrote: > Tamarin contains a just-in-time compiler. > > I'm wondering: how can this be right for > JavaScript and wrong for Python? Javascript is a much simpler language than Python, with far fewer potential data types. Maybe that makes it easier to compile efficiently. Most likely it makes the task of writing a compiler much easier. Or, who knows, maybe it isn't right for Javascript either. Just because someone does something doesn't necessarily mean it's a good idea.
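(Note, added for illustration -- not part of the original message: the psyco JIT mentioned earlier in this thread was normally enabled with just two lines, assuming the extension module is installed.)

import psyco
psyco.full()   # specialize and compile functions as they are first called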
-- Greg From thobes at gmail.com Wed Mar 7 00:17:57 2007 From: thobes at gmail.com (Tobias Ivarsson) Date: Wed, 7 Mar 2007 00:17:57 +0100 Subject: [Python-ideas] Javascript Destructuring Assignment In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5FF1ECED@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5FF1ECED@au3010avexu1.global.avaya.com> Message-ID: <9997d5e60703061517v41bb961fq42bfa659710c18d4@mail.gmail.com> On 3/6/07, Delaney, Timothy (Tim) wrote: > > Brett Cannon wrote: > > >> Are you assuming variable-length skips, so that > >> > >> [a,,b] = [1,2,3,4,5] would mean a=1;b=5 ? > >> > > > > Yep, that's how I read it. > > I read it differently. I think you need to have a comma for each > skipped element: > > [a,,,b] = [1,2,3,4,5] > > Anyone with in-depth knowledge, or FireFox 2.0 available to test this? Firefox 2.0 gives me these results: [a,,b] = [1,2,3,4]; a == 1 b == 3 [a,,,b] = [1,2,3,4]; a == 1 b == 4 [,a,,b] = [1,2,3,4]; a == 2 b == 4 /Tobias Ivarsson Tim Delaney From bwinton at latte.ca Wed Mar 7 01:15:24 2007 From: bwinton at latte.ca (Blake Winton) Date: Tue, 06 Mar 2007 19:15:24 -0500 Subject: [Python-ideas] Javascript Destructuring Assignment In-Reply-To: <2773CAC687FD5F4689F526998C7E4E5FF1ECED@au3010avexu1.global.avaya.com> References: <2773CAC687FD5F4689F526998C7E4E5FF1ECED@au3010avexu1.global.avaya.com> Message-ID: <45EE041C.1020909@latte.ca> (re-sending, because I forgot to subscribe first...) Delaney, Timothy (Tim) wrote: > Brett Cannon wrote: >>> Are you assuming variable-length skips, so that >>> [a,,b] = [1,2,3,4,5] would mean a=1;b=5 ? >> Yep, that's how I read it. > I read it differently. I think you need to have a comma for each > skipped element: > [a,,,b] = [1,2,3,4,5] > Anyone with in-depth knowledge, or FireFox 2.0 available to test this? Oddly enough, I built a Javascript 1.7 interpreter just yesterday... js> [a,b] = [1,2] 1,2 js> a 1 js> b 2 js> [a,b] = [1,2,3] 1,2,3 js> a 1 js> b 2 js> [a,,b] = [1,2,3,4,5] 1,2,3,4,5 js> a 1 js> b 3 js> [a,,,b] = [1,2,3,4,5] 1,2,3,4,5 js> a 1 js> b 4 which is pretty much what I'd expect. Having said that, it does make it easy to mess up in exactly the way you messed up in your example... Also, for the curious: js> [a,[b,c],d] = [1,2,3,4] 1,2,3,4 js> a 1 js> b js> c js> d 3 I'd far prefer it to throw an error if you tried to unpack into a sequence of the wrong length, but that's not the Javascript way, I suppose... Later, Blake. From rrr at ronadam.com Wed Mar 7 07:13:24 2007 From: rrr at ronadam.com (Ron Adam) Date: Wed, 07 Mar 2007 00:13:24 -0600 Subject: [Python-ideas] Javascript Destructuring Assignment In-Reply-To: References: <45EC9C77.10603@acm.org> <45ECAC39.4090908@canterbury.ac.nz> Message-ID: <45EE5804.9080301@ronadam.com> Brett Cannon wrote: > On 3/5/07, Greg Ewing wrote: >> Brett Cannon wrote: >> >>> I personally am -1 on the idea. Explicit is better than implicit. >> One thing I *would* like to see, that Javascript doesn't >> seem to have either yet, is a *-arg on the end of the >> unpacking list: >> >> a, b, *c = [1, 2, 3, 4, 5] >> >> giving a == 1, b == 2, c == [3, 4, 5]. >> > > Now I know that was discussed on python-dev once, but I don't remember > why it didn't end up happening.
(If I recall correctly) There was some support for using the '*' outside of function signatures. I think it died out because of too many alternative suggestions. Or there were some ambiguous situations I'm not remembering, possibly confusion with the multiply operator. I have mixed feelings on it myself. The reason being, (to me), using the '*' for both packing and unpacking is not the most readable solution. Also the '*' syntax can't be used to unpack nested items. >>> data = [1, [2, [3, 4]]] >>> a, (b, (c, d)) = data >>> print a, b, c, d 1 2 3 4 Ron From greg.ewing at canterbury.ac.nz Wed Mar 7 09:34:34 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 07 Mar 2007 21:34:34 +1300 Subject: [Python-ideas] Javascript Destructuring Assignment In-Reply-To: <45EE5804.9080301@ronadam.com> References: <45EC9C77.10603@acm.org> <45ECAC39.4090908@canterbury.ac.nz> <45EE5804.9080301@ronadam.com> Message-ID: <45EE791A.8000307@canterbury.ac.nz> Ron Adam wrote: > I have mixed feelings on it myself. As far as I remember, there weren't really any objective reasons given against it, just gut feelings of dislike from some people, including Guido. I'm living in hopes that he may come round to the idea in time (as he did with conditional expressions, for example). > The reason being, (to me), using the '*' for both packing > and unpacking is not the most readable solution. To me it doesn't seem any less readable than using [...,...] or (...,...) for both packing and unpacking, or using * in both function definitions and calls. In fact the symmetry is a large part of the beauty of the idea. > Also the '*' syntax can't be used to unpack nested items. I'm not sure what you mean by that. >>> [[a, b, *c], [d, e, *f], *g] = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10] >>> print a, b, c, d, e, f, g 1 2 [3, 4] 5 6 [7, 8] [9, 10] Makes perfectly good sense to me. -- Greg From scott+python-ideas at scottdial.com Sun Mar 4 10:23:22 2007 From: scott+python-ideas at scottdial.com (Scott Dial) Date: Sun, 04 Mar 2007 04:23:22 -0500 Subject: [Python-ideas] if with as In-Reply-To: <45E9C386.6020303@acm.org> References: <45E9C386.6020303@acm.org> Message-ID: <45EA900A.7010905@scottdial.com> Talin wrote: > Personally, I like it - it's an issue that I've brought up before, but > your syntax is better. With the introduction of 2.5's "with A as B", and > the new exception-handling syntax in Py3K 'except E as v' (and the > already existing import syntax), it seems to me that we are, in fact, > establishing a general rule that: > > as : > > ...is a common syntactical pattern in Python, meaning 'do something > special with expression, and then as a side effect, assign that > expression to the named variable for this block." While I am not going to advocate it, I would like to point out that these are all just broken versions of an infix assignment operator[1]. As Josiah pointed out, they are used right now in places where explicit assignment is not possible. I don't believe you will ever successfully push such an operator through when it could easily be done explicitly. As for this particular case, it is only useful in a very restricted set of expressions and I was only able to find a handful of cases in stdlib where I could drop in a "if x as y". I believe this is an indication of how rarely one wants to do this. YMMV. -Scott [1] An infix assignment operator would grammatically add to tests the form: test 'as' testlist. This is just a can of worms yielding pure obfuscation.
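(Note, added for illustration -- not part of the original message: Python 3.8 eventually grew assignment expressions, PEP 572, which cover the "if x as y" use case discussed in this thread, spelled ':=' rather than 'as'.)

import re

line = "answer = 42"
if (match := re.match(r"(\w+) = (\d+)", line)) is not None:
    # bind and test in a single expression
    print(match.group(1), match.group(2))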
-- Scott Dial scott at scottdial.com scodial at cs.indiana.edu From g.brandl at gmx.net Wed Mar 7 11:19:59 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 07 Mar 2007 11:19:59 +0100 Subject: [Python-ideas] if with as In-Reply-To: <45EA900A.7010905@scottdial.com> References: <45E9C386.6020303@acm.org> <45EA900A.7010905@scottdial.com> Message-ID: Scott Dial schrieb: > Talin wrote: >> Personally, I like it - it's an issue that I've brought up before, but >> your syntax is better. With the introduction of 2.5's "with A as B", and >> the new exception-handling syntax in Py3K 'except E as v' (and the >> already existing import syntax), it seems to me that we are, in fact, >> establishing a general rule that: >> >> as : >> >> ...is a common syntactical pattern in Python, meaning 'do something >> special with expression, and then as a side effect, assign that >> expression to the named variable for this block." > > While I am not going to advocate it, I would like to point out that > these are all just broken versions of an infix assignment operator[1]. > As Josiah pointed out, they are used right now in places where explicit > assignment is not possible. I don't believe you will ever successfully > push such an operator through when it could easily be done explicitly. The existing uses for "as" are all different. The ones with "except" and "with" do not just assign the left hand expression to the right hand name, but the one with "import" does. > As for this particular case, it is only useful in a very restricted set > of expressions and I was only able to find a handful of cases in stdlib > where I could drop in a "if x as y". I believe this is an indication of > how rarely one wants to do this. YMMV. It does here. I can't present use cases right now, but I do recall several times where I thought this could have made the intent of the code clearer, I think I even proposed it myself some time ago. Georg From jimjjewett at gmail.com Wed Mar 7 16:48:05 2007 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 7 Mar 2007 10:48:05 -0500 Subject: [Python-ideas] while with as [was: if with as] Message-ID: Scott Dial schrieb: > Talin wrote: ... >> already existing import syntax), it seems to me that we are, in fact, >> establishing a general rule that: >> as : >> ...is a common syntactical pattern in Python, meaning 'do something >> special with expression, and then as a side effect, assign that >> expression to the named variable for this block." Rather, assign the result of the do-something. > these are all just broken versions of an infix assignment operator[1]. Not quite, because the precise do-something is implicit in the keywords. With "import", you assign the imported module (or object), rather than the name (string) that was the expression itself. With "with", you assign the result of the enter call, rather than the context object itself. With except, you assign the caught instance, rather than the tuple of classes that caught it. By strict analogy, in while expr as myvar myvar should be the result of calling bool(expr). Fortunately, that is so useless that people would understand the slight change by analogy to "and" and "or", which return the full value rather than the boolean to which it maps. while expr as myvar: # proposal foo(myvar) <==> while (myvar=expr; bool(myvar)): # except that we can't have inline statements foo(myvar) <==> # current python workaround -- though normally we use object-specific # knowlege to keep it from being quite this ugly. 
_indirect=[] def _test_and_set(val): _indirect[0] = val return val while _test_and_set(val): foo(_indirect[0]) > As for this particular case, it is only useful in a very restricted set > of expressions and I was only able to find a handful of cases in stdlib > where I could drop in a "if x as y". I have wanted it, though not nearly so often as I have wanted it for the "while" loop. The workarounds with break/continue or changing to a for-loop always bother me. That said, I'm still not sure it would be worth the cost, because people might start trying to write while (1,2,3) and expecting an implicit iteration; the confusion to beginners *might* outweigh the benefits. -jJ From g.brandl at gmx.net Wed Mar 7 17:20:18 2007 From: g.brandl at gmx.net (Georg Brandl) Date: Wed, 07 Mar 2007 17:20:18 +0100 Subject: [Python-ideas] while with as [was: if with as] In-Reply-To: References: Message-ID: Jim Jewett schrieb: >> As for this particular case, it is only useful in a very restricted set >> of expressions and I was only able to find a handful of cases in stdlib >> where I could drop in a "if x as y". > > I have wanted it, though not nearly so often as I have wanted it for > the "while" loop. > > The workarounds with break/continue or changing to a for-loop always bother me. > > That said, I'm still not sure it would be worth the cost, because > people might start trying to write > > while (1,2,3) > > and expecting an implicit iteration; the confusion to beginners > *might* outweigh the benefits. Hm, why would anyone write that because of the new "as" syntax? Georg From jimjjewett at gmail.com Wed Mar 7 18:09:53 2007 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 7 Mar 2007 12:09:53 -0500 Subject: [Python-ideas] while with as [was: if with as] In-Reply-To: References: Message-ID: On 3/7/07, Georg Brandl wrote: > Jim Jewett schrieb: > > That said, I'm still not sure it would be worth the cost, because > > people might start trying to write > > while (1,2,3) > > and expecting an implicit iteration; the confusion to beginners > > *might* outweigh the benefits. > Hm, why would anyone write that because of the new "as" syntax? It isn't always clear (particularly to a beginner, or someone coming from another programming language) when to use "while" and when to use "for". I've seen plenty of C code with for loops that don't increment a counter -- they could easily be while loops. I imagine getting into it somehow along the following lines; # OK, I want to go through the list. # I need a loop. "while" gives me a loop. while [1, 2, 3] as num: print num # wait, I don't really need the number for this, I just need to get this # stupid thing to loop. Maybe if I take out the "as"? while [1,2,3]: print "got in" Obviously, you can say that the right answer here is a for loop for num in [1, 2, 3]: ... but I'm not sure how hard it would be to explain to a new user. It may be that I'm still thinking pre-file-iterators, and newbies won't have a problem ... but I'm not confident of that. -jJ From rrr at ronadam.com Wed Mar 7 20:37:55 2007 From: rrr at ronadam.com (Ron Adam) Date: Wed, 07 Mar 2007 13:37:55 -0600 Subject: [Python-ideas] Javascript Destructuring Assignment In-Reply-To: <45EE791A.8000307@canterbury.ac.nz> References: <45EC9C77.10603@acm.org> <45ECAC39.4090908@canterbury.ac.nz> <45EE5804.9080301@ronadam.com> <45EE791A.8000307@canterbury.ac.nz> Message-ID: <45EF1493.9090906@ronadam.com> Greg Ewing wrote: > Ron Adam wrote: > >> I have mixed feeling on it myself. 
> > As far as I remember, there weren't really any objective > reasons given against it, just gut feelings of dislike > from some people, including Guido. I'm living in hopes > that he may come round to the idea in time (as he did > with conditional expressions, for example). > > > The reason being, (to me), using the '*' for both packing > > and unpacking is not the most readable solution. > > To me it doesn't seem any less readable than using > [...,...] or (...,...) for both packing and unpacking, > or using * in both function definitions and calls. In > fact the symmetry is a large part of the beauty of > the idea. > >> Also the '*' syntax can't be used to unpack nested items. > > I'm not sure what you mean by that. > > >>> [[a, b, *c], [d, e, *f], *g] = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10] > > >>> print a, b, c, d, e, f, g > 1 2 [3, 4] 5 6 [7, 8] [9, 10] > > Makes perfectly good sense to me. Didn't say it didn't. Symmetry is not always the best solution. Sometimes asymmetry is good because it can communicate a different context more clearly. That is more of a 'human' issue than a machine one. My opinion is in regards to what would be better for me. It's not a right or wrong point of view and may not be better for others. Hmmm... I think there might be an idea related to this of separating formatting from the assignment in such a way that the destination isn't specific to the source structure. (food for thought?) For example: *what if* a sort of string formatting style were used to repack objects at their destination? Where '%%' means repack this object this way at the destination, '[]' is unpack, '*' is pack, and ',' are used to indicate place holders. The example from above could then be... >>> data = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10] >>> a, b, c, d, e, f, g = "[[,,*],[,,*],*]" %% data >>> print a, b, c, d, e, f, g 1 2 [3, 4] 5 6 [7, 8] [9, 10] A chained operation would need to be done in this next example. Works from left to right. >>> data = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10] >>> a, b, c, d = ",,,*" %% "[,[],,]" %% data >>> print a, b, c, d [1, 2, 3, 4] 5 6 [7, 8, 9, 10] Possibly the repack specifiers would be pushed onto a stack then pulled back off to do the actual repacking at assignment time. (?) These are just examples to illustrate a concept. The point here is to allow for the separation of the data structure knowledge from assignments while still being an operation that can happen at the destination. That may avoid creating intermediate objects. This type of abstraction may make it easier to interface different types of objects and data structures dynamically. Of course a function could be made to do this, but it would likely be much much slower. a, b, c = repack(data, repack_specifier) Ron From free.condiments at gmail.com Wed Mar 7 20:59:58 2007 From: free.condiments at gmail.com (Sam) Date: Wed, 7 Mar 2007 19:59:58 +0000 Subject: [Python-ideas] if with as In-Reply-To: <45EA900A.7010905@scottdial.com> References: <45E9C386.6020303@acm.org> <45EA900A.7010905@scottdial.com> Message-ID: On 04/03/07, Scott Dial wrote: > As for this particular case, it is only useful in a very restricted set > of expressions and I was only able to find a handful of cases in stdlib > where I could drop in a "if x as y". I believe this is an indication of > how rarely one wants to do this. YMMV. I have a concrete use case. My hobby MUD server needs to do a lot of parsing of players' input, as you'd expect.
I use pyparsing to construct the grammar, but I then have the slightly hairy problem of dispatching to the correct method. Let's consider a single command, 'target', where the rest of the line can take one of three valid forms: 'set $name to word list here', 'clear $name', and 'list'. Presently, the code looks like this (minus some permission-checking complications and room sanity checks): def targetDistributor(actor, text): try: name, target = target_set_pattern.parseString(text) except ParseException: pass else: targetSet(actor, name.lower(), target) return try: name, = target_clear_pattern.parseString(text) except ParseException: pass else: targetClear(actor, name.lower()) return try: target_list_pattern.parseString(text) except ParseException: pass else: targetList(actor) return badSyntax(actor) Yuck. But, let's rewrite this using as-syntax and two new functions: #for patterns which return no useful results, but just need to match (whose results, when bool-ified, result in False) def matchnoresults(pattern, string): try: pattern.parseString(string) except ParseException: return False return True def matchwithresults(pattern, string): try: res = pattern.parseString(string) except ParseException: return False return res def targetDistributor(actor, rest, info): if matchwithresults(target_set_pattern, rest) as name, target: targetSet(actor, name, target) elif matchwithresults(target_clear_pattern, rest) as name,: targetClear(actor, name) elif matchnoresults(target_list_pattern, rest): targetList(actor) else: badSyntax(actor) I do think that the majority of use cases will be parsing, or in similar cases where one needs to both test for success and obtain results from the test. From jcarlson at uci.edu Wed Mar 7 22:43:17 2007 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 07 Mar 2007 13:43:17 -0800 Subject: [Python-ideas] if with as In-Reply-To: References: <45EA900A.7010905@scottdial.com> Message-ID: <20070307132522.74B6.JCARLSON@uci.edu> Sam wrote: > On 04/03/07, Scott Dial wrote: > > As for this particular case, it is only useful in a very restricted set > > of expressions and I was only able to find a handful of cases in stdlib > > where I could drop in a "if x as y". I believe this is an indication of > > how rarely one wants to do this. YMMV. [snip] > I do think that the majority of use cases will be parsing, or in > similar cases where one needs to both test for success and obtain > results from the test. I'm going to agree with Scott Dial on this. You aren't going to need it often enough to warrant this syntax change. Coupled with the potential confusion that Jim Jewett pointed out, and this particular suggestion smells like a misfeature. Also, with the nonlocal declaration that is going to be making it into Python 3 (or was it 2.6?), you can use a closure without a list to do the same thing. def foo(): result = None def do_something(...): nonlocal result ... result = 8 return True if do_something(...): #do something with result ... Then again, closures, assignments in while/if, etc., all smell to me like a way of getting the features of object semantics, without actually using objects. Take your final example and use an object instead. 
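(Note, added for illustration -- not part of the original message: one minimal way to define the Matcher helper used in the example below, since the message leaves its definition to the reader; it assumes pyparsing's ParseException, as in Sam's code.)

from pyparsing import ParseException

class Matcher(object):
    def __init__(self):
        self.result = None
    def match(self, pattern, text):
        # try the pattern; stash the parse results and report success
        try:
            self.result = pattern.parseString(text)
        except ParseException:
            self.result = None
            return False
        return True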
def targetDistributor(actor, rest, info): matcher = Matcher() if matcher.match(target_set_pattern, rest): name, target = matcher.result targetSet(actor, name, target) elif matcher.match(target_clear_pattern, rest): name, = matcher.result targetClear(actor, name) elif matcher.match(target_list_pattern, rest): targetList(actor) else: badSyntax(actor) You know what? That works *today* AND is clearer and more concise than your original code. Even better, it doesn't require a language syntax change. Try using an object for this. You may find that it can do everything that the assignment thing could do. - Josiah From jimjjewett at gmail.com Wed Mar 7 23:14:06 2007 From: jimjjewett at gmail.com (Jim Jewett) Date: Wed, 7 Mar 2007 17:14:06 -0500 Subject: [Python-ideas] Javascript Destructuring Assignment In-Reply-To: <45EF1493.9090906@ronadam.com> References: <45EC9C77.10603@acm.org> <45ECAC39.4090908@canterbury.ac.nz> <45EE5804.9080301@ronadam.com> <45EE791A.8000307@canterbury.ac.nz> <45EF1493.9090906@ronadam.com> Message-ID: On 3/7/07, Ron Adam wrote: > The example from above could then be... > > >>> data = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10] > >>> a, b, c, d, e, f, g = "[[,,*],[,,*],*]" %% data > > >>> print a, b, c, d, e, f, g > 1 2 [3, 4] 5 6 [7, 8] [9, 10] ... > >>> data = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10] > >>> a, b, c, d = ",,,*" %% "[,[],,]" %% data > >>> print a, b, c, d > [1, 2, 3, 4] 5 6 [7, 8, 9, 10] ... > This type of abstraction may make it easier to interface different types of > objects and data structures dynamically. I can see how this might be made efficient. I'm not seeing how I could ever maintain code that used it. -jJ From greg.ewing at canterbury.ac.nz Thu Mar 8 00:08:18 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 08 Mar 2007 12:08:18 +1300 Subject: [Python-ideas] Javascript Destructuring Assignment In-Reply-To: <45EF1493.9090906@ronadam.com> References: <45EC9C77.10603@acm.org> <45ECAC39.4090908@canterbury.ac.nz> <45EE5804.9080301@ronadam.com> <45EE791A.8000307@canterbury.ac.nz> <45EF1493.9090906@ronadam.com> Message-ID: <45EF45E2.6010308@canterbury.ac.nz> Ron Adam wrote: > Greg Ewing wrote: > >> Ron Adam wrote: >> >>> Also the '*' syntax can't be used to unpack nested items. >> >> Makes perfectly good sense to me. > > Didn't say it didn't. Then I still don't know what you meant by your original comment. What kind of nested item unpacking would you like to do that the * syntax wouldn't handle? > Symmetry is not always the best solution. Sometimes asymmetry is good > because it can communicate a different context more clearly. Perhaps, but why do you think that the symmetry we already have between packing and unpacking is okay, but a new symmetry involving * wouldn't be okay? -- Greg From rrr at ronadam.com Thu Mar 8 05:23:08 2007 From: rrr at ronadam.com (Ron Adam) Date: Wed, 07 Mar 2007 22:23:08 -0600 Subject: [Python-ideas] Javascript Destructuring Assignment In-Reply-To: References: <45EC9C77.10603@acm.org> <45ECAC39.4090908@canterbury.ac.nz> <45EE5804.9080301@ronadam.com> <45EE791A.8000307@canterbury.ac.nz> <45EF1493.9090906@ronadam.com> Message-ID: <45EF8FAC.2080306@ronadam.com> Jim Jewett wrote: > On 3/7/07, Ron Adam wrote: >> The example from above could then be... >> >> >>> data = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10] >> >>> a, b, c, d, e, f, g = "[[,,*],[,,*],*]" %% data >> >> >>> print a, b, c, d, e, f, g >> 1 2 [3, 4] 5 6 [7, 8] [9, 10] > > ... 
> >> >>> data = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10] > >>> a, b, c, d = ",,,*" %% "[,[],,]" %% data > >>> print a, b, c, d > [1, 2, 3, 4] 5 6 [7, 8, 9, 10] > > ... > >> This type of abstraction may make it easier to interface different >> types of >> objects and data structures dynamically. > > I can see how this might be made efficient. > > I'm not seeing how I could ever maintain code that used it. Yes, that would be a problem. (And I need to resist posting things like this because then I feel obligated to try to explain them. ie... wait 24 hours and if it still seems like a good idea, then post it.) Regarding maintaining code: For interfacing purposes you would probably define the re-packers with the data and not where you actually use them. And use in-line comments as well. Trying a different format. (This can't be a serious proposal unless it can be made simple to understand.) '[]' for unpack items, these can be nested. '.' a single unchanged item '.n' for next n unchanged items '*n' for packing next n items '*' pack the rest of the items, use at the end only. '' Null, don't change anything * Doing it in two steps is the easy way, unpack+repack. * Both unpacking/repacking can be done in one step if the items are aligned. * partial unpacking/repacking is perfectly ok. (if it does what you want) data_source_v1 = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10] unpack_v1 = "[[][]..]" repack_to_v2 = "..*2..*2*" # 1,2,[3,4],5,6,[7,8],[9,10] data_source_v2 = [1, 2, [3, 4], 5, 6, [7, 8], [9, 10]] unpack_v2 = "[..[]..[][]]" repack_to_v1 = "*4*4.." # [1,2,3,4],[5,6,7,8],9,10 def interface_v1(data, unpack='', repack=''): a, b, c, d = repack %% unpack %% data # Do something with a thru d. def interface_v2(data, unpack='', repack=''): a, b, c, d, e, f, g = repack %% unpack %% data # do something with a thru g # use data sources with interface v1 interface_v1(data_source_v1) # use v1 data as is interface_v1(data_source_v2, unpack_v2, repack_to_v1) # use data sources with interface v2 interface_v2(data_source_v2) # use v2 data as is interface_v2(data_source_v1, unpack_v1, repack_to_v2) Decorators probably currently fit this use better I think. Cheers, Ron From jcarlson at uci.edu Thu Mar 8 06:52:42 2007 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 07 Mar 2007 21:52:42 -0800 Subject: [Python-ideas] Javascript Destructuring Assignment In-Reply-To: <45EF8FAC.2080306@ronadam.com> References: <45EF8FAC.2080306@ronadam.com> Message-ID: <20070307215028.74D1.JCARLSON@uci.edu> Ron Adam wrote: [snip] > Trying a different format. (This can't be a serious proposal unless it > can be made simple to understand.) > > '[]' for unpack items, these can be nested. > '.' a single unchanged item > '.n' for next n unchanged items > '*n' for packing next n items > '*' pack the rest of the items, use at the end only. > '' Null, don't change anything Anything with a . or * is going to be *very* confusing to write and maintain.
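(Note, added for illustration -- not part of the original message: the v1-to-v2 reshaping from Ron's example written out as plain functions; this is roughly the explicit alternative being argued for here.)

def unpack_v1(data):
    # "[[][]..]": flatten the two leading sublists, keep the trailing items
    (a, b, c, d), (e, f, g, h), i, j = data
    return [a, b, c, d, e, f, g, h, i, j]

def repack_to_v2(flat):
    # "..*2..*2*": 1, 2, [3, 4], 5, 6, [7, 8], [the rest]
    a, b, c, d, e, f, g, h = flat[:8]
    return [a, b, [c, d], e, f, [g, h], list(flat[8:])]

v1 = [[1, 2, 3, 4], [5, 6, 7, 8], 9, 10]
assert repack_to_v2(unpack_v1(v1)) == [1, 2, [3, 4], 5, 6, [7, 8], [9, 10]]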
- Josiah From rrr at ronadam.com Thu Mar 8 06:54:10 2007 From: rrr at ronadam.com (Ron Adam) Date: Wed, 07 Mar 2007 23:54:10 -0600 Subject: [Python-ideas] Javascript Destructuring Assignment In-Reply-To: <45EF45E2.6010308@canterbury.ac.nz> References: <45EC9C77.10603@acm.org> <45ECAC39.4090908@canterbury.ac.nz> <45EE5804.9080301@ronadam.com> <45EE791A.8000307@canterbury.ac.nz> <45EF1493.9090906@ronadam.com> <45EF45E2.6010308@canterbury.ac.nz> Message-ID: <45EFA502.6070305@ronadam.com> Greg Ewing wrote: > Ron Adam wrote: >> Greg Ewing wrote: >> >>> Ron Adam wrote: >>> >>>> Also the '*' syntax can't be used to unpack nested items. >>> >>> Makes perfectly good sense to me. >> >> Didn't say it didn't. > > Then I still don't know what you meant by your original > comment. What kind of nested item unpacking would you > like to do that the * syntax wouldn't handle? This is what I was originally referring to. data = [1, 2, [3, 4], 5] a, b, c, d, e = data # can't unpack [3,4] here with * a, b, [c, d], e = data # must use [] on the left side instead >> Symmetry is not always the best solution. Sometimes asymmetry is good >> because it can communicate a different context more clearly. > > Perhaps, but why do you think that the symmetry we already > have between packing and unpacking is okay, but a new > symmetry involving * wouldn't be okay? The * is used mostly with names, and they tend to look more alike than the equivalent packing or unpacking using () or [] in more situations. The () and [] are much more explicit in both packing and unpacking operations. I like the packing and unpacking features of Python very much, it's just I would like it a bit more if the '*' symbol for packing and unpacking were different in this particular case. And especially so if '*' packing and unpacking features are used in a more general way. def foo(*args, **kwds): # packs here bar(&args, &&kwds): # unpacks here a, b, *c = a, b, c, d # packs here a, b, c = &items # unpacks here It would just be easier for me to see and keep in my head while I'm working with it. I'm not sure I can explain it better than that, or say exactly why, as it probably has more to do with how my brain works than whether it is okay or not okay in any programming sense. Is that clearer? (I don't expect this to be changed) Ron From greg.ewing at canterbury.ac.nz Sat Mar 10 08:53:28 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 10 Mar 2007 20:53:28 +1300 Subject: [Python-ideas] Syntax to help with currying In-Reply-To: References: <20070228205212.GD5537@performancedrivers.com> <20070310001054.GN18179@performancedrivers.com> Message-ID: <45F263F8.6000004@canterbury.ac.nz> Brett Cannon wrote: > def register(cls, account_id, form_id): > def inner(to_decorate): > ... > return inner Thinking the other day about how this pattern arises quite frequently when doing currying like this, and wondering what sort of syntax might make it easier, I came up with def f(): return g(): ... -- Greg From talin at acm.org Sun Mar 11 08:51:11 2007 From: talin at acm.org (Talin) Date: Sat, 10 Mar 2007 23:51:11 -0800 Subject: [Python-ideas] Mutable chars objects Message-ID: <45F3B4EF.1010103@acm.org> Something that I have wanted in Python for a long time is something like the Java StringBuffer class - a mutable buffer, with string-like methods, that holds characters instead of bytes. I do a lot of stuff with parsing, and it's often convenient to build up long strings of text one character at a time.
Doing this with strings in Python is obviously not the way to go, since each time you append a character you have to construct a new string object. Doing it with lists is better, except that you still have to pay the overhead of the dynamic typing information for each character. Also, unlike a list or an array, you'd ideally want something that has string-like methods, such as toupper() and so on. Calling str( buffer ) should create a string of the contents of the buffer, not generate a repr() of the object which is what would happen if you call str() on a list or array. Passing this buffer to 'print' should also just print the characters. Similarly, you ought to be able to do comparisons between the mutable buffer and a real string; slices of the buffer should be strings, not lists, and so on. In other words - it ought to act pretty much like STL strings. Also, the class ought to be optimized for single-character appending; it should be smart enough to grow memory in the right-sized chunks. And no, there's no particular reason why the memory needs to be contiguous, although it could be. Originally, I had thought that such a class might be called 'characters' (to correspond with 'bytes' in Python 3000), but it could just as easily be called strbuffer or something else. -- Talin From larry at hastings.org Sun Mar 11 09:56:44 2007 From: larry at hastings.org (Larry Hastings) Date: Sun, 11 Mar 2007 00:56:44 -0800 Subject: [Python-ideas] Javascript Destructuring Assignment In-Reply-To: References: <45EC9C77.10603@acm.org> <45ECAC39.4090908@canterbury.ac.nz> Message-ID: <45F3C44C.4020401@hastings.org> Brett Cannon wrote: > On 3/5/07, Greg Ewing wrote: > >> One thing I *would* like to see, that Javascript doesn't >> seem to have either yet, is a *-arg on the end of the >> unpacking list: >> a, b, *c = [1, 2, 3, 4, 5] >> giving a == 1, b == 2, c == [3, 4, 5]. > Now I know that was discussed on python-dev once, but I don't remember why it didn't end up happening. Surely you haven't forgotten your own recipe? http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/163968 That was also published in the Python Cookbook (2nd Edition) as recipe 19.4: "Unpacking a Few Items in a Multiple Assignment". As mentioned in the recipe, the original discussion was here: http://mail.python.org/pipermail/python-dev/2002-November/030380.html The last message posted on the thread was from GvR, who was against it: http://mail.python.org/pipermail/python-dev/2002-November/030437.html Cheers, /larry/ From jcarlson at uci.edu Sun Mar 11 11:31:48 2007 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 11 Mar 2007 02:31:48 -0800 Subject: [Python-ideas] Mutable chars objects In-Reply-To: <45F3B4EF.1010103@acm.org> References: <45F3B4EF.1010103@acm.org> Message-ID: <20070311022621.751C.JCARLSON@uci.edu> Talin wrote: > Something that I have wanted in Python for a long time is something like > the Java StringBuffer class - a mutable buffer, with string-like > methods, that holds characters instead of bytes. 8-bit ASCII characters, or compile-time specified unicode characters (16 or 32 bit)? If all you wanted was mutable characters, array.array('c') would do; it's smart about appending. The lack of string methods kind of kills it though. One of the reasons I was pushing for string views oh, about 7 months ago was for very similar reasons; it would be *really* nice to be able to add string methods to anything that provided the buffer interface.
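(Note, added for illustration -- not part of the original message: a rough sketch of the kind of buffer both messages point at, array.array('c') wrapped with a few string-flavored methods; Python 2 spellings such as tostring() assumed.)

from array import array

class CharBuffer(object):
    def __init__(self, text=''):
        self._buf = array('c', text)   # 'c' = one byte per character
    def append(self, ch):
        self._buf.append(ch)           # amortized O(1), no new string objects
    def upper(self):
        return self._buf.tostring().upper()
    def __len__(self):
        return len(self._buf)
    def __str__(self):
        return self._buf.tostring()    # str(buf) yields the contents

buf = CharBuffer()
for ch in "hello":
    buf.append(ch)
print buf.upper()   # HELLO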
Nevermind that if it offered a multi-byte buffer view (like the extended buffer interface that will be coming in Py3k), you could treat arbitrary data as if it were strings - an array of 16 bit ints would be the same as 8 bit ints, the same as 8 bit characters, the same as 32 bit ints, etc. I guess I was 7 months too early in my proposal. - Josiah From brett at python.org Sun Mar 11 19:51:59 2007 From: brett at python.org (Brett Cannon) Date: Sun, 11 Mar 2007 10:51:59 -0800 Subject: [Python-ideas] Javascript Destructuring Assignment In-Reply-To: <45F3C44C.4020401@hastings.org> References: <45EC9C77.10603@acm.org> <45ECAC39.4090908@canterbury.ac.nz> <45F3C44C.4020401@hastings.org> Message-ID: On 3/11/07, Larry Hastings wrote: > Brett Cannon wrote: > > On 3/5/07, Greg Ewing wrote: > > > >> One thing I *would* like to see, that Javascript doesn't > >> seem to have either yet, is a *-arg on the end of the > >> unpacking list: > >> a, b, *c = [1, 2, 3, 4, 5] > >> giving a == 1, b == 2, c == [3, 4, 5]. > > Now I know that was discussed on python-dev once, but I don't remember why didn't end up happening. > Surely you haven't forgotten your own recipe? > http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/163968 > That was also published in the Python Cookbook (2nd Edition) as recipe > 19.4: "Unpacking a Few Items in a Multiple Assignment". I remember writing the code, but I totally forgot I submitted it (and it got published) as a recipe. =) -Brett From jason.orendorff at gmail.com Sun Mar 11 21:53:17 2007 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Sun, 11 Mar 2007 16:53:17 -0400 Subject: [Python-ideas] Mutable chars objects In-Reply-To: <45F3B4EF.1010103@acm.org> References: <45F3B4EF.1010103@acm.org> Message-ID: On 3/11/07, Talin wrote: > I do a lot of stuff with parsing, and its often convenient to build up > long strings of text one character at a time. Could you be more specific about this? When I write a parser it always starts with either _token_re = re.compile(r'''(?x) ...15 lines omitted... ''') or import yapps2 # wheeee! I've never had much luck hand-coding a lexer in straight-up Python. Not only is it slow, I feel like Python's syntax is working against me-- no switch statement, no do-while. (This is not a complaint! It's all right. I should be using a parser-generator anyway.) Josiah mentioned array.array('c'). There's also array.array('u'), which is an array of Py_UNICODEs. You can add the string methods in a subclass for a quick prototype. -j From jcarlson at uci.edu Sun Mar 11 23:47:02 2007 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 11 Mar 2007 15:47:02 -0700 Subject: [Python-ideas] Mutable chars objects In-Reply-To: References: <45F3B4EF.1010103@acm.org> Message-ID: <20070311154251.FB62.JCARLSON@uci.edu> "Jason Orendorff" wrote: > > On 3/11/07, Talin wrote: > > I do a lot of stuff with parsing, and its often convenient to build up > > long strings of text one character at a time. > > Could you be more specific about this? When I write a parser it > always starts with either I don't believe he's talking about parsing in the language lexer sense, I believe he is talking about perhaps url parsing (breaking it down into its component parts), "unmarshaling" (think pickle, marshal, etc.), or possibly even configuration files. > Josiah mentioned array.array('c'). There's also array.array('u'), > which is an array of Py_UNICODEs. You can add the string methods in a > subclass for a quick prototype. Kind-of, but it's horribly slow. 
The point of string views that I mentioned is that you get all of the benefits of the underlying C implementation; from speed to "it's already been implemented and debugged". - Josiah From collinw at gmail.com Mon Mar 12 05:59:53 2007 From: collinw at gmail.com (Collin Winter) Date: Sun, 11 Mar 2007 23:59:53 -0500 Subject: [Python-ideas] [Python-Dev] datetime module enhancements In-Reply-To: References: <43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com> <43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com> Message-ID: <43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com> [Moving to python-ideas...] On 3/9/07, Steven Bethard wrote on python-dev: > On 3/9/07, Collin Winter wrote on python-dev: > > One solution that just occurred to me -- and that skirts the issue of > > choosing an interpretation -- is that, when comparing date and > > datetime objects, the datetime's .date() method is called and the > > result of that call is compared to the original date. That is, > > > > datetime_obj < date_obj > > > > is implicitly equivalent to > > > > datetime_obj.date() < date_obj > > Using the .date() is fine when the year/month/day doesn't match. So > the following are fine:: > datetime.datetime(2005, 1, 1, 0, 0, 0) < datetime.date(2006, 1, 1) > datetime.datetime(2007, 1, 1, 0, 0, 0) > datetime.date(2006, 1, 1) > It's *not* okay to say that a date() is less than, greater than or > equal to a datetime() if the year/month/day *does* match. The correct > temporal relation is During, but Python doesn't have a During > operator. During is not the same as less-than, greater-than or > equal-to, so all of these should be False:: > datetime.datetime(2006, 1, 1, 0, 0, 0) < datetime.date(2006, 1, 1) > datetime.datetime(2006, 1, 1, 0, 0, 0) > datetime.date(2006, 1, 1) > datetime.datetime(2006, 1, 1, 0, 0, 0) == datetime.date(2006, 1, 1) > That is, the datetime() is not less than, greater than or equal to the > corresponding date(). > > Some discussion of these kinds of issues is here: > http://citeseer.ist.psu.edu/allen94actions.html > The essence is that in order to properly compare intervals, you need > the Meets, Overlaps, Starts, During and Finishes operators in addition > to the Before (<) and Simulaneous (=) operators. > > So, let's not conflate Before, After or Simultaneous with the other > relations -- if it's not strictly Before (<), After (>) or > Simultaneous (=), we can just say so by returning False. It might be neat to add a __contains__ method to date() objects so that "datetime(2007, 1, 1, 0, 0, 0) in date(2007, 1, 1)" would be True. This would seem to fulfill the During operator. Collin Winter From lists at cheimes.de Mon Mar 12 14:07:58 2007 From: lists at cheimes.de (Christian Heimes) Date: Mon, 12 Mar 2007 14:07:58 +0100 Subject: [Python-ideas] [Python-Dev] datetime module enhancements In-Reply-To: <43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com> References: <43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com> <43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com> <43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com> Message-ID: Collin Winter wrote: > It might be neat to add a __contains__ method to date() objects so > that "datetime(2007, 1, 1, 0, 0, 0) in date(2007, 1, 1)" would be > True. This would seem to fulfill the During operator. 
Yeah, I had the same idea: http://permalink.gmane.org/gmane.comp.python.devel/86828 In my opinion the comparison operations for date and datetime objects should be defined as: datetime before date: < ----------------------- >>> datetime(2007, 1, 1, 23, 59, 59) < date(2007, 1, 2) True Valid for all datetimes before (smaller than) datetime(2007, 1, 2, 0, 0, 0) datetime during date: in ------------------------ >>> datetime(2007, 1, 2, 23, 59, 59) in date(2007, 1, 2) True Valid for all datetime(2007, 1, 2, *, *, *) values whose .date() == date(2007, 1, 2) datetime after date: > ---------------------- >>> datetime(2007, 1, 3, 0, 0, 0) > date(2007, 1, 2) True Valid for all datetimes after 2007-01-02 (greater than or equal to 2007-01-03 00:00:00) datetime <= date and datetime >= date should raise a TypeError. The result is ambiguous. A date with time is never equal to a date. Christian From jimjjewett at gmail.com Mon Mar 12 16:43:18 2007 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 12 Mar 2007 11:43:18 -0400 Subject: [Python-ideas] [Python-Dev] datetime module enhancements In-Reply-To: References: <43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com> <43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com> <43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com> Message-ID: On 3/12/07, Christian Heimes wrote: > datetime before date: < > datetime during date: in > datetime after date: > > datetime <= date and datetime >= date should raise a TypeError. The > result is ambiguous. A date with time is never equal to a date. As a practical matter, sorting uses <=. I would be somewhat annoyed if dt < d were True, but when I tried to sort them, dt <= d raised a TypeError. It would be OK with me to raise an error (ValueError?) on "dt <= d" if "dt in d". Others might feel differently, and prefer to always get the TypeError -- which is probably why dt < d raises one today; that seems the obvious thing. -jJ From collinw at gmail.com (Collin Winter) Subject: [Python-ideas] [Python-Dev] datetime module enhancements In-Reply-To: References: <43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com> <43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com> <43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com> Message-ID: <43aa6ff70703120852y773d11edr690a39b8dc2c2ce@mail.gmail.com> On 3/12/07, Steven Bethard wrote: > [I'm not on this list, so please keep me in the CC if you reply] > > On 3/11/07, Collin Winter wrote: > > On 3/9/07, Steven Bethard wrote on python-dev: > > > It's *not* okay to say that a date() is less than, greater than or > > > equal to a datetime() if the year/month/day *does* match. The correct > > > temporal relation is During, but Python doesn't have a During > > > operator. During is not the same as less-than, greater-than or > > > equal-to, so all of these should be False:: > > > datetime.datetime(2006, 1, 1, 0, 0, 0) < datetime.date(2006, 1, 1) > > > datetime.datetime(2006, 1, 1, 0, 0, 0) > datetime.date(2006, 1, 1) > > > datetime.datetime(2006, 1, 1, 0, 0, 0) == datetime.date(2006, 1, 1) > > > That is, the datetime() is not less than, greater than or equal to the > > > corresponding date(). > > > > > > Some discussion of these kinds of issues is here: > > > http://citeseer.ist.psu.edu/allen94actions.html > > > The essence is that in order to properly compare intervals, you need > > > the Meets, Overlaps, Starts, During and Finishes operators in addition > > > to the Before (<) and Simulaneous (=) operators. > > > > > > So, let's not conflate Before, After or Simultaneous with the other > > > relations -- if it's not strictly Before (<), After (>) or > > > Simultaneous (=), we can just say so by returning False.
> > > > It might be neat to add a __contains__ method to date() objects so > > that "datetime(2007, 1, 1, 0, 0, 0) in date(2007, 1, 1)" would be > > True. This would seem to fulfill the During operator. > > That's a nice idea. With the simplest implementation, you could then > guarantee that one of the following would always be true:: > > datetime < date > datetime in date > datetime > date > > (That would actually conflate the Starts, Finishes and During > relations in the __contains__ operator, but I think that's a perfectly > reasonable interpretation, and I know it would be useful in my code at > least.) I'll work up a patch. Collin Winter From lists at cheimes.de Mon Mar 12 17:08:04 2007 From: lists at cheimes.de (Christian Heimes) Date: Mon, 12 Mar 2007 17:08:04 +0100 Subject: [Python-ideas] [Python-Dev] datetime module enhancements In-Reply-To: <43aa6ff70703120852y773d11edr690a39b8dc2c2ce@mail.gmail.com> References: <43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com> <43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com> <43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com> <43aa6ff70703120852y773d11edr690a39b8dc2c2ce@mail.gmail.com> Message-ID: Collin Winter wrote: > I'll work up a patch. I'm already working on a patch, too. Christian From steven.bethard at gmail.com Mon Mar 12 18:42:21 2007 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon, 12 Mar 2007 11:42:21 -0600 Subject: [Python-ideas] datetime module enhancements Message-ID: On 3/12/07, Christian Heimes wrote: > datetime before date: < > datetime during date: in > datetime after date: > > datetime <= date and datetime => date should raise a TypeError. The > result is ambiguous. A date with time is never equal to a date. Jim Jewett wrote: > As a practical matter, sorting uses <= Are you sure? >>> class C(object): ... def __init__(self, x): ... self.x = x ... def __lt__(self, other): ... return self.x < other.x ... def __le__(self, other): ... raise TypeError() ... def __repr__(self): ... return 'C(%r)' % self.x ... >>> sorted([C(3), C(2), C(1)]) [C(1), C(2), C(3)] Looks like it's just using < to me. (And thanks to Collin and Christian for jumping on the patch already.) STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy From lists at cheimes.de Mon Mar 12 18:56:56 2007 From: lists at cheimes.de (Christian Heimes) Date: Mon, 12 Mar 2007 18:56:56 +0100 Subject: [Python-ideas] datetime module enhancements In-Reply-To: References: Message-ID: Steven Bethard schrieb: >> datetime <= date and datetime => date should raise a TypeError. The >> result is ambiguous. A date with time is never equal to a date. > > Jim Jewett wrote: >> As a practical matter, sorting uses <= > > Are you sure? It depends what tp slot is defined. The C api supports either comparison (cmp style -1, 0, 1) or rich comparison (func(self, other, operator)). Christian From steven.bethard at gmail.com Mon Mar 12 19:06:15 2007 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon, 12 Mar 2007 12:06:15 -0600 Subject: [Python-ideas] datetime module enhancements In-Reply-To: References: Message-ID: On 3/12/07, Christian Heimes wrote: > Steven Bethard schrieb: > >> datetime <= date and datetime => date should raise a TypeError. The > >> result is ambiguous. A date with time is never equal to a date. > > > > Jim Jewett wrote: > >> As a practical matter, sorting uses <= > > > > Are you sure? > > It depends what tp slot is defined. 
The C api supports either comparison > (cmp style -1, 0, 1) or rich comparison (func(self, other, operator)). Fair enough. My only point was that as long as __lt__ is defined, __le__ can throw a TypeError() and it won't break sorted(). STeVe -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy From collinw at gmail.com Mon Mar 12 20:01:55 2007 From: collinw at gmail.com (Collin Winter) Date: Mon, 12 Mar 2007 14:01:55 -0500 Subject: [Python-ideas] [Python-Dev] datetime module enhancements In-Reply-To: <43aa6ff70703120852y773d11edr690a39b8dc2c2ce@mail.gmail.com> References: <43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com> <43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com> <43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com> <43aa6ff70703120852y773d11edr690a39b8dc2c2ce@mail.gmail.com> Message-ID: <43aa6ff70703121201q27594bedwebbc4a2078c96f2d@mail.gmail.com> On 3/12/07, Collin Winter wrote: > On 3/12/07, Steven Bethard wrote: > > [I'm not on this list, so please keep me in the CC if you reply] > > [snip] > > That's a nice idea. With the simplest implementation, you could then > > guarantee that one of the following would always be true:: > > > > datetime < date > > datetime in date > > datetime > date > > > > (That would actually conflate the Starts, Finishes and During > > relations in the __contains__ operator, but I think that's a perfectly > > reasonable interpretation, and I know it would be useful in my code at > > least.) > > I'll work up a patch. Posted as #1679204 (http://python.org/sf/1679204). In addition to date.__contains__, I had to add a datetime.__contains__ that throws a TypeError (since datetime inherits from date). While writing the patch, I had the idea of making "time in date" always return True, but I'm not sure that would be useful. Collin Winter From steven.bethard at gmail.com Mon Mar 12 20:07:19 2007 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon, 12 Mar 2007 13:07:19 -0600 Subject: [Python-ideas] [Python-Dev] datetime module enhancements In-Reply-To: <43aa6ff70703121201q27594bedwebbc4a2078c96f2d@mail.gmail.com> References: <43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com> <43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com> <43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com> <43aa6ff70703120852y773d11edr690a39b8dc2c2ce@mail.gmail.com> <43aa6ff70703121201q27594bedwebbc4a2078c96f2d@mail.gmail.com> Message-ID: On 3/12/07, Collin Winter wrote: > Posted as #1679204 (http://python.org/sf/1679204). In addition to > date.__contains__, I had to add a datetime.__contains__ that throws a > TypeError (since datetime inherits from date). Very cool. Thanks! > While writing the patch, I had the idea of making "time in date" > always return True, but I'm not sure that would be useful. Yeah, I'd hold off on that one until someone indicates that they need it. Seems like there would be a number of places where True might be wrong for a particular bit of code. Steve -- I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. 
--- Bucky Katt, Get Fuzzy From greg.ewing at canterbury.ac.nz Mon Mar 12 23:35:32 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Tue, 13 Mar 2007 11:35:32 +1300 Subject: [Python-ideas] [Python-Dev] datetime module enhancements In-Reply-To: References: <43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com> <43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com> <43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com> Message-ID: <45F5D5B4.3060607@canterbury.ac.nz> Jim Jewett wrote: > It would be OK with me to raise an error (ValueError?) on "dt <= d" if > "dt in d". Others might feel differently, and prefer to always get > the TypeError -- which is probably why dt < d raises one today; that seems the > obvious thing. Seems to me that given all the conflicting behaviours people want this to have in different circumstances, refusing to compare dates and datetimes is the right thing to do. All the use cases I've seen here for comparing them are easily accommodated by either extracting a date from the datetime to compare with the other date, or deriving a datetime from the date with whatever default time part you want. EIBTI. -- Greg From guido at python.org Mon Mar 12 23:49:51 2007 From: guido at python.org (Guido van Rossum) Date: Mon, 12 Mar 2007 15:49:51 -0700 Subject: [Python-ideas] [Python-Dev] datetime module enhancements In-Reply-To: <45F5D5B4.3060607@canterbury.ac.nz> References: <43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com> <43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com> <43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com> <45F5D5B4.3060607@canterbury.ac.nz> Message-ID: On 3/12/07, Greg Ewing wrote: > Jim Jewett wrote: > > It would be OK with me to raise an error (ValueError?) on "dt <= d" if > > "dt in d". Others might feel differently, and prefer to always get > > the TypeError -- which is probably why dt < d raises one today; that seems the > > obvious thing. > > Seems to me that given all the conflicting behaviours > people want this to have in different circumstances, > refusing to compare dates and datetimes is the right > thing to do. Right. A lot of thought went into this when the original design was done. -- --Guido van Rossum (home page: http://www.python.org/~guido/) From steven.bethard at gmail.com Tue Mar 13 00:13:40 2007 From: steven.bethard at gmail.com (Steven Bethard) Date: Mon, 12 Mar 2007 17:13:40 -0600 Subject: [Python-ideas] [Python-Dev] datetime module enhancements In-Reply-To: <45F5D5B4.3060607@canterbury.ac.nz> References: <43aa6ff70703091153s58eca045l3b483e0225eb6bcd@mail.gmail.com> <43aa6ff70703091314k4392c64r4d366c9703ea1a14@mail.gmail.com> <43aa6ff70703112159k683bfe7y3d8209f5d424ecb1@mail.gmail.com> <45F5D5B4.3060607@canterbury.ac.nz> Message-ID: On 3/12/07, Greg Ewing wrote: > Jim Jewett wrote: > > > It would be OK with me to raise an error (ValueError?) on "dt <= d" if > > "dt in d". Others might feel differently, and prefer to always get > > the TypeError -- which is probably why dt < d raises one today; that seems the > > obvious thing. > > Seems to me that given all the conflicting behaviours > people want this to have in different circumstances, > refusing to compare dates and datetimes is the right > thing to do. The "conflicting behaviors" are really just at the boundaries, and the only reason they're "conflicting" is because people don't really seem to know yet what they need. This is probably an indication that an appropriately extended datetime module should be distributed as a third-party module for a while first... STeVe -- I'm not *in*-sane.
Indeed, I am so far *out* of sane that you appear a tiny blip on the distant coast of sanity. --- Bucky Katt, Get Fuzzy From jimjjewett at gmail.com Tue Mar 13 02:17:06 2007 From: jimjjewett at gmail.com (Jim Jewett) Date: Mon, 12 Mar 2007 21:17:06 -0400 Subject: [Python-ideas] datetime module enhancements In-Reply-To: References: Message-ID: On 3/12/07, Steven Bethard wrote: > On 3/12/07, Christian Heimes wrote: > > Steven Bethard schrieb: > > >> datetime <= date and datetime >= date should raise a TypeError. The > > >> result is ambiguous. A date with time is never equal to a date. But it can still be <=, by being <. I would personally be OK with just saying that (year, month, day) sorts less than (year, month, day, ...) regardless of time, simply because of the type -- but I admit that would be arbitrary. ... > Fair enough. My only point was that as long as __lt__ is defined, > __le__ can throw a TypeError() and it won't break sorted(). Mea culpa. I was mis-remembering, and thought that even this would break because of sort stability. -jJ From thobes at gmail.com Wed Mar 21 17:57:44 2007 From: thobes at gmail.com (Tobias Ivarsson) Date: Wed, 21 Mar 2007 17:57:44 +0100 Subject: [Python-ideas] Builtin infinite generator Message-ID: <9997d5e60703210957x581eec9do9deed56199a155c5@mail.gmail.com> Quite often I find myself wanting to write an infinite for-loop, or rather a loop with a counter, that terminates on some inner condition. Recently I even wanted to do this together with the generator comprehension syntax, I don't remember exactly what I wanted to do, but it was something like: zip( some_list_of_unknown_length, ('a' for x in infinite_generator) ) I admit that my example is silly, but it still serves as an example of what I wanted to do. Then I read that Ellipsis will become generally accepted in py3k [1] and thought why not let range accept ... as end-parameter to mean "until forever". More silly examples to demonstrate how it would work: >>> for i in range(...): ... print i ... if i == 4: break 0 1 2 3 4 >>> for i in range(10,...,10): ... print i ... if i == 40: break 10 20 30 40 Any thoughts on this? /Tobias [1] http://mail.python.org/pipermail/python-3000/2006-April/000996.html From collinw at gmail.com Wed Mar 21 18:35:22 2007 From: collinw at gmail.com (Collin Winter) Date: Wed, 21 Mar 2007 12:35:22 -0500 Subject: [Python-ideas] Builtin infinite generator In-Reply-To: <9997d5e60703210957x581eec9do9deed56199a155c5@mail.gmail.com> References: <9997d5e60703210957x581eec9do9deed56199a155c5@mail.gmail.com> Message-ID: <43aa6ff70703211035o5e47bc5xb5d8ced11d4723d8@mail.gmail.com> On 3/21/07, Tobias Ivarsson wrote: > Quite often I find myself wanting to write an infinite for-loop, or rather a > loop with a counter, that terminates on some inner condition. Recently I > even wanted to do this together with the generator comprehension syntax, I > don't remember exactly what I wanted to do, but it was something like: > zip( some_list_of_unknown_length, ('a' for x in infinite_generator) ) > I admit that my example is silly, but it still serves as an example of what > I wanted to do. itertools.count() (http://docs.python.org/lib/itertools-functions.html) handles just this. Wrapping it in a genexp can take care of the second example: > >>> for i in range(...): > ... print i > ... if i == 4: break for i in itertools.count(): .... > >>> for i in range(10,...,10): > ... print i > ... 
if i == 40: break for i in (i * 10 for i in itertools.count(1)): .... Collin Winter From jcarlson at uci.edu Wed Mar 21 18:43:54 2007 From: jcarlson at uci.edu (Josiah Carlson) Date: Wed, 21 Mar 2007 10:43:54 -0700 Subject: [Python-ideas] Builtin infinite generator In-Reply-To: <9997d5e60703210957x581eec9do9deed56199a155c5@mail.gmail.com> References: <9997d5e60703210957x581eec9do9deed56199a155c5@mail.gmail.com> Message-ID: <20070321103536.FC8F.JCARLSON@uci.edu> "Tobias Ivarsson" wrote: > Quite often I find myself wanting to write an infinite for-loop, or rather a > loop with a counter, that terminates on some inner condition. Recently I > even wanted to do this together with the generator comprehension syntax, I > don't remember exactly what I wanted to do, but it was something like: > zip( some_list_of_unknown_length, ('a' for x in infinite_generator) ) > I admit that my example is silly, but it still serves as an example of what > I wanted to do. > > Then I read that Ellipsis will become generally accepted in py3k [1] and > thought why not let range accept ... as end-parameter to mean "until > forever". > > More silly examples to demonstrate how it would work: > > >>> for i in range(...): > ... print i > ... if i == 4: break [snip] > Any thoughts on this? It isn't reasonable in Python 2.x, as range() returns an actual list, and an infinite list isn't reasonable. It would be *possible* with xrange() in Python 2.x, but then the question is "is such desirable?". The answer there is also no, as currently 2.x range() and xrange() are limited by your platform int size (generally 32 bits)... >>> xrange(2**31) Traceback (most recent call last): File "<stdin>", line 1, in <module> OverflowError: long int too large to convert to int >>> xrange(2**31-1) xrange(2147483647) >>> In Python 3, ints and longs will be unified, so the whole 'limited by platform int' shouldn't be applicable. Further, xrange is renamed to range. So it becomes more reasonable. On the other hand, a user (you) should be able to give a *huge* value to range, and it would work as you want ... to. Generally, you are pretty safe with 2**64 * increment, but if you are really anal, go with 2**128 * increment, that should keep you safe until the big freeze. Ultimately, -1. It's not usable in Python 2.x, and Python 3.x will support a variant trivially. - Josiah From greg.ewing at canterbury.ac.nz Wed Mar 21 23:06:39 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Thu, 22 Mar 2007 10:06:39 +1200 Subject: [Python-ideas] Builtin infinite generator In-Reply-To: <20070321103536.FC8F.JCARLSON@uci.edu> References: <9997d5e60703210957x581eec9do9deed56199a155c5@mail.gmail.com> <20070321103536.FC8F.JCARLSON@uci.edu> Message-ID: <4601AC6F.4040904@canterbury.ac.nz> Josiah Carlson wrote: > Generally, you are pretty safe with 2**64 * increment, but > if you are really anal, go with 2**128 * increment, that should keep you > safe until the big freeze. Although this will work for all practical purposes, code which does things like this is still a bit smelly. I prefer it when there are ways of explicitly representing infinitely-large or unbounded values. -- Greg From talin at acm.org Thu Mar 22 08:50:39 2007 From: talin at acm.org (Talin) Date: Thu, 22 Mar 2007 00:50:39 -0700 Subject: [Python-ideas] Python and Concurrency Message-ID: <4602354F.1010003@acm.org> I would like to direct your attention to a presentation that was given by Mr. Herb Sutter on the subject of concurrency support in programming languages. 
It's a long video, but I would urge you to watch at least the first five minutes of it: http://video.google.com/videoplay?docid=7625918717318948700&q=herb+sutter Mr. Sutter is the secretary of the ISO/C++ standards committee. I mention this only so that you'll understand he's not just some random yo-yo with an opinion. (There are a number of other papers by Mr. Sutter on the same topic which you can google for, however none of them are as compelling or comprehensive as the video presentation IMHO.) The gist of the introduction is this (my words, not his): Starting from now, and for the rest of your career as a programmer, you better start programming for massively concurrent systems or you are doomed to irrelevance. As he points out in the talk, the good news is that Moore's law is going to continue for several decades to come. The bad news is that most people don't understand Moore's law or what they get from it. Moore's law is really about the number of transistors on a chip. It says nothing about processor speed or performance. He calls it "the end of the free lunch", meaning that in the past you could write a program, wait a little while, and it would run faster - because the hardware got faster while you were waiting. This will no longer be true for single-threaded programs. Scalar processors, with all of their pipelining, prefetching and other tricks, have gotten about as fast as they are ever going to get; we've reached the point of diminishing returns. So the only thing left to do is throw more and more cores on the chip. There's no physical reason why the number of cores won't continue to double every 18 months; the only reason why manufacturers might choose not to make 128-core CPUs by the year 2012 is that there won't be enough software to take advantage of that degree of parallelism. (Even with today's transistor count, you could put 100 Pentium-class processors on a single die.) Moreover, programming languages which have built-in support for scaling to many processors are going to win in the long run, and programming languages that don't have strong, built-in concurrency support will gradually fade away. The reason for this is simple: People don't like using applications which don't get faster when they buy a better computer. Now, on the server side, all of this is a solved problem. We have a very robust model for handling hundreds of requests in parallel, and a very strong model for dealing with shared state. This model is called "transactions" or more formally ACID. On the client side, however, things are much worse. As he puts it, locks are the best we have, and locks suck. Locks are not "logically composable", in the sense that you can't simply take two pieces of software that were written independently and glue them together unless they share a common locking architecture. This ruins a large part of the power of modern software, which is the ability to compose software out of smaller, independently-written components. On the client side, you have heterogeneous components accessing shared state at high bandwidth with myriads of pointer-based references to that shared state. Add concurrency, and what you end up with is a system in which it is impossible to make logical inferences about the behavior of the system. (And before you ask - yes, functional languages - and their drawbacks - are addressed in the talk. The same is true for 'lock-free programming' and other techniques du jour.) 
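To make the composability problem concrete, here is a tiny illustrative sketch - the locks and components are invented for the example - of how two pieces of code that are each correct on their own can deadlock as soon as they are combined:

import threading

lock_a = threading.Lock()
lock_b = threading.Lock()

def component_one():
    # Written by one team: always takes lock A, then lock B.
    lock_a.acquire()
    try:
        lock_b.acquire()
        try:
            pass  # ... touch shared state ...
        finally:
            lock_b.release()
    finally:
        lock_a.release()

def component_two():
    # Written independently by another team: takes B, then A.
    lock_b.acquire()
    try:
        lock_a.acquire()
        try:
            pass  # ... touch shared state ...
        finally:
            lock_a.release()
    finally:
        lock_b.release()

# Run them concurrently and each thread can end up holding one lock
# while waiting forever for the other; nothing in either component's
# interface warns you that composing them is unsafe.
threading.Thread(target=component_one).start()
threading.Thread(target=component_two).start()

Neither lock ordering is wrong in isolation; the deadlock only exists in the composition.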
Now, in the talk he proposes some strategies for building concurrency into the language so that client-side programming can be done in a way that isn't rocket science. We won't be able to get rid of locks, he claims, but with the right language support at least we'll be able to reason about them, and maybe push them off to the side so that they mostly stay in their corner. In any case, what I would like to open up is a discussion of how these ideas might influence the design of the Python language. One thing that is important to understand is that I'm not talking about "automatic parallelization" where the compiler automatically figures out what parts can be parallelized. That would require so much change to the Python language as to make it no longer Python. Nor am I talking about manually creating "thread" objects, and having to carefully place protections around any shared state. That kind of parallelism is too low-level, too coarse-grained, and too hard to do, even for people who think they know how to write concurrent code. I am not even necessarily talking about changing the Python language (although certainly the internals of the interpreter will need to be changed.) The same language can be used to describe the same kinds of problems and operations, but the implications of those language elements will change. This is analogous to the fact that these massively multicore CPUs 10 years from now will most likely be using the same instruction sets as today - but that does not mean that programming as a discipline will be anything like what it is now. As an example of what I mean, suppose the Python 'with' statement could be used to indicate an atomic transaction. Any operations that were done within that block would either be committed or rolled back. This is not a new idea - here's an interesting paper on it: http://www-sal.cs.uiuc.edu/~zilles/tm.html I'm sure that there's a lot more like that out there. However, there is also a lot of stuff out there that is *superficially* similar to what I am talking about, and I want to make the distinction clear. For example, any discussion of concurrency in Python will naturally raise the topic of both IPython and Stackless. However, IPython (from what I understand) is really about distributed computing and not so much about fine-grained concurrency; And Stackless (from what I understand) is really about coroutines or continuations, which is a different kind of concurrency. Unless I am mistaken (and I very well could be) neither of these are immediately applicable to the problem of authoring Python programs for multi-core CPUs, but I think that both of them contain valuable ideas that are worth discussing. Now, I will be the first to admit that I am not an expert in these matters. Don't let my poor, naive ideas be a limit to what is discussed. What I've primarily tried to do in this posting is to get your attention and convince you that this is important and worth talking about. And I hope to be able to learn something along the way. My overall goal here is to be able to continue writing programs in Python 10 years from now, not just as a hobby, but as part of my professional work. If Python is able to leverage the power of the CPUs that are being created at that time, I will be able to make a strong case for doing so. 
On the other hand, if I have a 128-core CPU on my desk, and Python is only able to utilize 1/128th of that power without resorting to tedious calculations of race conditions and deadlocks, then it's likely that my Python programming will be relegated to the role of a hobby. -- Talin From rrr at ronadam.com Thu Mar 22 11:24:21 2007 From: rrr at ronadam.com (Ron Adam) Date: Thu, 22 Mar 2007 05:24:21 -0500 Subject: [Python-ideas] Python and Concurrency In-Reply-To: <4602354F.1010003@acm.org> References: <4602354F.1010003@acm.org> Message-ID: <46025955.5030509@ronadam.com> Talin wrote: > My overall goal here is to be able to continue writing programs in > Python 10 years from now, not just as a hobby, but as part of my > professional work. If Python is able to leverage the power of the CPUs > that are being created at that time, I will be able to make a strong > case for doing so. On the other hand, if I have a 128-core CPU on my > desk, and Python is only able to utilize 1/128th of that power without > resorting to tedious calculations of race conditions and deadlocks, then > it's likely that my Python programming will be relegated to the role of a > hobby. > > -- Talin A very nice introduction, Talin. I will certainly look at the video tomorrow morning. Thanks. My first thought is it's not quite as bad as it seems, because any third party extensions will be able to use the remaining 127/128th of the power. ;-) You would need to subtract any CPUs used by the OS and other concurrent processes. (These would probably continue to use more resources as well.) I wonder if some of the CPUs would be definable for special purposes or not? Maybe 64 of them set aside for simultaneous parallel calculations? There may be a way to express to the OS that a particular process on a data structure be carried out as 'Wide' as possible. ('Narrow' as possible being on a single CPU in a single Thread.) It might be very nice for these 'wide' structures to have their own API as well. All speculation of course, ;-) Cheers, Ron From lucio.torre at gmail.com Thu Mar 22 14:33:57 2007 From: lucio.torre at gmail.com (Lucio Torre) Date: Thu, 22 Mar 2007 10:33:57 -0300 Subject: [Python-ideas] Python and Concurrency In-Reply-To: <4602354F.1010003@acm.org> References: <4602354F.1010003@acm.org> Message-ID: <999187ed0703220633v2ee4c507x1494a8670b52644@mail.gmail.com> > I'm sure that there's a lot more like that out there. However, there is > also a lot of stuff out there that is *superficially* similar to what I > am talking about, and I want to make the distinction clear. For example, > any discussion of concurrency in Python will naturally raise the topic > of both IPython and Stackless. However, IPython (from what I understand) > is really about distributed computing and not so much about fine-grained > concurrency; And Stackless (from what I understand) is really about > coroutines or continuations, which is a different kind of concurrency. > Unless I am mistaken (and I very well could be) neither of these are > immediately applicable to the problem of authoring Python programs for > multi-core CPUs, but I think that both of them contain valuable ideas > that are worth discussing. From what I understand, I think that the main contribution of the Stackless approach to concurrency is microthreads: The ability to have lots and lots of cheap threads. If you want to program for some huge amount of cores, you will have to have even more threads than cores you have today. 
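To see why such threads can be cheap, here is a toy sketch (this is not Stackless itself, just an illustration of the microthread idea): each "microthread" is only a generator object, so thousands of them cost a small Python object each rather than an OS thread, and a trivial round-robin scheduler can juggle them on one core.

def microthread(name, steps):
    # do a small slice of work, then give control back
    for i in range(steps):
        yield

def scheduler(tasks):
    tasks = list(tasks)
    while tasks:
        for task in list(tasks):
            try:
                task.next()           # resume the microthread
            except StopIteration:
                tasks.remove(task)    # it finished; drop it

scheduler(microthread(i, 10) for i in range(10000))

With real language support, a scheduler like this could spread the tasks over however many cores exist.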
The problem is that right now Python (on my Linux box) will only let me have 381 threads. And if we want concurrent programming to be made easy, the user is not supposed to start their programs with "how many threads can I create? 500? OK, so I should partition my app like this". This leads to: a) applications that won't get faster when the user can create more than 500 threads b) or, a lot of complicated logic to partition the software at runtime The method should go the other way around: make all the threads you can think of; if there are enough cores, they will run in parallel. jm2c. Lucio Regards, Lucio. From jcarlson at uci.edu Thu Mar 22 18:03:14 2007 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 22 Mar 2007 10:03:14 -0700 Subject: [Python-ideas] Python and Concurrency In-Reply-To: <999187ed0703220633v2ee4c507x1494a8670b52644@mail.gmail.com> References: <4602354F.1010003@acm.org> <999187ed0703220633v2ee4c507x1494a8670b52644@mail.gmail.com> Message-ID: <20070322095206.FCA8.JCARLSON@uci.edu> "Lucio Torre" wrote: > > I'm sure that there's a lot more like that out there. However, there is > > also a lot of stuff out there that is *superficially* similar to what I > > am talking about, and I want to make the distinction clear. For example, > > any discussion of concurrency in Python will naturally raise the topic > > of both IPython and Stackless. However, IPython (from what I understand) > > is really about distributed computing and not so much about fine-grained > > concurrency; And Stackless (from what I understand) is really about > > coroutines or continuations, which is a different kind of concurrency. > > Unless I am mistaken (and I very well could be) neither of these are > > immediately applicable to the problem of authoring Python programs for > > multi-core CPUs, but I think that both of them contain valuable ideas > > that are worth discussing. > > From what I understand, I think that the main contribution of the > Stackless approach to concurrency is microthreads: The ability to have > lots and lots of cheap threads. If you want to program for some huge > amount of cores, you will have to have even more threads than cores > you have today. But it's not about threads, it is about concurrent execution of code (which threads in Python do not allow). The only way to allow this is to basically attach a re-entrant lock on every single Python object (depending on the platform, perhaps 12 bytes minimum for count, process, thread). The sheer volume of the number of acquire/release cycles during execution is staggering (think about the number of incref/decref operations), and the increase in size of every object by around 12 bytes is not terribly appealing. On the upside, this is possible (change the PyObject_HEAD macro, PyINCREF, PyDECREF, remove the GIL), but the amount of work necessary to actually make it happen is huge, and it would likely result in negative performance until sheer concurrency wins out over the acquire/release overhead. - Josiah From jcarlson at uci.edu Thu Mar 22 18:30:58 2007 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 22 Mar 2007 10:30:58 -0700 Subject: [Python-ideas] Python and Concurrency In-Reply-To: <4602354F.1010003@acm.org> References: <4602354F.1010003@acm.org> Message-ID: <20070322100428.FCAB.JCARLSON@uci.edu> Talin wrote: > I would like to direct your attention to a presentation that was given > by Mr. Herb Sutter on the subject of concurrency support in programming > languages. 
It's a long video, but I would urge you to watch at least the > first five minutes of it: I'll watch it later today, but you seem to have done a great job of explaining what it is saying. [snip] > The gist of the introduction is this (my words, not his): Starting from > now, and for the rest of your career as a programmer, you better start > programming for massively concurrent systems or you are doomed to > irrelevance. Given an inexpensive fork() with certain features (the optional ability to not copy threads, certain file handles, be able to quickly pass data between the parent and child processes, etc.), task-level concurrency is not quite as hard as it is right now. In the same way that one can use a threadpool to handle queued tasks, one could use whole processes to do the same thing, which gets us concurrency in Python. Of course we run into the issue where processor scheduling needs to get better to handle all of the processes, but that's going to be a requirement for these multi-core processors anyways. Windows doesn't have a .fork() (cygwin emulates it by using a shared mmap to copy all program state). Sometimes transferring objects between processes is difficult (file handles end up being relatively easy on *nix AND Windows), but possible. Etc. - Josiah From rrr at ronadam.com Thu Mar 22 21:55:34 2007 From: rrr at ronadam.com (Ron Adam) Date: Thu, 22 Mar 2007 15:55:34 -0500 Subject: [Python-ideas] Python and Concurrency In-Reply-To: <20070322095206.FCA8.JCARLSON@uci.edu> References: <4602354F.1010003@acm.org> <999187ed0703220633v2ee4c507x1494a8670b52644@mail.gmail.com> <20070322095206.FCA8.JCARLSON@uci.edu> Message-ID: <4602ED46.9050202@ronadam.com> Josiah Carlson wrote: > "Lucio Torre" wrote: >>> I'm sure that there's a lot more like that out there. However, there is >>> also a lot of stuff out there that is *superficially* similar to what I >>> am talking about, and I want to make the distinction clear. For example, >>> any discussion of concurrency in Python will naturally raise the topic >>> of both IPython and Stackless. However, IPython (from what I understand) >>> is really about distributed computing and not so much about fine-grained >>> concurrency; And Stackless (from what I understand) is really about >>> coroutines or continuations, which is a different kind of concurrency. >>> Unless I am mistaken (and I very well could be) neither of these are >>> immediately applicable to the problem of authoring Python programs for >>> multi-core CPUs, but I think that both of them contain valuable ideas >>> that are worth discussing. >> From what I understand, I think that the main contribution of the >> Stackless approach to concurrency is microthreads: The ability to have >> lots and lots of cheap threads. If you want to program for some huge >> amount of cores, you will have to have even more threads than cores >> you have today. > > But it's not about threads, it is about concurrent execution of code > (which threads in Python do not allow). The only way to allow this is > to basically attach a re-entrant lock on every single Python object > (depending on the platform, perhaps 12 bytes minimum for count, process, > thread). The sheer volume of the number of acquire/release cycles > during execution is staggering (think about the number of incref/decref > operations), and the increase in size of every object by around 12 bytes > is not terribly appealing. 
> > On the upside, this is possible (change the PyObject_HEAD macro, > PyINCREF, PyDECREF, remove the GIL), but the amount of work necessary to > actually make it happen is huge, and it would likely result in negative > performance until sheer concurrency wins out over the acquire/release > overhead. It seems to me some types of operations are more suited for concurrent operations than others, so maybe new objects that are designed to be naturally usable in this way could help. Or maybe there's a way to lock groups of objects at the same time by having them share a lock if they are related? I imagine there will be some low-level C support that could be used transparently, such as copying large areas of memory with multiple CPUs. These may even be the existing C copy functions reimplemented to take advantage of multiple CPU environments so new versions of python may have limited use of this even if no support is explicitly added. Thinking out loud of ways a python program may use concurrent processing: * Applying a single function concurrently over a list. (A more limited function object might make this easier.) * Feeding a single set of arguments concurrently over a list of callables. * Generators with the semantics of calculating first and waiting on 'yield' for 'next', so the value is immediately returned. (depends on CPU load) * Listcomps that perform the same calculation on each item may be a natural multi-processing structure. Ron From jason.orendorff at gmail.com Thu Mar 22 22:01:11 2007 From: jason.orendorff at gmail.com (Jason Orendorff) Date: Thu, 22 Mar 2007 17:01:11 -0400 Subject: [Python-ideas] Python and Concurrency In-Reply-To: <4602354F.1010003@acm.org> References: <4602354F.1010003@acm.org> Message-ID: On 3/22/07, Talin wrote: > I would like to direct your attention to a presentation that was given > by Mr. Herb Sutter on the subject of concurrency support in programming > languages. It's a long video, but I would urge you to watch at least the > first five minutes of it: > > http://video.google.com/videoplay?docid=7625918717318948700&q=herb+sutter Thanks for the excellent summary. I won't try to summarize Brendan Eich's February blog entry on this topic, because it's short enough and worth reading: http://weblogs.mozillazine.org/roadmap/archives/2007/02/threads_suck.html -j From guido at python.org Thu Mar 22 22:20:45 2007 From: guido at python.org (Guido van Rossum) Date: Thu, 22 Mar 2007 14:20:45 -0700 Subject: [Python-ideas] Python and Concurrency In-Reply-To: References: <4602354F.1010003@acm.org> Message-ID: On 3/22/07, Jason Orendorff wrote: > On 3/22/07, Talin wrote: > > I would like to direct your attention to a presentation that was given > > by Mr. Herb Sutter on the subject of concurrency support in programming > > languages. It's a long video, but I would urge you to watch at least the > > first five minutes of it: > > > > http://video.google.com/videoplay?docid=7625918717318948700&q=herb+sutter > > Thanks for the excellent summary. > > I won't try to summarize Brendan Eich's February blog entry > on this topic, because it's short enough and worth reading: > > http://weblogs.mozillazine.org/roadmap/archives/2007/02/threads_suck.html Right. I saw Sutter give this talk (or a very similar one) in Oxford last April, and I'm thoroughly unconvinced that Python is doomed unless it adds more thread support. 
-- --Guido van Rossum (home page: http://www.python.org/~guido/) From jcarlson at uci.edu Thu Mar 22 23:40:27 2007 From: jcarlson at uci.edu (Josiah Carlson) Date: Thu, 22 Mar 2007 15:40:27 -0700 Subject: [Python-ideas] Python and Concurrency In-Reply-To: <4602ED46.9050202@ronadam.com> References: <20070322095206.FCA8.JCARLSON@uci.edu> <4602ED46.9050202@ronadam.com> Message-ID: <20070322143830.FCB4.JCARLSON@uci.edu> Ron Adam wrote: > Josiah Carlson wrote: > > But it's not about threads, it is about concurrent execution of code > > (which threads in Python do not allow). The only way to allow this is > > to basically attach a re-entrant lock on every single Python object > > (depending on the platform, perhaps 12 bytes minimum for count, process, > > thread). The sheer volume of the number of acquire/release cycles > > during execution is staggering (think about the number of incref/decref > > operations), and the increase in size of every object by around 12 bytes > > is not terribly appealing. > > It seems to me some types of operations are more suited for concurrent > operations than others, so maybe new objects that are designed to be > naturally usable in this way could help. Or maybe there's a way to lock > groups of objects at the same time by having them share a lock if they are > related? That is a fine-grained vs. coarse-grained locking argument. There is literature. On the other hand, there is also the pi-calculus: http://scienceblogs.com/goodmath/2007/03/an_experiment_with_calculus_an_1.php Of course, the pi-calculus is not reasonable unless one starts with a lambda calculus and decides to modify it (parallel lisp?), so it isn't all that applicable here. > Thinking out loud of ways a python program may use concurrent processing: > > * Applying a single function concurrently over a list. (A more limited > function object might make this easier.) > * Feeding a single set of arguments concurrently over a list of callables. > * Generators with the semantics of calculating first and waiting on 'yield' > for 'next', so the value is immediately returned. (depends on CPU load) > * Listcomps that perform the same calculation on each item may be a natural > multi-processing structure. These examples are all what are generally referred to as "embarrassingly parallel" in literature. One serious issue with programming parallel algorithms generally is that not all algorithms are necessarily parallelizable. Some are, certainly, but not all. The task is to discover those alternate algorithms that *are* parallelizable in such a way as to offer gains that are "worth it". Personally, I think that if there were a *cheap* IPC to make cross-process calls not too expensive, many of the examples above that you talk about would be handled easily. 
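As a very rough sketch of the kind of cross-process call I mean (Unix-only, no error handling, and the single 64KB read is a cheat - a real version would need result framing and a pool of persistent workers so the fork() cost isn't paid on every call):

import os, pickle

def call_in_child(func, args):
    # fork a worker, run the call there, pickle only the result back
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:                          # child process
        os.close(read_end)
        os.write(write_end, pickle.dumps(func(*args)))
        os._exit(0)
    os.close(write_end)                   # parent process
    data = os.read(read_end, 65536)
    os.close(read_end)
    os.waitpid(pid, 0)
    return pickle.loads(data)

print call_in_child(sum, ([1, 2, 3],))    # prints 6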
- Josiah From greg.ewing at canterbury.ac.nz Fri Mar 23 02:32:30 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 23 Mar 2007 13:32:30 +1200 Subject: [Python-ideas] Python and Concurrency In-Reply-To: <4602ED46.9050202@ronadam.com> References: <4602354F.1010003@acm.org> <999187ed0703220633v2ee4c507x1494a8670b52644@mail.gmail.com> <20070322095206.FCA8.JCARLSON@uci.edu> <4602ED46.9050202@ronadam.com> Message-ID: <46032E2E.8020408@canterbury.ac.nz> Ron Adam wrote: > Thinking out loud of ways a python program may use concurrent processing: > > * Applying a single function concurrently over a list. > * Feeding a single set of arguments concurrently over a list of callables. > * Generators with the semantics of calculating first and waiting on 'yield' > * Listcomps that perform the same calculation on each item The solution is obvious -- just add a 'par' statement. Then-we-can-call-it-Poccam-ly, Greg From greg.ewing at canterbury.ac.nz Fri Mar 23 02:38:24 2007 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Fri, 23 Mar 2007 13:38:24 +1200 Subject: [Python-ideas] Python and Concurrency In-Reply-To: References: <4602354F.1010003@acm.org> Message-ID: <46032F90.1090706@canterbury.ac.nz> Guido van Rossum wrote: > I saw Sutter give this talk (or a very similar one) in Oxford > last April, and I'm thoroughly unconvinced that Python is doomed > unless it adds more thread support. I'm unconvinced, too. Python has always relied mostly on C-level code to get grunty things done efficiently. If the C-level code knows how to take advantage of multiple cores, Python applications will benefit, too. As long as Numeric can make use of the other 127 cores, and OpenGL can run my 512-core GPU at full tilt, I'll be happy. :-) -- Greg From rrr at ronadam.com Fri Mar 23 03:21:30 2007 From: rrr at ronadam.com (Ron Adam) Date: Thu, 22 Mar 2007 21:21:30 -0500 Subject: [Python-ideas] Python and Concurrency In-Reply-To: <20070322143830.FCB4.JCARLSON@uci.edu> References: <20070322095206.FCA8.JCARLSON@uci.edu> <4602ED46.9050202@ronadam.com> <20070322143830.FCB4.JCARLSON@uci.edu> Message-ID: <460339AA.4030505@ronadam.com> Josiah Carlson wrote: > Ron Adam wrote: >> Josiah Carlson wrote: >>> But it's not about threads, it is about concurrent execution of code >>> (which threads in Python do not allow). The only way to allow this is >>> to basically attach a re-entrant lock on every single Python object >>> (depending on the platform, perhaps 12 bytes minimum for count, process, >>> thread). The sheer volume of the number of acquire/release cycles >>> during execution is staggering (think about the number of incref/decref >>> operations), and the increase in size of every object by around 12 bytes >>> is not terribly appealing. >>> >>> On the upside, this is possible (change the PyObject_HEAD macro, >>> PyINCREF, PyDECREF, remove the GIL), but the amount of work necessary to >>> actually make it happen is huge, and it would likely result in negative >>> performance until sheer concurrency wins out over the acquire/release >>> overhead. >> It seems to me some types of operations are more suited for concurrent >> operations than others, so maybe new objects that are designed to be >> naturally usable in this way could help. Or maybe there's a way to lock >> groups of objects at the same time by having them share a lock if they are >> related? > > That is a fine-grained vs. coarse-grained locking argument. There is > literature. 
> > On the other hand, there is also the pi-calculus: > http://scienceblogs.com/goodmath/2007/03/an_experiment_with_calculus_an_1.php > > Of course, the pi-calculus is not reasonable unless one starts with a > lambda calculus and decides to modify it (parallel lisp?), so it isn't > all that applicable here. Interesting, but I think you are correct, pi-calculus is probably better suited to a special purpose library. Here are some more complete examples of what I was thinking. From a Python programmer's point of view. The underlying implementation would need to be worked out of course, either with locks, or messages, or by limiting access to mutable objects in some other way. >> Thinking out loud of ways a python program may use concurrent processing: >> >> * Applying a single function concurrently over a list. (A more limited >> function object might make this easier.) (1) xyz = vector(r) forall coords as c: # parallel modify coords 'inplace' with body c = rotate(c, xyz) * 'with' syntax form because 'c' does not outlive the body. >> * Feeding a single set of arguments concurrently over a list of callables. (2) x = 42 result = [None] * len(funcs) forall (funcs, result) as (f, r): r = f(x) >> * Generators with the semantics of calculating first and waiting on 'yield' >> for 'next', so the value is immediately returned. (depends on CPU load) (3) This corresponds nicely to the suggested use of 'future' in the video. def foo(x=42): # Starts when process is created instead of waiting # for f.next() call to start. while 1: future yield x x += 42 f = foo() # start a concurrent process result = f.wait() # waits here for value if it's not ready. >> * Listcomps that perform the same calculation on each item may be a natural >> multi-processing structure. (4) result = [x**2 forall x in values] > These examples are all what are generally referred to as "embarrassingly > parallel" in literature. One serious issue with programming parallel > algorithms generally is that not all algorithms are necessarily > parallelizable. Some are, certainly, but not all. The task is to > discover those alternate algorithms that *are* parallelizable in such a > way as to offer gains that are "worth it". Yes, they are obvious, but why not start with the obvious? It's what most users will understand first and it may be the less obvious stuff can be put in terms of the more obvious "embarrassingly parallel" things. > Personally, I think that if there were a *cheap* IPC to make > cross-process calls not too expensive, many of the examples above that > you talk about would be handled easily. In the video he also talked about avoiding locks. I was thinking a more limited function object (for concurrent uses only) might be used that has: * no global keyword * only access to external immutable objects * no closures. In other words these need to pass 'values' explicitly at call time and return time. (or 'future yield' time) Ron From talin at acm.org Fri Mar 23 08:12:39 2007 From: talin at acm.org (Talin) Date: Fri, 23 Mar 2007 00:12:39 -0700 Subject: [Python-ideas] Python and Concurrency In-Reply-To: <20070322143830.FCB4.JCARLSON@uci.edu> References: <20070322095206.FCA8.JCARLSON@uci.edu> <4602ED46.9050202@ronadam.com> <20070322143830.FCB4.JCARLSON@uci.edu> Message-ID: <46037DE7.5050300@acm.org> Josiah Carlson wrote: > These examples are all what are generally referred to as "embarrassingly > parallel" in literature. 
One serious issue with programming parallel > algorithms generally is that not all algorithms are necessarily > parallelizable. Some are, certainly, but not all. The task is to > discover those alternate algorithms that *are* parallelizable in such a > way to offer gains that are "worth it". A couple of ideas I wanted to explore: -- A base class or perhaps metaclass that makes python objects transactional. This wraps all attribute access so what you see is your transaction's view of the current state of the object. Something like: with atomic(): obj1.attribute = 1 # Other threads can't 'see' the new value until the # transaction commits. value = obj1.attribute 'atomic()' starts a new transaction in your thread, which is stored in thread-local data; Any objects that you mutate become part of the transaction automatically (will have to be careful about built-in mutable objects such as lists). At the end, the transaction either commits, or if there was a conflict, it rolls back and you get to do it over again. This would be useful for large networks of objects, where you want to make large numbers of local changes, where each local change affects an object and perhaps its surrounding objects. What I am describing is very similar to many complex 3D game worlds. ZODB already has a transactional object mechanism, although it's oriented towards database-style transactions and object persistence. -- A way to partition the space of python objects, such that objects in each partition cannot have references outside of the partition without going through some sort of synchronization mechanism, perhaps via some proxy. The idea is to be able to guarantee that shared state is only accessible in certain ways that you can reason about. -- Talin From jimjjewett at gmail.com Fri Mar 23 18:48:52 2007 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 23 Mar 2007 13:48:52 -0400 Subject: [Python-ideas] [Python-checkins] r54539 - in python/trunk: Lib/string.py Misc/NEWS Objects/typeobject.c In-Reply-To: <20070323045846.0C6001E4005@bag.python.org> References: <20070323045846.0C6001E4005@bag.python.org> Message-ID: (Please note that most replies should trim at least one list from the Cc) On 3/23/07, guido.van.rossum wrote: > Note: without the change to string.py, lots of spurious warnings happen. > What's going on there? I assume it was a defensive measure for subclasses of both Template and X, where X has its own metaclass (which might have an __init__ that shouldn't be ignored). This is where cooperative classes get ugly. You could argue that the "correct" code is to not bother making the super call if it would go all the way to object (or even type), but there isn't a good way to spell that. > + # A super call makes no sense since type() doesn't define __init__(). > + # (Or does it? And should type.__init__() accept three args?) > + # super(_TemplateMetaclass, cls).__init__(name, bases, dct) In this particular case, you could define a type.__init__ that did nothing. (If the signature were wrong, type.__new__ would have already caught it. If __new__ and __init__ are seeing something different, then the change was probably intentional.) The problem isn't really limited to type.__init__ though. You'll sometimes see similar patterns for __del__, close, save, etc. The main difference is that they have to either catch an error or check first, since object doesn't have an implementation of those methods. object.__init__ doesn't really do anything either, except check for errors. 
Getting rid of it should have the same effect as complaining about parameters. The ideal solution (discussion of which probably ought to stay on python-ideas) might be to replace object.__init__ with some sort of PlaceHolder object that raises an error *unless* called through a cooperative super. This PlaceHolder would also be useful for AbstractBaseClasses/Interfaces. PlaceHolder still wouldn't deal with the original concern of verifying that all arguments had already been stripped and used; but the ABCs might be able to. -jJ > Modified: python/trunk/Lib/string.py > ============================================================================== > --- python/trunk/Lib/string.py (original) > +++ python/trunk/Lib/string.py Fri Mar 23 05:58:42 2007 > @@ -108,7 +108,9 @@ > """ > > def __init__(cls, name, bases, dct): > - super(_TemplateMetaclass, cls).__init__(name, bases, dct) > + # A super call makes no sense since type() doesn't define __init__(). > + # (Or does it? And should type.__init__() accept three args?) > + # super(_TemplateMetaclass, cls).__init__(name, bases, dct) > if 'pattern' in dct: > pattern = cls.pattern > else: From jimjjewett at gmail.com Fri Mar 23 19:30:34 2007 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 23 Mar 2007 14:30:34 -0400 Subject: [Python-ideas] Python and Concurrency In-Reply-To: <460339AA.4030505@ronadam.com> References: <20070322095206.FCA8.JCARLSON@uci.edu> <4602ED46.9050202@ronadam.com> <20070322143830.FCB4.JCARLSON@uci.edu> <460339AA.4030505@ronadam.com> Message-ID: On 3/22/07, Ron Adam wrote: > Josiah Carlson wrote: > > Ron Adam wrote: > >> Josiah Carlson wrote: > >>> On the upside, this is possible (change the PyObject_HEAD macro, > >>> PyINCREF, PyDECREF, remove the GIL), but the amount of work The zillions of details are the same details you have to consider when writing a concurrent database. Having python store its objects externally and retrieve them when needed (vs allocating memory and pointers) would be a huge one-time loss, but it might eventually be worthwhile. PyPy calls that "external database" an "object_space", so the language support is already there, and still will be when the hardware makes it worthwhile. > (1) > xyz = vector(r) > forall coords as c: # parallel modify coords 'inplace' with body > c = rotate(c, xyz) > > * 'with' syntax form because 'c' does not outlive the body. Why not just keep using xyz = vector(r) rotate(xyz) and let the compiler take care of it? At most, we would want a way of marking objects as "read-only" or callables as "instant", so that the compiler would know it doesn't have to worry about the definition of "len" changing mid-stream. (Ron suggests something similar at the end of the message, and Talin's Transactional metaclass is related.) > >> * Generators with the semantics of calculating first and waiting on 'yield' > >> for 'next', so the value is immediately returned. (depends on CPU load) On its own, this just makes response time worse. That said, generators are also callables, and the compiler might do a better job if it knew that it wouldn't have to worry about external redefinitions. There may also be some value in some sort of "idletasks" abstraction that says "hey, go ahead and precompute this, but only if you've got nothing better to do." Garbage collection could certainly benefit from this, and translating (or duplicating) string representations into multiple encodings. > In the video he also talked about avoiding locks. 
I was thinking a more > limited function object (for concurrent uses only) might be used that has: > * no global keyword > * only access to external immutable objects Or at least, it doesn't mutate them itself, and it doesn't promise to use newer versions that get created after the call begins. > * no closures. This level of information should be useful. But in practice, most of my functions already meet this definition. (In theory, they access builtins that could be replaced, etc.) I wouldn't want to mark them all by hand. I'm still inclined to trust the compiler, and just accept that it may eventually be the PyPy translator rather than the CPython interpreter. -jJ From rrr at ronadam.com Fri Mar 23 23:47:33 2007 From: rrr at ronadam.com (Ron Adam) Date: Fri, 23 Mar 2007 17:47:33 -0500 Subject: [Python-ideas] Python and Concurrency In-Reply-To: References: <20070322095206.FCA8.JCARLSON@uci.edu> <4602ED46.9050202@ronadam.com> <20070322143830.FCB4.JCARLSON@uci.edu> <460339AA.4030505@ronadam.com> Message-ID: <46045905.9030907@ronadam.com> Jim Jewett wrote: > On 3/22/07, Ron Adam wrote: >> Josiah Carlson wrote: >> > Ron Adam wrote: >> >> Josiah Carlson wrote: > >> >>> On the upside, this is possible (change the PyObject_HEAD macro, >> >>> PyINCREF, PyDECREF, remove the GIL), but the amount of work > > The zillions of details are the same details you have to consider when > writing a concurrent database. Having python store its objects > externally and retrieve them when needed (vs allocating memory and > pointers) would be a huge one-time loss, but it might eventually be > worthwhile. > > PyPy calls that "external database" an "object_space", so the language > support is already there, and still will be when the hardware makes it > worthwhile. > > >> (1) >> xyz = vector(r) >> forall coords as c: # parallel modify coords 'inplace' with body >> c = rotate(c, xyz) >> >> * 'with' syntax form because 'c' does not outlive the body. > > > Why not just keep using > > xyz = vector(r) > rotate(xyz) > > and let the compiler take care of it? Just how much should the 'python' compiler do? And what about dynamic modeling where things change according to input? (Not just thinking of graphics.) > At most, we would want a way of marking objects as "read-only" or > callables as "instant", so that the compiler would know it doesn't > have to worry about the definition of "len" changing mid-stream. (Ron > suggests something similar at the end of the message, and Talin's > Transactional metaclass is related.) Would it be possible to have a container for which its contents are read only? (as a side effect of being in the container) Then individual items would not need their own read only attributes. A few other thoughts... Possibly ... when a module is first imported or run nothing needs to be marked read only. Then when execution falls off the end, *everything* existing at that point is marked read only. If the module is run as a script, then a main function is executed. These could be optional settings that could be turned on... from __far_future__ import multi-processing __multi-processing__ = True __main__ = "my_main" So then there would be an initiation pass followed by an execution phase. In the initiation phase it's pretty much just the way things are now except you can't use threads. In the execution phase you can use threads, but you can't change anything created in the initiation phase. (Other controls would still be needed of course.) 
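A rough sketch of the read-only part of the idea (the class below is invented for illustration, and real enforcement would have to live in the interpreter, since a pure Python wrapper can always be subverted):

class FrozenNamespace(object):
    # Read-only view of an object's attributes after the initiation phase.
    def __init__(self, obj):
        object.__setattr__(self, '_obj', obj)
    def __getattr__(self, name):
        # reads are simply delegated to the wrapped object
        return getattr(object.__getattribute__(self, '_obj'), name)
    def __setattr__(self, name, value):
        # any rebinding attempt fails once the namespace is frozen
        raise TypeError("namespace is read-only in the execution phase")

import math
frozen = FrozenNamespace(math)
print frozen.pi          # reading still works
frozen.pi = 3            # raises TypeError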
> >> * Generators with the semantics of calculating first and waiting on 'yield' > >> for 'next', so the value is immediately returned. (depends on CPU load) > > On its own, this just makes response time worse. Yes, you really wouldn't use these for very simple counters. For more complex things, the calling overhead becomes a much smaller percentage of the total, and it would have a bigger effect on smoothing out response time rather than hurting it. > That said, > generators are also callables, and the compiler might do a better job > if it knew that it wouldn't have to worry about external > redefinitions. > > There may also be some value in some sort of "idletasks" abstraction > that says "hey, go ahead and precompute this, but only if you've got > nothing better to do." Having a way to set process priority may be good. (or bad) > Garbage collection could certainly benefit > from this, and translating (or duplicating) string representations > into multiple encodings. > >> In the video he also talked about avoiding locks. I was thinking a more >> limited function object (for concurrent uses only) might be used that >> has: > >> * no global keyword >> * only access to external immutable objects > > Or at least, it doesn't mutate them itself, and it doesn't promise to > use newer versions that get created after the call begins. The difficulty is that even a method call on a global (or parent scope) object can result in it being mutated. So you are either back to using locks and/or transactions on everything. >> * no closures. > > This level of information should be useful. > > But in practice, most of my functions already meet this definition. Mine too. Ron > (In theory, they access builtins that could be replaced, etc.) I > wouldn't want to mark them all by hand. I'm still inclined to trust > the compiler, and just accept that it may eventually be the PyPy > translator rather than the CPython interpreter. > > -jJ From talin at acm.org Sun Mar 25 09:59:05 2007 From: talin at acm.org (Talin) Date: Sun, 25 Mar 2007 00:59:05 -0700 Subject: [Python-ideas] Python and Concurrency In-Reply-To: <4602354F.1010003@acm.org> References: <4602354F.1010003@acm.org> Message-ID: <46062BC9.2020208@acm.org> Thinking more about this, it seems to me that discussions of syntax for doing parallel operations and nifty classes for synchronization are a bit premature. The real question, it seems to me, is how to get Python to operate concurrently at all. Python's GIL cannot be removed without going through and fixing a thousand different places where different threads might access shared variables within the interpreter. Clearly, that's not the kind of work that is going to be done in a month or less. It might be useful to decompose that task further into some digestible chunks. One of these chunks is garbage collection. It seems to me that reference counting as it exists today is a serious hindrance to concurrency, because it requires writing to an object each time you create a new reference. Instead, it should be possible to pass a reference to an object between threads, without actually modifying the object unless one of the threads actually changes an attribute. There are a number of papers on concurrent garbage collection out there on the web that might serve as a useful starting point. Of course, the .Net CLR and Java VM already have collectors of this type, so maybe those versions of Python already get this for free. 
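You can actually watch that write traffic from Python itself; every new binding below writes to the count field in the object's header, even though the object is never "changed" in any user-visible way:

import sys

x = object()
print sys.getrefcount(x)   # e.g. 2: the name 'x' plus the call's argument
y = x                      # a plain assignment increments x's refcount
print sys.getrefcount(x)   # now one higher - a write to shared memory

On one core that write is nearly free; across many cores it is exactly the kind of contended cache-line traffic that kills scaling.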
I also wonder what other things that the GIL is protecting can be broken out as large, coherent chunks. -- Talin From aahz at pythoncraft.com Sun Mar 25 15:52:42 2007 From: aahz at pythoncraft.com (Aahz) Date: Sun, 25 Mar 2007 06:52:42 -0700 Subject: [Python-ideas] Python and Concurrency In-Reply-To: <46062BC9.2020208@acm.org> References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org> Message-ID: <20070325135242.GA24610@panix.com> On Sun, Mar 25, 2007, Talin wrote: > > Thinking more about this, it seems to me that discussions of syntax for > doing parallel operations and nifty classes for synchronization are a > bit premature. The real question, it seems to me, is how to get Python > to operate concurrently at all. Maybe that's what it seems to you; to others of us who have been looking at this problem for a while, the real question is how to get a better multi-process control and IPC library in Python, preferably one that is cross-platform. You can investigate that right now, and you don't even need to discuss it with other people. (Despite my oft-stated fondness for threading, I do recognize the problems with threading, and if there were a way to make processes as simple as threads from a programming standpoint, I'd be much more willing to push processes.) -- Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/ "Typing is cheap. Thinking is expensive." --Roy Smith From jcarlson at uci.edu Sun Mar 25 18:34:58 2007 From: jcarlson at uci.edu (Josiah Carlson) Date: Sun, 25 Mar 2007 09:34:58 -0700 Subject: [Python-ideas] Python and Concurrency In-Reply-To: <20070325135242.GA24610@panix.com> References: <46062BC9.2020208@acm.org> <20070325135242.GA24610@panix.com> Message-ID: <20070325084247.FCED.JCARLSON@uci.edu> Aahz wrote: > > On Sun, Mar 25, 2007, Talin wrote: > > > > Thinking more about this, it seems to me that discussions of syntax for > > doing parallel operations and nifty classes for synchronization are a > > bit premature. The real question, it seems to me, is how to get Python > > to operate concurrently at all. > > Maybe that's what it seems to you; to others of us who have been looking > at this problem for a while, the real question is how to get a better > multi-process control and IPC library in Python, preferably one that is > cross-platform. You can investigate that right now, and you don't even > need to discuss it with other people. > > (Despite my oft-stated fondness for threading, I do recognize the > problems with threading, and if there were a way to make processes as > simple as threads from a programming standpoint, I'd be much more > willing to push processes.) I generally agree. XML-RPC works pretty well in this regard, though as I talked about a couple months ago, its transport format encoding and decoding result in overhead during calling that leaves something to be desired. - Josiah From talin at acm.org Sun Mar 25 18:40:43 2007 From: talin at acm.org (Talin) Date: Sun, 25 Mar 2007 09:40:43 -0700 Subject: [Python-ideas] Python and Concurrency In-Reply-To: <20070325135242.GA24610@panix.com> References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org> <20070325135242.GA24610@panix.com> Message-ID: <4606A60B.4040402@acm.org> Aahz wrote: > On Sun, Mar 25, 2007, Talin wrote: >> Thinking more about this, it seems to me that discussions of syntax for >> doing parallel operations and nifty classes for synchronization are a >> bit premature. The real question, it seems to me, is how to get Python >> to operate concurrently at all. 
> > Maybe that's what it seems to you; to others of us who have been looking > at this problem for a while, the real question is how to get a better > multi-process control and IPC library in Python, preferably one that is > cross-platform. You can investigate that right now, and you don't even > need to discuss it with other people. If you mean some sort of inter-process messaging system, there are a number that already exist; I'd look at IPython and py-globus for starters. My feeling is that while such an approach is vastly easier from the standpoint of Python developers, and may be easier from the standpoint of a typical Python programmer, it doesn't actually solve the problem that I'm attempting to address, which is figuring out how to write client-side software that dynamically scales to the number of processors on the system. My view is that while the number of algorithms that we have that can be efficiently parallelized in a fine-grained threading environment is small (compared to the total number of strictly sequential algorithms), the number of algorithms that can be adapted to heavy-weight, coarse-grained processes is much smaller still. For example, it is easy to imagine a quicksort routine where different threads are responsible for sorting various sub-partitions of the array. If this were to be done via processes, the overhead of marshalling and unmarshalling the array elements would completely swamp the benefits of making it concurrent. -- Talin From rrr at ronadam.com Sun Mar 25 19:13:34 2007 From: rrr at ronadam.com (Ron Adam) Date: Sun, 25 Mar 2007 12:13:34 -0500 Subject: [Python-ideas] Python and Concurrency In-Reply-To: <46062BC9.2020208@acm.org> References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org> Message-ID: <4606ADBE.1030406@ronadam.com> Talin wrote: > Thinking more about this, it seems to me that discussions of syntax for > doing parallel operations and nifty classes for synchronization are a > bit premature. Yes, but it may help to have a few possible "end user" examples in mind while you work on the problem. The point isn't the exact syntax, but the uses that they imply and what other requirements these examples need in order to work. >The real question, it seems to me, is how to get Python > to operate concurrently at all. Yes, and I agree that it would be better to split the problem into smaller chunks. Maybe start by finding some "easier" to do simple cases first. > Python's GIL cannot be removed without going through and fixing a > thousand different places where different threads might access shared > variables within the interpreter. Clearly, that's not the kind of work > that is going to be done in a month or less. Here is an older discussion on removing the GIL and multiprocessing vs threading. Maybe it will be of help? http://mail.python.org/pipermail/python-dev/2005-September/056423.html > It might be useful to decompose that task further into some digestible > chunks. One of these chunks is garbage collection. It seems to me that > reference counting as it exists today would is a serious hindrance to > concurrency, because it requires writing to an object each time you > create a new reference. Instead, it should be possible to pass a > reference to an object between threads, without actually modifying the > object unless one of the threads actually changes an attribute. I'm not familiar with the details of python's garbage collecting yet. I was thinking the problem may be simplified by having a single mutable container-cell object. 
But that wouldn't be enough, because in Python it isn't just a problem of a mutable object changing, but one of a name's reference to an object being changed. Could an explicit name space for sharing in threads and processes help?

> There are a number of papers on concurrent garbage collection out there
> on the web that might serve as a useful starting point. Of course, the
> .Net CLR and Java VM already have collectors of this type, so maybe
> those versions of Python already get this for free.
>
> I also wonder what other things that the GIL is protecting can be broken
> out as large, coherent chunks.

I have no idea, ...

Ron

From aahz at pythoncraft.com  Sun Mar 25 19:39:05 2007
From: aahz at pythoncraft.com (Aahz)
Date: Sun, 25 Mar 2007 10:39:05 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <4606A60B.4040402@acm.org>
References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org> <20070325135242.GA24610@panix.com> <4606A60B.4040402@acm.org>
Message-ID: <20070325173905.GA15655@panix.com>

On Sun, Mar 25, 2007, Talin wrote:
> Aahz wrote:
>> On Sun, Mar 25, 2007, Talin wrote:
>>>
>>> Thinking more about this, it seems to me that discussions of syntax for
>>> doing parallel operations and nifty classes for synchronization are a
>>> bit premature. The real question, it seems to me, is how to get Python
>>> to operate concurrently at all.
>>
>> Maybe that's what it seems to you; to others of us who have been looking
>> at this problem for a while, the real question is how to get a better
>> multi-process control and IPC library in Python, preferably one that is
>> cross-platform. You can investigate that right now, and you don't even
>> need to discuss it with other people.
>
> If you mean some sort of inter-process messaging system, there are a
> number that already exist; I'd look at IPython and py-globus for starters.
>
> My feeling is that while such an approach is vastly easier from the
> standpoint of Python developers, and may be easier from the standpoint
> of a typical Python programmer, it doesn't actually solve the problem
> that I'm attempting to address, which is figuring out how to write
> client-side software that dynamically scales to the number of processors
> on the system.

How not? Keep in mind that if this kind of library becomes part of the Python Standard Library, the standard library can be written to use this multi-process library.

> My view is that while the number of algorithms that we have that can be
> efficiently parallelized in a fine-grained threading environment is
> small (compared to the total number of strictly sequential algorithms),
> the number of algorithms that can be adapted to heavy-weight,
> coarse-grained processes is much smaller still.

Maybe. I'm not convinced, but see below.

> For example, it is easy to imagine a quicksort routine where different
> threads are responsible for sorting various sub-partitions of the array.
> If this were to be done via processes, the overhead of marshalling and
> unmarshalling the array elements would completely swamp the benefits of
> making it concurrent.

The problem, though, is that Threading Doesn't Work for what you're talking about. SMP threading doesn't really scale when you're talking about hundreds of CPUs. This kind of problem really is better handled at the library level: if it's worth splitting, the sort algorithm can figure out how to do that. (Whether it's threading or processes really doesn't matter; the sort algorithm just calls an underlying library to manage it.
For example, it could put a lock around the list and have the C library release the GIL to do its work. As long as the overall sort() call was synchronous, it should work.) Generally speaking, it won't be worth splitting for less than a million elements...
--
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"Typing is cheap.  Thinking is expensive."  --Roy Smith

From jcarlson at uci.edu  Sun Mar 25 20:03:06 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 25 Mar 2007 11:03:06 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <4606A60B.4040402@acm.org>
References: <20070325135242.GA24610@panix.com> <4606A60B.4040402@acm.org>
Message-ID: <20070325101901.FCF0.JCARLSON@uci.edu>

Talin wrote:
>
> Aahz wrote:
> > On Sun, Mar 25, 2007, Talin wrote:
> >> Thinking more about this, it seems to me that discussions of syntax for
> >> doing parallel operations and nifty classes for synchronization are a
> >> bit premature. The real question, it seems to me, is how to get Python
> >> to operate concurrently at all.
> >
> > Maybe that's what it seems to you; to others of us who have been looking
> > at this problem for a while, the real question is how to get a better
> > multi-process control and IPC library in Python, preferably one that is
> > cross-platform. You can investigate that right now, and you don't even
> > need to discuss it with other people.
>
> If you mean some sort of inter-process messaging system, there are a
> number that already exist; I'd look at IPython and py-globus for starters.
>
> My feeling is that while such an approach is vastly easier from the
> standpoint of Python developers, and may be easier from the standpoint
> of a typical Python programmer, it doesn't actually solve the problem
> that I'm attempting to address, which is figuring out how to write
> client-side software that dynamically scales to the number of processors
> on the system.

At some point either the user or the system (Python) needs to figure out that splitting up a sequential task into multiple parallel tasks is productive. On the system end of things, that isn't easy. How much money and time has been poured into C/C++ compiler development, and about all they can auto-parallelize (via vector operations) are things like:

    for (i=0;i<n;i++) a[i] = b[i] + c[i];

> My view is that while the number of algorithms that we have that can be
> efficiently parallelized in a fine-grained threading environment is
> small (compared to the total number of strictly sequential algorithms),
> the number of algorithms that can be adapted to heavy-weight,
> coarse-grained processes is much smaller still.

But that algorithm wouldn't be used for sorting data on multiple processors. A variant of mergesort would be used (distribute blocks equally to processors, sort them individually, merge the results - in parallel).

But again, all of this relies on two things:

1. a method for executing multiple streams of instructions simultaneously
2. a method of communication between the streams of instructions

Without significant work, #1 isn't possible using threads in Python. It is trivial using processes. Without work, #2 isn't "fast" using processes in Python. It is trivial using threads.
But here's the thing: with work, #2 can be made fast. Using unix domain sockets (on linux, 3.4 ghz P4 Xeons, DDR2-PC4200 memory (you can get 50% faster memory nowadays)), I've been able to push 400 megs/second between processes. Maybe anonymous or named pipes, or perhaps a shared mmap with some sort of synchronization would allow for IPC to be cross platform and just about as fast.

The reason that I (and perhaps others) have been pushing for IPC is that it is easier to solve than the removal of Python's threading limitations, with many of the same benefits, and even a few extra (being able to distribute processes across different machines).

- Josiah

From r.m.oudkerk at googlemail.com  Sun Mar 25 23:39:01 2007
From: r.m.oudkerk at googlemail.com (Richard Oudkerk)
Date: Sun, 25 Mar 2007 22:39:01 +0100
Subject: [Python-ideas] Python and Concurrency
Message-ID:

Aahz wrote:
> Maybe that's what it seems to you; to others of us who have been looking
> at this problem for a while, the real question is how to get a better
> multi-process control and IPC library in Python, preferably one that is
> cross-platform. You can investigate that right now, and you don't even
> need to discuss it with other people.
> (Despite my oft-stated fondness for threading, I do recognize the
> problems with threading, and if there were a way to make processes as
> simple as threads from a programming standpoint, I'd be much more
> willing to push processes.)

The processing package at http://cheeseshop.python.org/pypi/processing is multi-platform and mostly follows the API of threading. It also allows use of 'shared objects' which live in a manager process.

For example, the following code is almost identical to the equivalent written with threads:

    from processing import Process, Manager

    def f(q):
        for i in range(10):
            q.put(i*i)
        q.put('STOP')

    if __name__ == '__main__':
        manager = Manager()
        queue = manager.Queue(maxsize=3)

        p = Process(target=f, args=[queue])
        p.start()

        result = None
        while result != 'STOP':
            result = queue.get()
            print result

        p.join()

Josiah wrote:
> Without work, #2 isn't "fast" using processes in Python. It is trivial
> using threads. But here's the thing: with work, #2 can be made fast.
> Using unix domain sockets (on linux, 3.4 ghz P4 Xeons, DDR2-PC4200
> memory (you can get 50% faster memory nowadays)), I've been able to push
> 400 megs/second between processes. Maybe anonymous or named pipes, or
> perhaps a shared mmap with some sort of synchronization would allow for
> IPC to be cross platform and just about as fast.

The IPC uses sockets or (on Windows) named pipes. Linux and Windows are roughly equal in speed. On a P4 2.5Ghz laptop one can retrieve an element from a shared dict about 20,000 times/sec. Not sure if that qualifies as fast enough.

Richard

From jcarlson at uci.edu  Mon Mar 26 01:15:51 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Sun, 25 Mar 2007 16:15:51 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To:
References:
Message-ID: <20070325155417.FCF3.JCARLSON@uci.edu>

"Richard Oudkerk" wrote:
> Josiah wrote:
> > Without work, #2 isn't "fast" using processes in Python. It is trivial
> > using threads. But here's the thing: with work, #2 can be made fast.
> > Using unix domain sockets (on linux, 3.4 ghz P4 Xeons, DDR2-PC4200
> > memory (you can get 50% faster memory nowadays)), I've been able to push
> > 400 megs/second between processes.
> > Maybe anonymous or named pipes, or
> > perhaps a shared mmap with some sort of synchronization would allow for
> > IPC to be cross platform and just about as fast.
>
> The IPC uses sockets or (on Windows) named pipes. Linux and Windows
> are roughly equal in speed. On a P4 2.5Ghz laptop one can retrieve an
> element from a shared dict about 20,000 times/sec. Not sure if that
> qualifies as fast enough.

Depends on what the element is, but I suspect it isn't fast enough. Fairly large native dictionaries seem to run on the order of 1.3 million fetches/second on my 2.8 ghz machine:

    >>> import time
    >>> d = dict.fromkeys(xrange(65536))
    >>> if 1:
    ...     t = time.time()
    ...     for j in xrange(1000000):
    ...         _ = d[j&65535]
    ...     print 1000000/(time.time()-t)
    ...
    1305482.97346
    >>>

But really, transferring little bits of data back and forth isn't what is of my concern in terms of speed. My real concern is transferring nontrivial blocks of data; I usually benchmark blocks of sizes: 1k, 4k, 16k, 64k, 256k, 1M, 4M, 16M, and 64M. Those are usually pretty good to discover the "sweet spot" for a particular implementation, and also allow a person to discover whether or not their system can be used for nontrivial processor loads.

- Josiah

From gsakkis at rutgers.edu  Mon Mar 26 02:07:13 2007
From: gsakkis at rutgers.edu (George Sakkis)
Date: Sun, 25 Mar 2007 20:07:13 -0400
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <20070325155417.FCF3.JCARLSON@uci.edu>
References: <20070325155417.FCF3.JCARLSON@uci.edu>
Message-ID: <91ad5bf80703251707s220d4346u3e81737ea8f1a162@mail.gmail.com>

On 3/25/07, Josiah Carlson wrote:
> But really, transferring little bits of data back and forth isn't what
> is of my concern in terms of speed. My real concern is transferring
> nontrivial blocks of data; I usually benchmark blocks of sizes: 1k, 4k,
> 16k, 64k, 256k, 1M, 4M, 16M, and 64M. Those are usually pretty good to
> discover the "sweet spot" for a particular implementation, and also
> allow a person to discover whether or not their system can be used for
> nontrivial processor loads.

Not directly relevant to the discussion, but I recently attended a talk by the main developer of STXXL (http://stxxl.sourceforge.net/), an STL-compatible library for handling huge volumes of data. The keys to efficient processing are support for parallel disks, explicit overlapping between I/O and computation, and I/O pipelining. More details are available at http://i10www.ira.uka.de/dementiev/stxxl/report/.

George

From r.m.oudkerk at googlemail.com  Tue Mar 27 01:24:37 2007
From: r.m.oudkerk at googlemail.com (Richard Oudkerk)
Date: Tue, 27 Mar 2007 00:24:37 +0100
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <20070325155417.FCF3.JCARLSON@uci.edu>
References: <20070325155417.FCF3.JCARLSON@uci.edu>
Message-ID:

On 26/03/07, Josiah Carlson wrote:
> But really, transferring little bits of data back and forth isn't what
> is of my concern in terms of speed. My real concern is transferring
> nontrivial blocks of data; I usually benchmark blocks of sizes: 1k, 4k,
> 16k, 64k, 256k, 1M, 4M, 16M, and 64M. Those are usually pretty good to
> discover the "sweet spot" for a particular implementation, and also
> allow a person to discover whether or not their system can be used for
> nontrivial processor loads.

The "20,000 fetches/sec" was just for retrieving a "small" object (an integer), so it only really reflects the server overhead. (Sending integer objects directly between processes is maybe 6 times faster.)
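The loop being timed is essentially just this (only a sketch, not the actual benchmark code, and whether the shared dict constructor is spelled exactly 'manager.dict()' as written here is an assumption):

    from processing import Manager
    import time

    def fetch_rate(value, n=1000):
        manager = Manager()
        d = manager.dict()      # assumed: a 'shared object' living in the manager process
        d['x'] = value
        t = time.time()
        for i in xrange(n):
            obj = d['x']        # each fetch is a round trip to the manager
        return n / (time.time() - t)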
Fetching string objects of particular sizes from a shared dict gives the following results on the same computer:

    string size    fetches/sec    throughput
    -----------    -----------    ----------
    1 kb           15,000         15 Mb/s
    4 kb           13,000         52 Mb/s
    16 kb          8,500          130 Mb/s
    64 kb          1,800          110 Mb/s
    256 kb         196            49 Mb/s
    1 Mb           50             50 Mb/s
    4 Mb           13             52 Mb/s
    16 Mb          3.2            51 Mb/s
    64 Mb          0.84           54 Mb/s

From jcarlson at uci.edu  Tue Mar 27 06:09:04 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Mon, 26 Mar 2007 21:09:04 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To:
References: <20070325155417.FCF3.JCARLSON@uci.edu>
Message-ID: <20070326210239.FCFF.JCARLSON@uci.edu>

"Richard Oudkerk" wrote:
> On 26/03/07, Josiah Carlson wrote:
> > But really, transferring little bits of data back and forth isn't what
> > is of my concern in terms of speed. My real concern is transferring
> > nontrivial blocks of data; I usually benchmark blocks of sizes: 1k, 4k,
> > 16k, 64k, 256k, 1M, 4M, 16M, and 64M. Those are usually pretty good to
> > discover the "sweet spot" for a particular implementation, and also
> > allow a person to discover whether or not their system can be used for
> > nontrivial processor loads.
>
> The "20,000 fetches/sec" was just for retrieving a "small"
> object (an integer), so it only really reflects the server
> overhead. (Sending integer objects directly between processes
> is maybe 6 times faster.)

That's a positive sign.

> Fetching string objects of particular sizes from a shared dict gives
> the following results on the same computer:

Those numbers look pretty good. Would I be correct in assuming that there is a speedup sending blocks directly between processes? (though perhaps not the 6x that integer sending gains)

I will definitely have to dig deeper; this could be the library that we've been looking for.

- Josiah

From lists at cheimes.de  Tue Mar 27 20:52:57 2007
From: lists at cheimes.de (Christian Heimes)
Date: Tue, 27 Mar 2007 20:52:57 +0200
Subject: [Python-ideas] Replace time_t by Py_time_t
Message-ID:

In the thread "datetime module enhancements", Guido and others said that it is unpythonic to limit timestamps (seconds since the Epoch) to a signed 32-bit int.

http://permalink.gmane.org/gmane.comp.python.devel/86750

I've made a patch that introduces a new type Py_time_t as a first step toward increasing the size of time stamps. For now it's just an alias for time_t:

    typedef time_t Py_time_t;

I'm proposing to change time_t in two steps:

Python 2.6: Replace every occurrence of time_t by Py_time_t and give third-party authors time to change their software.

Python 2.7 / 3000: Change Py_time_t to a signed 64-bit int on all platforms and provide the necessary workaround for platforms with a 32-bit time_t.

Patch: http://python.org/sf/1689402

Christian

From r.m.oudkerk at googlemail.com  Wed Mar 28 02:36:32 2007
From: r.m.oudkerk at googlemail.com (Richard Oudkerk)
Date: Wed, 28 Mar 2007 01:36:32 +0100
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <20070326210239.FCFF.JCARLSON@uci.edu>
References: <20070325155417.FCF3.JCARLSON@uci.edu> <20070326210239.FCFF.JCARLSON@uci.edu>
Message-ID:

On 27/03/07, Josiah Carlson wrote:
> Those numbers look pretty good. Would I be correct in assuming that
> there is a speedup sending blocks directly between processes? (though
> perhaps not the 6x that integer sending gains)

Yes, sending blocks directly between processes is over 3 times faster for 1k blocks, and twice as fast for 4k blocks, but after that it makes little difference.
(This is using the 'processing.connection' sub-package which is partly written in C.)

Of course since these blocks are string data you can avoid the pickle translation which makes things get faster still: the peak bandwidth I get is 40,000 x 16k blocks / sec = 630 Mb/s.

PS. It would be nice if the standard library had support for sending message oriented data over a connection so that you could just do 'recv()' and 'send()' without worrying about whether the whole message was successfully read/written. You can use 'socket.makefile()' for line oriented text messages but not for binary data.

From jcarlson at uci.edu  Wed Mar 28 03:14:13 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Tue, 27 Mar 2007 18:14:13 -0700
Subject: [Python-ideas] Python and Concurrency
In-Reply-To:
References: <20070326210239.FCFF.JCARLSON@uci.edu>
Message-ID: <20070327180152.FD0D.JCARLSON@uci.edu>

"Richard Oudkerk" wrote:
> On 27/03/07, Josiah Carlson wrote:
> > Those numbers look pretty good. Would I be correct in assuming that
> > there is a speedup sending blocks directly between processes? (though
> > perhaps not the 6x that integer sending gains)
>
> Yes, sending blocks directly between processes is over 3 times faster
> for 1k blocks, and twice as fast for 4k blocks, but after that it makes
> little difference. (This is using the 'processing.connection'
> sub-package which is partly written in C.)

I'm surprised that larger objects see little gain from the removal of an encoding/decoding step and transfer.

> Of course since these blocks are string data you can avoid the pickle
> translation which makes things get faster still: the peak bandwidth I
> get is 40,000 x 16k blocks / sec = 630 Mb/s.

Very nice.

> PS. It would be nice if the standard library had support for sending
> message oriented data over a connection so that you could just do
> 'recv()' and 'send()' without worrying about whether the whole message
> was successfully read/written. You can use 'socket.makefile()' for
> line oriented text messages but not for binary data.

Well, there's also the problem that sockets, files, and pipes behave differently on Windows.

If one is only concerned about sockets, there are various lightly defined protocols that can be simply implemented on top of asyncore/asynchat, among them is the sending of a 32 bit length field in network-endian order, followed by the data to be sent immediately afterwards. Taking some methods and tossing them into a synchronous sockets package wouldn't be terribly difficult (I've done a variant of this for a commercial project). Doing this generally may not find support, as my idea of sharing encoding/decoding/internal state transition/etc in sync/async servers was shot down at least a year ago.

- Josiah

From r.m.oudkerk at googlemail.com  Thu Mar 29 01:10:21 2007
From: r.m.oudkerk at googlemail.com (Richard Oudkerk)
Date: Thu, 29 Mar 2007 00:10:21 +0100
Subject: [Python-ideas] Python and Concurrency
In-Reply-To: <20070327180152.FD0D.JCARLSON@uci.edu>
References: <20070326210239.FCFF.JCARLSON@uci.edu> <20070327180152.FD0D.JCARLSON@uci.edu>
Message-ID:

On 28/03/07, Josiah Carlson wrote:
> Well, there's also the problem that sockets, files, and pipes behave
> differently on Windows.

Windows named pipes have a native message mode.
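For sockets you have to add the message framing yourself. A minimal sketch of the 32-bit length-prefix scheme (untested, blocking sockets assumed, and the function names here are just made up):

    import struct

    def send_message(sock, data):
        # 32-bit length prefix in network-endian order, then the payload
        sock.sendall(struct.pack('!I', len(data)) + data)

    def recv_message(sock):
        (length,) = struct.unpack('!I', _recv_exact(sock, 4))
        return _recv_exact(sock, length)

    def _recv_exact(sock, n):
        # keep reading until exactly n bytes have arrived
        chunks = []
        while n > 0:
            chunk = sock.recv(n)
            if not chunk:
                raise EOFError('connection closed mid-message')
            chunks.append(chunk)
            n -= len(chunk)
        return ''.join(chunks)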
> If one is only concerned about sockets, there are various lightly
> defined protocols that can be simply implemented on top of
> asyncore/asynchat, among them is the sending of a 32 bit length field in
> network-endian order, followed by the data to be sent immediately
> afterwards.

That's exactly what I was doing.

From greg.ewing at canterbury.ac.nz  Thu Mar 29 04:27:10 2007
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 29 Mar 2007 14:27:10 +1200
Subject: [Python-ideas] Thread IPC idea: Quipe? Sockqueue? (Re: Python and Concurrency)
In-Reply-To: <46062BC9.2020208@acm.org>
References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org>
Message-ID: <460B23FE.2010309@canterbury.ac.nz>

I was thinking about thread killing, and why we think it's okay to kill OS processes but not threads.

Killing an OS process is unlikely to cause other processes with which it's communicating to hang, since closing one end of a pipe or socket causes anything on the other end to get EOF on reading or a signal or error on writing.

But the way threads usually communicate, using locks and queues, means that it's easy for a thread to get hung up waiting for something that another thread is supposed to do, but won't, because it's been killed.

So I'm wondering if we want something for inter-thread communication that works something like a cross between a queue and a pipe. It knows what threads are connected to it, and if all threads on one end exit or get killed, threads on the other end find out about it.

We could call it a Quipe or Sockqueue or Quocket (or if we want to be boring, a Channel).

This would probably have to be hooked into the threads implementation at a low level, since it would need to be able to detect the death of a thread by any means, without relying on any cooperation from the user's code.

--
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | Carpe post meridiem!                 |
Christchurch, New Zealand          | (I'm not a morning person.)          |
greg.ewing at canterbury.ac.nz        +--------------------------------------+

From jimjjewett at gmail.com  Thu Mar 29 15:29:16 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Thu, 29 Mar 2007 09:29:16 -0400
Subject: [Python-ideas] Thread IPC idea: Quipe? Sockqueue? (Re: Python and Concurrency)
In-Reply-To: <460B23FE.2010309@canterbury.ac.nz>
References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org> <460B23FE.2010309@canterbury.ac.nz>
Message-ID:

On 3/28/07, Greg Ewing wrote:
> I was thinking about thread killing, and why we
> think it's okay to kill OS processes but not
> threads.

[Suggestion for a Queue variant that knows when one end is dead]

I think the bigger problem is when threads don't restrict themselves to queues, but just use the same memory directly. If a thread dies in the middle of an "atomic" action, other threads will see corrupted memory. If a process dies then, nobody else would be able to see the memory anyhow.

What we really need is a Task object that treats shared memory (perhaps with a small list of specified exceptions) as immutable.

If you're willing to rely on style guidelines, then you can already get this today.

If you want safety, and efficiency ... that may be harder to do as an addon.

-jJ

From jcarlson at uci.edu  Thu Mar 29 18:32:02 2007
From: jcarlson at uci.edu (Josiah Carlson)
Date: Thu, 29 Mar 2007 09:32:02 -0700
Subject: [Python-ideas] Thread IPC idea: Quipe? Sockqueue?
 (Re: Python and Concurrency)
In-Reply-To: <460B23FE.2010309@canterbury.ac.nz>
References: <46062BC9.2020208@acm.org> <460B23FE.2010309@canterbury.ac.nz>
Message-ID: <20070329092625.FD40.JCARLSON@uci.edu>

Greg Ewing wrote:
> I was thinking about thread killing, and why we
> think it's okay to kill OS processes but not
> threads.

I don't know how useful the feature would be (I've not had this particular issue before), but one implementation strategy would be to use thread-local storage and weak references to the incoming queues of other threads. Getting queues into the proper thread-local storage between two threads is a little more tricky when you want it done automatically, but a couple of lines of boilerplate and that's fixed.

- Josiah

From rrr at ronadam.com  Thu Mar 29 21:07:21 2007
From: rrr at ronadam.com (Ron Adam)
Date: Thu, 29 Mar 2007 14:07:21 -0500
Subject: [Python-ideas] Thread IPC idea: Quipe? Sockqueue? (Re: Python and Concurrency)
In-Reply-To:
References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org> <460B23FE.2010309@canterbury.ac.nz>
Message-ID: <460C0E69.9060007@ronadam.com>

Jim Jewett wrote:
> On 3/28/07, Greg Ewing wrote:
>> I was thinking about thread killing, and why we
>> think it's okay to kill OS processes but not
>> threads.
>
> [Suggestion for a Queue variant that knows when one end is dead]
>
> I think the bigger problem is when threads don't restrict themselves
> to queues, but just use the same memory directly. If a thread dies in
> the middle of an "atomic" action, other threads will see corrupted
> memory. If a process dies then, nobody else would be able to see the
> memory anyhow.
>
> What we really need is a Task object that treats shared memory
> (perhaps with a small list of specified exceptions) as immutable.

I think a set of all of these as tools would be good. They are all different parts of the same elephant. And I don't see why it needs to be a single unified thing.

* A 'better' task object for easily creating tasks.
    + We have a threading object now. (Needs improving.)

* A message mechanism (Que-pipe) for getting the status of a task.
    + In and out message queues for communicating with a thread.
    + A way to wait on task events (messages) nicely.
    + A way for exceptions to propagate out of task objects.

* Shared memory -
    + Prevent names from being rebound
    + Prevent objects from being altered

(It isn't just about objects being immutable, but also about names not being rebound to other objects. Unless you want to pass object references for every object to every task? Or you trust that any imported code will play nice.)

Would the following be a good summary of the issues? (Where the terms used mean:)

    frozen: object can't be altered while frozen
    locked: name can't be rebound to another object

Threading issues: (In order of difficulty)

1. Pass immutable objects back and forth.
    + Works now.

2. Share immutable objects by-ref.
    + Works now.

3. Pass mutable "deep" copies back and forth.
    ? Works now. (but not for all objects?)

4. Pass frozen mutable objects.
    - Needs freezable/unfreezable mutable objects.
      (Not the same as making an immutable copy.)

5. Share immutable object in a shared name space.
    - Needs name locking.

6. Share "frozen" mutable objects in shared name space.
    - Needs name locking
    - Needs freezable mutable objects

7. Pass mutable objects.
    - Has conflicts with shared access.

8. Share mutable object by-ref.
    - Has conflicts. (same as #7.)

9. Share mutable object in shared name space.
    - Needs name locking.
    - Has conflicts.
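To make the 'frozen' idea a little more concrete, here is a rough, untested sketch of the kind of wrapper I have in mind (it only guards attribute rebinding; a real version would also have to guard mutating methods):

    class Frozen(object):
        """Sketch: reject mutation of the wrapped object without
        copying it. Illustrative only."""

        def __init__(self, obj):
            object.__setattr__(self, '_obj', obj)
            object.__setattr__(self, '_frozen', False)

        def freeze(self):
            object.__setattr__(self, '_frozen', True)

        def unfreeze(self):
            object.__setattr__(self, '_frozen', False)

        def __getattr__(self, name):
            # reads are always allowed
            return getattr(object.__getattribute__(self, '_obj'), name)

        def __setattr__(self, name, value):
            if object.__getattribute__(self, '_frozen'):
                raise TypeError("object is frozen")
            setattr(object.__getattribute__(self, '_obj'), name, value)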
If we can make the first 6 of these work, that may be enough. 7, 8, and 9 have to do with race conditions and other simultaneous data-access issues.

Name locking might work like this: Doing 'lock <name>' and 'unlock <name>' could just move the name to and from a locked name space in the same scope. Attempting to rebind a name while it's in a locked name space would raise an exception. The point is, rather than attaching a lock to each individual name, it may be much easier and faster, and save memory, to have a new name space for this. You could also pass a locked-name-space reference to a task all at once, like we pass locals or globals now.

It doesn't identify who locked what, so unlocking a name used by a thread would be in the "if it hurts, don't do that" category. If locking or unlocking a name in outside scopes is disallowed, then knowing who locked what won't be a problem.

Freezing would be some way of preventing an object from being changed. It isn't concerned with the name it is referenced with. And it should not require copying the object. Part of that may be made easier by locking names.(?) Frozen objects may be useful in other ways besides threading.(?) Names locked to immutable objects act like constants, so they may have other uses as well.

Cheers,
Ron

> If you're willing to rely on style guidelines, then you can already
> get this today.
>
> If you want safety, and efficiency ... that may be harder to do as an addon.

From jimjjewett at gmail.com  Fri Mar 30 21:49:01 2007
From: jimjjewett at gmail.com (Jim Jewett)
Date: Fri, 30 Mar 2007 15:49:01 -0400
Subject: [Python-ideas] Thread IPC idea: Quipe? Sockqueue? (Re: Python and Concurrency)
In-Reply-To: <460C0E69.9060007@ronadam.com>
References: <4602354F.1010003@acm.org> <46062BC9.2020208@acm.org> <460B23FE.2010309@canterbury.ac.nz> <460C0E69.9060007@ronadam.com>
Message-ID:

On 3/29/07, Ron Adam wrote:
> Jim Jewett wrote:
> > What we really need is a Task object that treats shared memory
> > (perhaps with a small list of specified exceptions) as immutable.

> * A 'better' task object for easily creating tasks.
>     + We have a threading object now. (Needs improving.)

But the task isn't in any way restricted. Brett's security sandbox might be a useful starting point, if it is fast enough. Otherwise, we'll probably need to stick with microthreading to get things small enough to contain.

> * Shared memory -
>     + Prevent names from being rebound
>     + Prevent objects from being altered

I had thought of the names as being part of a shared dictionary. (Of course, immutable dictionaries aren't really available out-of-the-box now, and I'm not sure I would trust the supposed immutability of anything that wasn't proxied.)

> frozen: object can't be altered while frozen
> locked: name can't be rebound to another object

> 3. Pass mutable "deep" copies back and forth.
>     ? Works now. (but not for all objects?)

Well, anything that can be deep-copied -- unless you also want the mutations to be collected back into a single location.

> 4. Pass frozen mutable objects.
>     - Needs freezable/unfreezable mutable objects.
>       (Not the same as making an immutable copy.)

And there is where it starts to fall apart. Though if you look at the pypy dict and interpreter optimizations, they have started to deal with it through versioning types.

http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#multi-dicts
http://codespeak.net/pypy/dist/pypy/doc/interpreter-optimizations.html#id23

-jJ