From bmintern at gmail.com Mon Jun 2 17:05:42 2008 From: bmintern at gmail.com (Brandon Mintern) Date: Mon, 2 Jun 2008 11:05:42 -0400 Subject: [Python-ideas] Allow non-callable default_factory for defaultdict Message-ID: <4c0fccce0806020805u6c9f7e9cv5a16d4e791d94863@mail.gmail.com> I have just begun using defaultdict and have found it to be very useful for cleaning up (1) d.setdefault(key, initial_collection) and (2) repeated d.get(key, constant) calls which all use the same constant. (1) is trivial because in most cases, initial_collection is something like "set()" or "[]", and instead I can use "d = defaultdict(set)" or "d = defaultdict(list)" to accomplish the same thing. Now, all those d.setdefault(key...) calls become d[key]. This is much nicer, and I'm sure most of you are familiar with it. (2) is slightly less common, but it still comes up. This can also be handled by defaultdict, using "d = defaultdict(lambda: constant)". Now, all those d.get(key...) calls become d[key]. As with (1), this makes the code significantly nicer, and perhaps some of you have used it. I would like to propose simplifying (2). Instead of using "d = defaultdict(lambda: constant)", it would be nice to be able to use "d = defaultdict(constant)". In 2.5.2: >>> d = defaultdict("missing") Traceback (most recent call last): File "", line 1, in TypeError: first argument must be callable Obviously, defaultdict is already checking its argument (perhaps it's an assert, but still...), so it knows whether the argument is callable or not. My proposal is that if default_factory is callable or None, the behavior of __missing__ remains the same. Otherwise, the behavior of __missing__ is simply to insert default_factory in the dictionary for the key and return it. I can see one drawback to this: there is a risk of people using defaultdict([]) instead of defaultdict(list) with the idea that they will do the same thing. 
I think this problem can be easily overcome in the defaultdict
documentation by specifically mentioning such a case as a gotcha while
also using an example with a non-callable that shows how it is similar
to using dict.get(...).

Would anyone else find such a change to be helpful?

Brandon

From bmintern at gmail.com Mon Jun 2 13:48:25 2008
From: bmintern at gmail.com (Brandon Mintern)
Date: Mon, 2 Jun 2008 07:48:25 -0400
Subject: [Python-ideas] Implement __add__ for set and frozenset
Message-ID: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com>

I would like to propose that the + operator be a synonym for | (union)
on sets. IMHO, the succinct power of being able to say

sum(list_of_sets, set())

as opposed to

import operator
reduce(operator.or_, list_of_sets, set())

far outweighs any problems with having two operators for union. The
sum paradigm is certainly more readable as well. I realize that a
function named "unionall" could be defined trivially, but with a
built-in already serving the purpose in a readable and reasonable way,
it seems silly to not use it.

I anticipate that someone will ask why such a paradigm should be
available for union as opposed to the other set operations. My answer
is that several standard algorithms rely on a union over a sequence of
sets; one example is the construction of the "follow" set in
constructing an LL or SLR parser, and I know I've seen others that I
cannot remember off the top of my head. There is even a mathematical
symbol (big-U) for it akin to the summation sign (uppercase sigma).

Any feedback?

Brandon

From bruce at leapyear.org Tue Jun 3 02:27:04 2008
From: bruce at leapyear.org (Bruce Leban)
Date: Mon, 2 Jun 2008 17:27:04 -0700
Subject: [Python-ideas] Allow non-callable default_factory for defaultdict
In-Reply-To: <4c0fccce0806020805u6c9f7e9cv5a16d4e791d94863@mail.gmail.com>
References: <4c0fccce0806020805u6c9f7e9cv5a16d4e791d94863@mail.gmail.com>
Message-ID:

I don't see an extra lambda as that complicated.
What is more complicated is that when I see the code:

d = defaultdict(foo)

I (the reader of the code) can't tell whether foo is the default value or
the default function. So I had better always write (lambda: foo) unless foo
is a constant. Doesn't seem to have enough benefit to change.

--- Bruce

On Mon, Jun 2, 2008 at 8:05 AM, Brandon Mintern wrote:

> I have just begun using defaultdict and have found it to be very
> useful for cleaning up (1) d.setdefault(key, initial_collection) and
> (2) repeated d.get(key, constant) calls which all use the same
> constant.
>
> (1) is trivial because in most cases, initial_collection is something
> like "set()" or "[]", and instead I can use "d = defaultdict(set)" or
> "d = defaultdict(list)" to accomplish the same thing. Now, all those
> d.setdefault(key...) calls become d[key]. This is much nicer, and I'm
> sure most of you are familiar with it.
>
> (2) is slightly less common, but it still comes up. This can also be
> handled by defaultdict, using "d = defaultdict(lambda: constant)".
> Now, all those d.get(key...) calls become d[key]. As with (1), this
> makes the code significantly nicer, and perhaps some of you have used
> it.
>
> I would like to propose simplifying (2). Instead of using "d =
> defaultdict(lambda: constant)", it would be nice to be able to use "d
> = defaultdict(constant)". In 2.5.2:
>
> >>> d = defaultdict("missing")
> Traceback (most recent call last):
> File "", line 1, in
> TypeError: first argument must be callable
>
> Obviously, defaultdict is already checking its argument (perhaps it's
> an assert, but still...), so it knows whether the argument is callable
> or not. My proposal is that if default_factory is callable or None,
> the behavior of __missing__ remains the same. Otherwise, the behavior
> of __missing__ is simply to insert default_factory in the dictionary
> for the key and return it.
> > I can see one drawback to this: there is a risk of people using > defaultdict([]) instead of defaultdict(list) with the idea that they > will do the same thing. I think this problem can be easily overcome in > the defaultdict documentation by specifically mentioning such a case > as a gotcha while also using an example with a non-callable that shows > how it is similar to using dict.get(...). > > Would anyone else find such a change to be helpful? > > Brandon > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From george.sakkis at gmail.com Tue Jun 3 03:21:32 2008 From: george.sakkis at gmail.com (George Sakkis) Date: Mon, 2 Jun 2008 21:21:32 -0400 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com> References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com> Message-ID: <91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> Regardless of the operator, that's a pretty inefficient way of doing "unionall"; it creates N-1 intermediate result sets that discards them right after they are added. It should be written as: big_u = set() for s in all_sets: big_u.update(s) I wouldn't mind having a standard unionall, but not every 3-line function has to be in the stdlib. George On Mon, Jun 2, 2008 at 7:48 AM, Brandon Mintern wrote: > I would like to propose that the + operator be a synonym for | (union) > on sets. IMHO, the succinct power of being able to say > > sum(list_of_sets, set()) > > as opposed to > > import operator > reduce(operator.and_, list_of_sets, set()) > > far outweighs any problems with having two operators for union. The > sum paradigm is certainly more readable as well. 
I realize that a > function named "unionall" could be defined trivially, but with a > built-in already serving the purpose in a readable and reasonable way, > it seems silly to not use it. > > I anticipate that someone will ask why such a paradigm should be > available for union as opposed to the other set operations. My answer > is that several standard algorithms rely on a union over a sequence of > sets; one example is the construction of the "follow" set in > constructing an LL or SLR parser, and I know I've seen others that I > cannot remember off the top of my head. There is even a mathematical > symbol (big-U) for it akin to the summation sign (uppercase sigma). > > Any feedback? > > Brandon > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bmintern at gmail.com Tue Jun 3 03:21:54 2008 From: bmintern at gmail.com (Brandon Mintern) Date: Mon, 2 Jun 2008 21:21:54 -0400 Subject: [Python-ideas] Allow non-callable default_factory for defaultdict In-Reply-To: References: <4c0fccce0806020805u6c9f7e9cv5a16d4e791d94863@mail.gmail.com> Message-ID: <4c0fccce0806021821n57ee3044icf57fbe8942ab0eb@mail.gmail.com> On Mon, Jun 2, 2008 at 8:27 PM, Bruce Leban wrote: > I don't see an extra lambda as that complicated. What is more complicated is > that when I see the code: > > d = defaultdict(foo) > > I (the reader of the code) can't tell whether foo is the default value or > the default function. So I better always write (lambda: foo) unless foo is a > constant. Doesn't to have enough benefit to change. > > --- Bruce Ahh.... that makes a lot of sense. I hadn't considered it from that angle. 
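
A minimal sketch of the two spellings discussed above, assuming Python 3 and using only collections.defaultdict. The names `constant_default`, `shared`, `aliased` and `fresh` are illustrative and do not come from the thread. It shows the existing lambda idiom for a constant default, and the shared-object gotcha that a non-callable default such as defaultdict([]) would invite.

```python
from collections import defaultdict

# Constant default through a zero-argument lambda (the existing idiom).
constant_default = defaultdict(lambda: "missing")
looked_up = constant_default["absent"]   # __missing__ calls the factory and stores the result

# The gotcha: a default that is one pre-built list gets shared by every key.
shared = []
aliased = defaultdict(lambda: shared)
aliased["a"].append(1)                   # also visible through aliased["b"]

# defaultdict(list) calls list() per missing key, so each key gets its own list.
fresh = defaultdict(list)
fresh["a"].append(1)
```

With a factory the reader always knows the argument is called; the aliasing case is what a plain-value default would make easy to write by accident.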
Thanks for the good explanation, Brandon From bmintern at gmail.com Tue Jun 3 03:28:44 2008 From: bmintern at gmail.com (Brandon Mintern) Date: Mon, 2 Jun 2008 21:28:44 -0400 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: <91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com> <91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> Message-ID: <4c0fccce0806021828v38cb6c14q9b32bbda1dca5fbe@mail.gmail.com> On Mon, Jun 2, 2008 at 9:21 PM, George Sakkis wrote: > Regardless of the operator, that's a pretty inefficient way of doing > "unionall"; it creates N-1 intermediate result sets that discards them right > after they are added. It should be written as: > > big_u = set() > for s in all_sets: > big_u.update(s) > > I wouldn't mind having a standard unionall, but not every 3-line function > has to be in the stdlib. > > George I thought max was implemented using += (i.e. it usually starts at 0 and uses += on each item in the iterable). If so, implementing ** _iadd_ ** would result in exactly the code you posted. I realize that I said __add__ in the first place, but __iadd__ is really what I meant. Brandon From george.sakkis at gmail.com Tue Jun 3 04:06:51 2008 From: george.sakkis at gmail.com (George Sakkis) Date: Mon, 2 Jun 2008 22:06:51 -0400 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: <4c0fccce0806021828v38cb6c14q9b32bbda1dca5fbe@mail.gmail.com> References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com> <91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> <4c0fccce0806021828v38cb6c14q9b32bbda1dca5fbe@mail.gmail.com> Message-ID: <91ad5bf80806021906x3fcf15c0qefacd658aea0929b@mail.gmail.com> On Mon, Jun 2, 2008 at 9:28 PM, Brandon Mintern wrote: > > I thought max was implemented using += (i.e. it usually starts at 0 > and uses += on each item in the iterable). 
If so, implementing **
> _iadd_ ** would result in exactly the code you posted. I realize that
> I said __add__ in the first place, but __iadd__ is really what I
> meant.
>

No, it uses __add__:

$ python -c "
class Set(set): __iadd__=set.__ior__
sum([Set([1]), Set([2])], Set())
"
Traceback (most recent call last):
File "", line 3, in
TypeError: unsupported operand type(s) for +: 'Set' and 'Set'

You can easily see the quadratic behavior of __add__:

$ python -m timeit -s "class Set(set): __add__=set.__or__" "sum( (Set(range(i*10, i*10+10)) for i in xrange(100)), Set())"
100 loops, best of 3: 2.4 msec per loop

$ python -m timeit -s "class Set(set): __add__=set.__or__" "sum( (Set(range(i*10, i*10+10)) for i in xrange(200)), Set())"
100 loops, best of 3: 8.04 msec per loop

$ python -m timeit -s "class Set(set): __add__=set.__or__" "sum( (Set(range(i*10, i*10+10)) for i in xrange(400)), Set())"
10 loops, best of 3: 33.3 msec per loop

$ python -m timeit -s "class Set(set): __add__=set.__or__" "sum( (Set(range(i*10, i*10+10)) for i in xrange(800)), Set())"
10 loops, best of 3: 141 msec per loop

$ python -m timeit -s "class Set(set): __add__=set.__or__" "sum( (Set(range(i*10, i*10+10)) for i in xrange(1600)), Set())"
10 loops, best of 3: 684 msec per loop

George

From bmintern at gmail.com Tue Jun 3 04:28:41 2008
From: bmintern at gmail.com (Brandon Mintern)
Date: Mon, 2 Jun 2008 22:28:41 -0400
Subject: [Python-ideas] Implement __add__ for set and frozenset
In-Reply-To: <91ad5bf80806021906x3fcf15c0qefacd658aea0929b@mail.gmail.com>
References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com> <91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> <4c0fccce0806021828v38cb6c14q9b32bbda1dca5fbe@mail.gmail.com> <91ad5bf80806021906x3fcf15c0qefacd658aea0929b@mail.gmail.com>
Message-ID: <4c0fccce0806021928r77fd72d7s868b3db51efa8711@mail.gmail.com>

On Mon, Jun 2, 2008 at 10:06 PM, George Sakkis wrote:
> On Mon, Jun 2, 2008 at 9:28 PM, Brandon Mintern wrote:
>>
>> I thought max was implemented using += (i.e. it usually starts at 0
>> and uses += on each item in the iterable). If so, implementing **
>> _iadd_ ** would result in exactly the code you posted. I realize that
>> I said __add__ in the first place, but __iadd__ is really what I
>> meant.
>
> No, it uses __add__:
>
> $ python -c "
> class Set(set): __iadd__=set.__ior__
> sum([Set([1]), Set([2])], Set())
> "
> Traceback (most recent call last):
> File "", line 3, in
> TypeError: unsupported operand type(s) for +: 'Set' and 'Set'
>
> You can easily see the quadratic behavior of __add__:
[snip]

Ouch. Never mind my idea then. I do find that rather strange, though.
It seems kind of strange to be able to define an initial value but not
use it as an accumulator. This means that sum on list, mutable user
number classes, etc. is bound to be less efficient than it could be.

Perhaps a better proposal would be "change max to use __iadd__ if
available, falling back to __add__ if not", and then maybe we can
revisit this idea at that time. Honestly, what's wrong with sum being
defined as:

    def sum(iterable, start=0):
        acc = start
        for i in iterable:
            acc += i
        return acc

Even though I'm making a lot of bad proposals, I sure am learning a lot.
Brandon From g.brandl at gmx.net Tue Jun 3 01:15:53 2008 From: g.brandl at gmx.net (Georg Brandl) Date: Tue, 03 Jun 2008 01:15:53 +0200 Subject: [Python-ideas] Allow non-callable default_factory for defaultdict In-Reply-To: <4c0fccce0806020805u6c9f7e9cv5a16d4e791d94863@mail.gmail.com> References: <4c0fccce0806020805u6c9f7e9cv5a16d4e791d94863@mail.gmail.com> Message-ID: Brandon Mintern schrieb: > I can see one drawback to this: there is a risk of people using > defaultdict([]) instead of defaultdict(list) with the idea that they > will do the same thing. I think this problem can be easily overcome in > the defaultdict documentation by specifically mentioning such a case > as a gotcha while also using an example with a non-callable that shows > how it is similar to using dict.get(...). I think this was exactly one of the reasons that defaultdict takes a factory function. Using a list as the default is a very common use case, and here (as opposed to function parameter defaults) we *can* prevent endless streams of programmers falling into a "trap". Also, this is exactly the kind of situation where lambda fits perfectly. Since we have and keep lambda, I see no reason to complicate the API. This should be documented with defaultdict though. I see an example for a constant default value, but it uses itertools.repeat (!?) Georg From tjreedy at udel.edu Mon Jun 2 05:59:47 2008 From: tjreedy at udel.edu (Terry Reedy) Date: Sun, 1 Jun 2008 23:59:47 -0400 Subject: [Python-ideas] Proposal to add new built-in struct (was: Add kwargs to built-in function object) References: <4c0fccce0805221442w5a4c40a4ma0097b42f86558af@mail.gmail.com> <4835FBA4.8020206@canterbury.ac.nz> Message-ID: "Greg Ewing" wrote in message news:4835FBA4.8020206 at canterbury.ac.nz... | Brandon Mintern wrote: | > This is a proposal to add a new built-in named struct: | > | > struct(**kwargs) | > Return a struct object which has the attributes given in kwargs. 
| | I think I'd prefer 'record', to avoid any potential | confusion with the struct module, which does something | quite different. I agree, perhaps even Record .., but in any case in the collections module. Something like this has been the subject of enough c.l.p posts to make a case for something in the stdlib, but not in builtins. An implementation in Python also serves as a model for variations. From arnodel at googlemail.com Tue Jun 3 19:28:03 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Tue, 3 Jun 2008 18:28:03 +0100 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: <91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com> <91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> Message-ID: <9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com> On 3 Jun 2008, at 02:21, George Sakkis wrote: > Regardless of the operator, that's a pretty inefficient way of doing > "unionall"; it creates N-1 intermediate result sets that discards > them right after they are added. It should be written as: > > big_u = set() > for s in all_sets: > big_u.update(s) > > I wouldn't mind having a standard unionall, but not every 3-line > function has to be in the stdlib. > > George > Perhaps it would be nice to have set.union (and set.intersection) to accept more than one argument, i.e. 
have A = S.union(T, U, V) mean A = S.union(T) A.update(U) A.update(V) As a consequence of Python method implementation, one could write instead: A = set.union(S, T, U, V) B = set.intersection(S, T, U, V) which reads nicely -- Arnaud From grosser.meister.morti at gmx.net Tue Jun 3 19:38:13 2008 From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=) Date: Tue, 03 Jun 2008 19:38:13 +0200 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: <9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com> References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com> <91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> <9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com> Message-ID: <48458185.1090604@gmx.net> Arnaud Delobelle schrieb: > > As a consequence of Python method implementation, one could write instead: > > A = set.union(S, T, U, V) > B = set.intersection(S, T, U, V) > > which reads nicely > I think this, indeed, reads nicely. -panzi From lorgandon at gmail.com Tue Jun 3 19:54:05 2008 From: lorgandon at gmail.com (Imri Goldberg) Date: Tue, 03 Jun 2008 20:54:05 +0300 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: <48458185.1090604@gmx.net> References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com> <91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> <9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com> <48458185.1090604@gmx.net> Message-ID: <4845853D.9000000@gmail.com> +1. I use union and intersection as sum-like functions from time to time, enough to warrant implementation in my "standard utility import". Although I'm not sure about the interface. A sum-like interface which receives a sequence might be better, although they are pretty much equivalent. 
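
A rough sketch of the two interfaces being compared here, assuming Python 3 (the multi-argument form of set.union is available from Python 2.6 onwards). `union_all` is a hypothetical helper name, and `list_of_sets` is sample data made up for illustration.

```python
def union_all(sets):
    """Union of an iterable of sets, built in one pass (hypothetical helper)."""
    result = set()
    for s in sets:
        result.update(s)   # in-place update, so no intermediate sets are built
    return result


list_of_sets = [{1, 2}, {2, 3}, {4}]

# Sum-like interface: one argument that is a sequence of sets.
combined = union_all(list_of_sets)

# Argument-list interface: the same sets unpacked into a single call.
same = set().union(*list_of_sets)
```

The two are interchangeable through `*` unpacking, which is the sense in which the interfaces are "pretty much equivalent".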
-------------------------
Imri Goldberg
www.algorithm.co.il/blogs
www.imri.co.il
-------------------------
Insert Signature Here
-------------------------

Mathias Panzenböck wrote:
> Arnaud Delobelle wrote:
> >
> > As a consequence of Python method implementation, one could write instead:
> >
> > A = set.union(S, T, U, V)
> > B = set.intersection(S, T, U, V)
> >
> > which reads nicely
> >
>
> I think this, indeed, reads nicely.
>
> -panzi

From python at rcn.com Tue Jun 3 20:04:46 2008
From: python at rcn.com (Raymond Hettinger)
Date: Tue, 3 Jun 2008 11:04:46 -0700
Subject: [Python-ideas] Implement __add__ for set and frozenset
References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com><91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> <9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com>
Message-ID:

> Perhaps it would be nice to have set.union (and set.intersection) to
> accept more than one argument, i.e. have
>
> A = S.union(T, U, V)
>
> mean
>
> A = S.union(T)
> A.update(U)
> A.update(V)

Something like this has been on my todo list for a while.

Patches are welcome. It should be done for union,
intersection, difference, and symmetric difference.

Some attempt should be made to optimize the ordering
so that a&b&c&d will run from the smallest set to the largest
(to minimize the total loop count).
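
One way the smallest-first ordering could look in pure Python, as a sketch only and assuming Python 3. `intersect_all` is a hypothetical helper written for illustration; it is not the C implementation and makes no claim about how CPython orders the work.

```python
def intersect_all(first, *others):
    """Intersect sets starting from the smallest (hypothetical helper)."""
    ordered = sorted((first,) + others, key=len)
    result = set(ordered[0])   # copy, so none of the inputs is mutated
    for s in ordered[1:]:
        result &= s            # the running result can only shrink
        if not result:
            break              # already empty, nothing more to check
    return result


big = set(range(100))
small = {1, 2, 3}
other = {2, 3, 4}
common = intersect_all(big, small, other)
```

Starting from the smallest set bounds every later step by that set's size, which is the "total loop count" argument above.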
Raymond From arnodel at googlemail.com Tue Jun 3 20:32:53 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Tue, 3 Jun 2008 19:32:53 +0100 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com><91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> <9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com> Message-ID: On 3 Jun 2008, at 19:04, Raymond Hettinger wrote: >> Perhaps it would be nice to have set.union (and set.intersection) >> to accept more than one argument, i.e. have >> A = S.union(T, U, V) >> mean >> A = S.union(T) >> A.update(U) >> A.update(V) > > Something like this has been on my todo list for a while. > > Patches are welcome. It should be done for union, > intersection, difference, and symmetric difference. Difference is not an associative operation though. E.g. A - (B - B) = A but (A - B) - B = A - B Same for symmetric difference (here "^" stands for symmetric difference). E.g. A ^ (B ^ B) = A but (A ^ B) ^ B = A | B -- Arnaud From python at rcn.com Tue Jun 3 20:44:33 2008 From: python at rcn.com (Raymond Hettinger) Date: Tue, 3 Jun 2008 11:44:33 -0700 Subject: [Python-ideas] Implement __add__ for set and frozenset References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com><91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com><9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com>

Message-ID: <8B893CF4FD6B4F1E93F59D5DE4254A58@RaymondLaptop1> > Difference is not an associative operation though. E.g. A.difference(B, C, D) means A - B - C - D which can be (((A - B) - C) - D) or (((A - D) - C) - B) or (((A - C) - B) - D) or (((A - C) - D) - B) You can do the subtractions from A in any order. Raymond From arnodel at googlemail.com Tue Jun 3 21:02:24 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Tue, 3 Jun 2008 20:02:24 +0100 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: <8B893CF4FD6B4F1E93F59D5DE4254A58@RaymondLaptop1> References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com><91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com><9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com>

<8B893CF4FD6B4F1E93F59D5DE4254A58@RaymondLaptop1> Message-ID: On 3 Jun 2008, at 19:44, Raymond Hettinger wrote: > > Difference is not an associative operation though. E.g. > > A.difference(B, C, D) means A - B - C - D > which can be (((A - B) - C) - D) > or (((A - D) - C) - B) > or (((A - C) - B) - D) > or (((A - C) - D) - B) > > You can do the subtractions from A in any order. That's true. However there no way to predict the output of this: >>> set_of_sets = { {1, 2}, {2, 3} } >>> set.difference(*set_of_sets) And A.difference(B, C, D) could be rewritten as A - set.union(B, C, D) which may be just as clear. -- Arnaud From mattias at virtutech.se Tue Jun 3 21:12:15 2008 From: mattias at virtutech.se (=?iso-8859-1?q?Mattias_Engdeg=E5rd?=) Date: Tue, 3 Jun 2008 19:12:15 +0000 (UTC) Subject: [Python-ideas] Implement __add__ for set and frozenset References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com><91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> <9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com>

Message-ID: Arnaud Delobelle writes: > A ^ (B ^ B) = A but (A ^ B) ^ B = A | B No, symmetric set difference is associative. From python at rcn.com Tue Jun 3 21:32:18 2008 From: python at rcn.com (Raymond Hettinger) Date: Tue, 3 Jun 2008 12:32:18 -0700 Subject: [Python-ideas] Implement __add__ for set and frozenset References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com><91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com><9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com>

<8B893CF4FD6B4F1E93F59D5DE4254A58@RaymondLaptop1> Message-ID: <8A789D3D43D1473BA4D2AFB1ABEADBD8@RaymondLaptop1> > A.difference(B, C, D) > > could be rewritten as > > A - set.union(B, C, D) > > which may be just as clear. And grotesquely inefficient. From python at rcn.com Tue Jun 3 21:48:27 2008 From: python at rcn.com (Raymond Hettinger) Date: Tue, 3 Jun 2008 12:48:27 -0700 Subject: [Python-ideas] Implement __add__ for set and frozenset References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com><91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com><9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com>

<8B893CF4FD6B4F1E93F59D5DE4254A58@RaymondLaptop1>
Message-ID: <345C637AD4CD460B8AF1BEB8BB2C6943@RaymondLaptop1>

From: "Arnaud Delobelle"
> there no way to predict the output of this:
>
> >>> set_of_sets = { {1, 2}, {2, 3} }
> >>> set.difference(*set_of_sets)

That's silly. Lots of functions do odd things with random argument
ordering:

>>> s = set([9, 3])
>>> int.__sub__(*s)
6

Besides, you can already run the sample fragment in Py2.5:

>>> sos = set( [frozenset([1, 2]), frozenset([2, 3])])
>>> frozenset.difference(*sos)
frozenset([1])

Raymond

From arnodel at googlemail.com Tue Jun 3 21:55:44 2008
From: arnodel at googlemail.com (Arnaud Delobelle)
Date: Tue, 3 Jun 2008 20:55:44 +0100
Subject: [Python-ideas] Implement __add__ for set and frozenset
In-Reply-To:
References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com><91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> <9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com>

Message-ID: <3D044D23-661F-44F8-A94E-0FC0F9B2394C@gmail.com>

On 3 Jun 2008, at 20:12, Mattias Engdegård wrote:

> Arnaud Delobelle writes:
>
>> A ^ (B ^ B) = A but (A ^ B) ^ B = A | B
>
> No, symmetric set difference is associative.
>

Sorry, I don't know what came over me. It's obviously associative
because an element is in the symmetric difference if it is in an odd
number of sets.

--
Arnaud

From arnodel at googlemail.com Tue Jun 3 22:08:05 2008
From: arnodel at googlemail.com (Arnaud Delobelle)
Date: Tue, 3 Jun 2008 21:08:05 +0100
Subject: [Python-ideas] Implement __add__ for set and frozenset
In-Reply-To: <8A789D3D43D1473BA4D2AFB1ABEADBD8@RaymondLaptop1>
References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com><91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com><9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com>

<8B893CF4FD6B4F1E93F59D5DE4254A58@RaymondLaptop1> <8A789D3D43D1473BA4D2AFB1ABEADBD8@RaymondLaptop1> Message-ID: <9AD272AF-80E2-4124-BBD2-8D6B5489097B@gmail.com> On 3 Jun 2008, at 20:32, Raymond Hettinger wrote: > > >> A.difference(B, C, D) >> could be rewritten as >> A - set.union(B, C, D) >> which may be just as clear. > > > And grotesquely inefficient. Ah yes. Given the other grossly inaccurate statement in that post, I conclude that drinking and posting on python-ideas are incompatible occupations. So I will refrain from any further claim on that subject till tomorrow morning. -- Arnaud From jh at improva.dk Tue Jun 3 21:43:09 2008 From: jh at improva.dk (Jacob Holm) Date: Tue, 03 Jun 2008 21:43:09 +0200 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com><91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> <9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com>

Message-ID: <48459ECD.9000408@improva.dk> Arnaud Delobelle wrote: > > Difference is not an associative operation though. E.g. > > A - (B - B) = A but (A - B) - B = A - B > > Same for symmetric difference (here "^" stands for symmetric > difference). E.g. > > A ^ (B ^ B) = A but (A ^ B) ^ B = A | B > Actually, symmetric difference *is* associative. The second '=' above is wrong unless B is a subset of A. Regards Jacob From python at rcn.com Tue Jun 3 22:20:36 2008 From: python at rcn.com (Raymond Hettinger) Date: Tue, 3 Jun 2008 13:20:36 -0700 Subject: [Python-ideas] Implement __add__ for set and frozenset References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com><91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com><9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com>

<8B893CF4FD6B4F1E93F59D5DE4254A58@RaymondLaptop1><8A789D3D43D1473BA4D2AFB1ABEADBD8@RaymondLaptop1> <9AD272AF-80E2-4124-BBD2-8D6B5489097B@gmail.com> Message-ID: <1AE4392680834439A81C0CA6B56292C5@RaymondLaptop1> > Given the other grossly inaccurate statement in that post, I > conclude that drinking and posting on python-ideas are incompatible > occupations. So I will refrain from any further claim on that subject > till tomorrow morning. No worries, except that mailman saves a copy, google indexes it, and all of your posts will be visible to your great-great grandchildren for all eternity. Other than that, it's just a casual idea session between friends ;-) Raymond From greg.ewing at canterbury.ac.nz Wed Jun 4 03:57:35 2008 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Wed, 04 Jun 2008 13:57:35 +1200 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: <4c0fccce0806021928r77fd72d7s868b3db51efa8711@mail.gmail.com> References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com> <91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> <4c0fccce0806021828v38cb6c14q9b32bbda1dca5fbe@mail.gmail.com> <91ad5bf80806021906x3fcf15c0qefacd658aea0929b@mail.gmail.com> <4c0fccce0806021928r77fd72d7s868b3db51efa8711@mail.gmail.com> Message-ID: <4845F68F.9000002@canterbury.ac.nz> Brandon Mintern wrote: > Perhaps a better proposal would be "change max to use __iadd__ if > available, falling back to __add__ if not" That could change the behaviour of existing code that passes a mutable initial value. 
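
A small sketch of the behaviour change in question, assuming Python 3. `inplace_sum` is a hypothetical stand-in for the `+=` based sum proposed earlier in the thread; `base`, `base2`, `total` and `total2` are illustrative names.

```python
def inplace_sum(iterable, start=0):
    """Sum using +=, as proposed in the thread (hypothetical name)."""
    acc = start
    for item in iterable:
        acc += item          # for a list this extends the caller's object
    return acc


base = [0]
total = sum([[1], [2]], base)            # built-in sum uses +, so base is untouched

base2 = [0]
total2 = inplace_sum([[1], [2]], base2)  # += mutates the list passed as start
```

After this runs, `base` still holds only its original element while `base2` has grown and is the same object as `total2`. Code that reuses its start value after the call would see different results under the two definitions.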
-- Greg From bmintern at gmail.com Wed Jun 4 05:36:15 2008 From: bmintern at gmail.com (Brandon Mintern) Date: Tue, 3 Jun 2008 23:36:15 -0400 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: <4845F68F.9000002@canterbury.ac.nz> References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com> <91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> <4c0fccce0806021828v38cb6c14q9b32bbda1dca5fbe@mail.gmail.com> <91ad5bf80806021906x3fcf15c0qefacd658aea0929b@mail.gmail.com> <4c0fccce0806021928r77fd72d7s868b3db51efa8711@mail.gmail.com> <4845F68F.9000002@canterbury.ac.nz> Message-ID: <4c0fccce0806032036hcaa334lb9028d729255a84a@mail.gmail.com> On Tue, Jun 3, 2008 at 9:57 PM, Greg Ewing wrote: > Brandon Mintern wrote: >> >> Perhaps a better proposal would be "change max to use __iadd__ if >> available, falling back to __add__ if not" Obviously, I meant to say "sum" there instead of "max" (which I'm pretty sure you realized as well) -- I had been using max at the time that I wrote that e-mail. > That could change the behaviour of existing code that > passes a mutable initial value. > > -- > Greg That was my intention, to take advantage of increased efficiency provided by mutable initial values. Unfortunately, I didn't consider the "existing code" problem, but that's not really a problem in Python 3K, is it? For example, sum(lists, []) currently runs in quadratic time (as pointed out by George Sakkis earlier in this thread using an example of sets that implement __add__). If instead, sum was implemented as: def sum (iterable, init=0): for i in iterable: init += i return init Then its behavior would mimic its current behavior for immutable types and other types which do not implement iadd, but for types that allow more efficient value modification, it would be a big win. Is there a good use case for a time when you wouldn't want an initial value to be mutated? 
In my experience, I've always passed in throwaway initial values anyways, like []. Brandon p.s. Sorry, Greg, for the dupe. Why don't the Python mailing lists generate Reply-To headers? It's pretty annoying to always have to remember to say "Reply to all" instead of simply "Reply". From bmintern at gmail.com Wed Jun 4 05:37:28 2008 From: bmintern at gmail.com (Brandon Mintern) Date: Tue, 3 Jun 2008 23:37:28 -0400 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: <345C637AD4CD460B8AF1BEB8BB2C6943@RaymondLaptop1> References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com> <91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> <9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com>

<8B893CF4FD6B4F1E93F59D5DE4254A58@RaymondLaptop1> <345C637AD4CD460B8AF1BEB8BB2C6943@RaymondLaptop1> Message-ID: <4c0fccce0806032037p340faa8co6443c2a307f5e767@mail.gmail.com> Just realized that I failed to send this to the list as well: On Tue, Jun 3, 2008 at 3:48 PM, Raymond Hettinger wrote: > That's silly. Lot's of functions do odd things with random argument > ordering: > >>>> s = set([9, 3]) >>>> int.__sub__(*s) > > 6 > > Besides, you can already run the sample fragment in Py2.5: > >>>> sos = set( [frozenset([1, 2]), frozenset([2, 3])]) >>>> frozenset.difference(*sos) > > frozenset([1]) > > > Raymond Right, but that's why these functions do not accept more than two arguments. They are intended to be used as instance methods only. If we're promoting the idea of a set.method(*args) usage, however, the usage should probably be intuitive. Because they are associative, union, intersection, and symmetric_difference are all intuitive and do what is expected no matter what. That is not true of set-difference. In other words, it looks to me like set.method(*args) is trying to say "Take all of elements in these iterables and make one set out of them," or more simply, "Throw all this crap together." Intuitively, it _shouldn't_ matter what order the arguments are in. What is the meaning of taking the set-difference of a bunch of sets? Should we promote an operation that doesn't make any sense? In mathematics, there are symbols for set.union(*args) (big-U) and set.intersection(*args) (big-upside-down-U), because they actually come up in common usage. I'm not aware of any such symbols for other set operations. Now that doesn't necessarily mean we shouldn't support them, but it is certainly something to think about. 
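The order-dependence described above can be checked directly. The following is a minimal sketch in present-day Python 3, not something taken from the thread; the helper name `results` and the sample sets are assumptions chosen for illustration. It folds each binary set operation over every ordering of three sets and collects the distinct outcomes.

```python
from functools import reduce
from itertools import permutations

# Sample data, made up for illustration.
sets = [frozenset([1, 2, 3]), frozenset([2, 3, 4]), frozenset([3, 4, 5])]


def results(op):
    """Fold `op` over every ordering of `sets`; return the distinct outcomes."""
    return {reduce(op, order) for order in permutations(sets)}


union_results = results(frozenset.union)
intersection_results = results(frozenset.intersection)
symdiff_results = results(frozenset.symmetric_difference)
difference_results = results(frozenset.difference)
```

With these inputs, union, intersection and symmetric difference each give a single outcome regardless of ordering, while difference gives several, depending on which set comes first.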
To take it from another angle, it is easy to define: set.union(*args) - the set of all the elements appearing in at least one of the args set.intersection(*args) - the set of all the elements appearing in every arg set.symmetric_difference(*args) - the set of all the elements appearing in an odd number of arguments but: set.difference(*args) - the set of all elements appearing in the first arg but not any of the rest is fundamentally different. When using set operations, ordering shouldn't even be a consideration. However, A.difference(*args) - the set of all elements in A that do not appear in any of the args is well-defined. For that reason, I say that we should support *args for all set operations, but we should only promote the use of set.method syntax for intersection and union. set.difference doesn't seem well-defined, and set.symmetric_difference doesn't seem very useful (and could lead to usage of set.difference). So... +1 supporting *args for all set operations +1 documenting the usage of set.union(*args) and set.intersection(*args) as unioning/intersecting all of the arguments -1 even mentioning set.difference or set.symmetric_difference in static usage That's my 2c, Brandon From arnodel at googlemail.com Wed Jun 4 10:25:38 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Wed, 4 Jun 2008 09:25:38 +0100 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: <1AE4392680834439A81C0CA6B56292C5@RaymondLaptop1> References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com> <91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> <9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com>

<8B893CF4FD6B4F1E93F59D5DE4254A58@RaymondLaptop1> <345C637AD4CD460B8AF1BEB8BB2C6943@RaymondLaptop1> <4c0fccce0806032037p340faa8co6443c2a307f5e767@mail.gmail.com> Message-ID: <484736D7.7030206@canterbury.ac.nz> Brandon Mintern wrote: > Intuitively, it > _shouldn't_ matter what order the arguments are in. What is the > meaning of taking the set-difference of a bunch of sets? Should we > promote an operation that doesn't make any sense? Taking a lead from mathematics here, there are symbols for the sum and product of a sequence (big-sigma and big-pi) but not difference or quotient, for similar reasons. -- Greg From arnodel at googlemail.com Thu Jun 5 19:16:47 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Thu, 5 Jun 2008 18:16:47 +0100 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: <419D785A36CA4D87B6EB808C7D2991C4@RaymondLaptop1> References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com><91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com><9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com>

<8B893CF4FD6B4F1E93F59D5DE4254A58@RaymondLaptop1><8A789D3D43D1473BA4D2AFB1ABEADBD8@RaymondLaptop1><9AD272AF-80E2-4124-BBD2-8D6B5489097B@gmail.com><1AE4392680834439A81C0CA6B56292C5@RaymondLaptop1> <9bfc700a0806040125r6d91271eg944d7dcf6a345990@mail.gmail.com> <419D785A36CA4D87B6EB808C7D2991C4@RaymondLaptop1> Message-ID: On 4 Jun 2008, at 09:45, Raymond Hettinger wrote: > In the case of intersection and intersection_update, if the inputs > are sets or dicts, then they should be processed smallest to > largest. If the inputs are not sets or dicts, then process them in > input order. > > The other six cases should also be processed in-order (left-to-right). > Given that A - X - Y - Z is the same as (A-X) & (A-Y) & (A-Z) If it is a significant optimization to intersect sets from smallest to largest (as opposed to just starting with the smallest one and then intersecting from left to right), then should the same idea be applied to difference, except that obviously you start with the leftmost one and sort the others from largest to smallest? (I am *not* proposing to compute A-X, A-Y, ... and then intersect them!) -- Arnaud From ziade.tarek at gmail.com Sat Jun 7 11:21:16 2008 From: ziade.tarek at gmail.com (=?ISO-8859-1?Q?Tarek_Ziad=E9?=) Date: Sat, 7 Jun 2008 11:21:16 +0200 Subject: [Python-ideas] profiling: manual instrumentation with pystones Message-ID: <94bdd2610806070221x313dd18ehab7e57c55afb1d38@mail.gmail.com> Hello, To remove bottlenecks I usually instrument some functions in my application inside a dedicated test and set speed goals there until they are met. Then I leave the test to avoid speed regressions, when doable, by translating times in pystones. 
Unless I missed something in the standard library, I feel like there's a missing tool to do it simply:

- the timeit module is nice to try out small code snippets but is not really adapted to manually profile the code of an existing application
- the profile module is nice to profile an application as a whole but is not very handy to gather statistics on specific functions in their real execution context

What about adding a decorator that fills a statistics mapping in memory (time+stones), like this:

>===========
import time
import sys
import logging
from test import pystone

benchtime, stones = pystone.pystones()

def secs_to_kstones(seconds):
    return (stones*seconds) / 1000

stats = {}

def reset_stats():
    # clear in place, so that decorators which captured this dict keep using it
    stats.clear()

def log_stats():
    template = '%s : %.2f kstones, %.3f seconds'
    for key, v in stats.items():
        logging.debug(template % (key, v['stones'], v['time']))

if sys.platform == 'win32':
    timer = time.clock
else:
    timer = time.time

def profile(name='stats', stats=stats):
    def _profile(function):
        def __profile(*args, **kw):
            start_time = timer()
            try:
                return function(*args, **kw)
            finally:
                total = timer() - start_time
                kstones = secs_to_kstones(total)
                stats[name] = {'time': total, 'stones': kstones}
        return __profile
    return _profile
>===========

This allows instrumenting the application by decorating some functions, either inside the application or in a dedicated test:

>======
def my_test():
    my.app.slow_stuff = profile('seem slow')(my.app.slow_stuff)
    my.app.other_slow_stuff = profile('seem slow too')(my.app.other_slow_stuff)
    # should not take more than 40k pystones !
    assert stats['seem slow too']['stones'] < 40
    # let's log them
    log_stats()
>======

Regards,
Tarek

--
Tarek Ziadé | Association AfPy | www.afpy.org
Blog FR | http://programmation-python.org
Blog EN | http://tarekziade.wordpress.com/

From arnodel at googlemail.com Tue Jun 10 01:09:13 2008
From: arnodel at googlemail.com (Arnaud Delobelle)
Date: Tue, 10 Jun 2008 00:09:13 +0100
Subject: [Python-ideas] Implement __add__ for set and frozenset
In-Reply-To: <9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com>
References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com> <91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> <9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com>
Message-ID: <9bfc700a0806091609h1cd9a604qdb64784b121ec6b3@mail.gmail.com>

2008/6/3 Arnaud Delobelle :
>
> On 3 Jun 2008, at 02:21, George Sakkis wrote:
>
>> Regardless of the operator, that's a pretty inefficient way of doing
>> "unionall"; it creates N-1 intermediate result sets and discards them right
>> after they are added. It should be written as:
>>
>> big_u = set()
>> for s in all_sets:
>>     big_u.update(s)
>>
>> I wouldn't mind having a standard unionall, but not every 3-line function
>> has to be in the stdlib.
>>
>> George
>>
>
>
> Perhaps it would be nice to have set.union (and set.intersection) accept
> more than one argument, i.e. have
>
> A = S.union(T, U, V)
>
> mean
>
> A = S.union(T)
> A.update(U)
> A.update(V)
>
> As a consequence of Python method implementation, one could write instead:
>
> A = set.union(S, T, U, V)
> B = set.intersection(S, T, U, V)
>
> which reads nicely

I've written a patch [1] that does that. Following the suggestion of Raymond Hettinger, I've implemented set.intersection by sorting all its sets/frozensets/dicts in increasing order of size first, then iterating over the smallest. It's the first time I try my hand at this so it might not be up to much, but I've made it so I might as well send it :). It's against py3k svn.
[1] http://bugs.python.org/issue3069

--
Arnaud

From python at rcn.com Tue Jun 10 01:33:48 2008
From: python at rcn.com (Raymond Hettinger)
Date: Mon, 9 Jun 2008 16:33:48 -0700
Subject: [Python-ideas] Implement __add__ for set and frozenset
References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com><91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com><9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com> <9bfc700a0806091609h1cd9a604qdb64784b121ec6b3@mail.gmail.com>
Message-ID: <74E7417E02154082BB0191AC30D8968B@RaymondLaptop1>

From: "Arnaud Delobelle"
> As a consequence of Python method implementation, one could write instead:
>>
>> A = set.union(S, T, U, V)
>> B = set.intersection(S, T, U, V)
>>
>> which reads nicely
>
> I've written a patch [1] that does that. Following the suggestion of
> Raymond Hettinger, I've implemented set.intersection by sorting all
> its sets/frozensets/dicts in increasing order of size first, then
> iterating over the smallest. It's the first time I try my hand at
> this so it might not be up to much, but I've made it so I might as
> well send it :). It's against py3k svn.
>
> [1] http://bugs.python.org/issue3069

Thanks. It looks like I beat you to it. But I will go over your code and incorporate some version of the sorting for intersections and harvest the tests. Also, I'll go ahead and add you to Misc/ACKS.
Raymond From terry at jon.es Tue Jun 10 01:33:07 2008 From: terry at jon.es (Terry Jones) Date: Tue, 10 Jun 2008 01:33:07 +0200 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: Your message at 00:09:13 on Tuesday, 10 June 2008 References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com> <91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> <9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com> <9bfc700a0806091609h1cd9a604qdb64784b121ec6b3@mail.gmail.com> Message-ID: <18509.48563.472066.451052@jon.es> >>>>> "Arnaud" == Arnaud Delobelle writes: Arnaud> I've written a patch [1] that does that. Following the suggestion Arnaud> of Raymond Hettinger, I've implemented set.intersection by sorting Arnaud> all its sets/frozensets/dicts in increasing order of size first, Arnaud> then iterating over the smallest. Hi Arnaud I don't know if you'll do any benchmarking on this, but I'd suggest: - First find the set with the smallest size (this is O(n) work). - If that size is sufficiently small and the number of sets is sufficiently large (numbers to be determined by testing), don't sort the sets by size - just go for it. - Else do the sorting. The point being that if the smallest set is already quite small, the size of the intersection is already tightly bounded and you're possibly going to do an expensive sort that's really not needed. The O(n) work to find the smallest is tiny compared to just blindly doing O(n lg n) immediately. Most of the juice you get from moving from small to big sets comes from starting with the smallest. A few benchmarks should give an idea of when to sort. BTW, having a quick look at your diff (not the patched source) it looks like you're testing each of the elements of the smallest set against all other hashtables. I haven't thought about it much, but that seems to partly defeat the purpose of sorting. 
Speed will depend on the inputs, but I'd have guessed that in general you should be testing each member of the smallest for presence in the next set, short-circuiting if empty, then testing each of the survivors against the next set, etc. That's more of a "vertical" approach than the horizontal one you take across all the hashtables (possibly also with a speed benefit due to locality of reference). Also, why not test first against the iterables that are not hashtables? Wouldn't that be faster in the (common?) case of many sets being passed for intersection? Sorry if this is all clueless - it's just my thinking as I looked at your diff. Regards, Terry From arnodel at googlemail.com Tue Jun 10 02:31:06 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Tue, 10 Jun 2008 01:31:06 +0100 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: <18509.48563.472066.451052@jon.es> References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com> <91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com> <9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com> <9bfc700a0806091609h1cd9a604qdb64784b121ec6b3@mail.gmail.com> <18509.48563.472066.451052@jon.es> Message-ID: <7FCCBE0E-0B00-4A60-AFDB-8D7E7768A0D4@googlemail.com> On 10 Jun 2008, at 00:33, Terry Jones wrote: >>>>>> "Arnaud" == Arnaud Delobelle writes: > Arnaud> I've written a patch [1] that does that. Following the > suggestion > Arnaud> of Raymond Hettinger, I've implemented set.intersection by > sorting > Arnaud> all its sets/frozensets/dicts in increasing order of size > first, > Arnaud> then iterating over the smallest. > > Hi Arnaud > > I don't know if you'll do any benchmarking on this, but I'd suggest: > > - First find the set with the smallest size (this is O(n) work). > > - If that size is sufficiently small and the number of sets is > sufficiently large (numbers to be determined by testing), don't > sort the > sets by size - just go for it. > > - Else do the sorting. 
> > The point being that if the smallest set is already quite small, the > size > of the intersection is already tightly bounded and you're possibly > going to > do an expensive sort that's really not needed. The O(n) work to find > the > smallest is tiny compared to just blindly doing O(n lg n) > immediately. Most > of the juice you get from moving from small to big sets comes from > starting > with the smallest. > My first reaction is to agree with this. Just finding the smallest hashtable might be enough, and I first set out to do just that. Raymond Hettinger suggested going from smallest to largest and I decided against having too many code paths, without any real rationale (or rather, I probably had in my mind that it would be used for a small number of big sets, rather than a big number of small sets). > A few benchmarks should give an idea of when to sort. > One problem is that it is difficult to know what is a typical use of this. I imagined that the number of sets would be small compared with their sizes. It would be completely different if one had many small sets. > BTW, having a quick look at your diff (not the patched source) it > looks > like you're testing each of the elements of the smallest set against > all > other hashtables. I haven't thought about it much, but that seems to > partly > defeat the purpose of sorting. Speed will depend on the inputs, but > I'd > have guessed that in general you should be testing each member of the > smallest for presence in the next set, short-circuiting if empty, then > testing each of the survivors against the next set, etc. That's more > of a > "vertical" approach than the horizontal one you take across all the > hashtables (possibly also with a speed benefit due to locality of > reference). > You're right about short-circuiting (with iterables), it was planned but I forgot to put it in. 
I was working on the patch last week but my son was taken to hospital in an emergency and everything has been a bit of a blur since then. Tonight for the first time I had a bit of time so I decided it was the time to wrap it up and send it, but I think that it was a bit rushed. I still believe short-circuiting has to be done "horizontally" when possible, to use your terminology. If x belongs to the first two sets, then you know that their intersection is not empty, so you might as well test for membership of the third set straight away. Although locality of reference may well be an important factor here, I don't know (last time I looked into these things, memory was fast and processors were slow - a long time ago!). > Also, why not test first against the iterables that are not > hashtables? > Wouldn't that be faster in the (common?) case of many sets being > passed for > intersection? > I first wrote the code for hashtables, then added support for any iterable. What you say is probably correct for many small iterables, but not for few big ones where you would have to go through the tedium of iterating through every element of each iterable (which is done in my implementation anyway because I forgot the short-circuiting!). > Sorry if this is all clueless - it's just my thinking as I looked at > your > diff. No, these are considerations that I should have given more thought to. Because it is the first time I modified a bit of Python code, I think I got bogged down in my problems with technicalities and forgot the bigger picture. 
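The two strategies discussed above can be sketched in pure Python. This is only an illustration under stated assumptions: the function names `intersect_vertical` and `intersect_horizontal` are hypothetical, the inputs are assumed to be sets, and it says nothing about how the C patch in issue3069 is actually written.

```python
def intersect_vertical(*sets):
    """Filter the survivors against one set at a time, smallest set first."""
    if not sets:
        raise TypeError("at least one set is required")
    ordered = sorted(sets, key=len)
    survivors = set(ordered[0])
    for other in ordered[1:]:
        if not survivors:
            break  # short-circuit: the intersection is already empty
        survivors = {x for x in survivors if x in other}
    return survivors


def intersect_horizontal(*sets):
    """Test each element of the smallest set against all the other sets."""
    if not sets:
        raise TypeError("at least one set is required")
    ordered = sorted(sets, key=len)
    smallest, rest = ordered[0], ordered[1:]
    # all() stops at the first set that does not contain x.
    return {x for x in smallest if all(x in other for other in rest)}
```

Both return the same result; which one is faster depends on the number and sizes of the inputs, which is the benchmarking question raised in the thread.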
-- Arnaud From arnodel at googlemail.com Tue Jun 10 02:45:52 2008 From: arnodel at googlemail.com (Arnaud Delobelle) Date: Tue, 10 Jun 2008 01:45:52 +0100 Subject: [Python-ideas] Implement __add__ for set and frozenset In-Reply-To: <74E7417E02154082BB0191AC30D8968B@RaymondLaptop1> References: <4c0fccce0806020448k21cbe291jcf3870fafceec77@mail.gmail.com><91ad5bf80806021821n31b10ea6iaf89ded52bbb1603@mail.gmail.com><9C9BE917-E8DA-4B96-BCB1-CEFDAC7B5FFD@gmail.com> <9bfc700a0806091609h1cd9a604qdb64784b121ec6b3@mail.gmail.com> <74E7417E02154082BB0191AC30D8968B@RaymondLaptop1> Message-ID: <456DACBE-3F51-4EF6-A7C7-EA588E339298@googlemail.com> On 10 Jun 2008, at 00:33, Raymond Hettinger wrote: > From: "Arnaud Delobelle" >> As a consequence of Python method implementation, one could write >>> instead: >>> >>> A = set.union(S, T, U, V) >>> B = set.intersection(S, T, U, V) >>> >>> which reads nicely >> I've written a patch [1] that does that. Following the suggestion of >> Raymond Hettinger, I've implemented set.intersection by sorting all >> its sets/frozensets/dicts in increasing order of size first, then >> iterating over the smallest. It's the first time I try my hand at >> this so it might not be up to much, but I've made it so I might as >> well send it :). It's against py3k svn. >> [1] http://bugs.python.org/issue3069 > > > Thanks. It looks like I beat you to it. But I will go over your code > and incorporate some version of the sorting for interections and > harvest the tests. Also, I'll go ahead and add you to Misc/ACKS. Thanks! I'm a bit ashamed of the tests though. -- Arnaud From dbpokorny at gmail.com Sat Jun 14 06:52:30 2008 From: dbpokorny at gmail.com (David Pokorny) Date: Fri, 13 Jun 2008 21:52:30 -0700 Subject: [Python-ideas] Sort statement Message-ID: In the spirit of going in the reverse direction of turning print into a function, what does python-ideas think of the sort statement? 
numlist = [2,5,4]
sort numlist
sort numlist asc # borrowed from SQL ORDER BY statement
sort numlist desc
sort by employee.last_name for employee in employee_list # this uses key sorting

The main advantage is that it is impossible to make this mistake:

x = y.sort()

when you really mean

x = sorted(y)

Cheers,
David

From talin at acm.org Sat Jun 14 21:43:18 2008
From: talin at acm.org (Talin)
Date: Sat, 14 Jun 2008 12:43:18 -0700
Subject: [Python-ideas] Sort statement
In-Reply-To:
References:
Message-ID: <48541F56.8070408@acm.org>

David Pokorny wrote:
> In the spirit of going in the reverse direction of turning print into
> a function, what does python-ideas think of the sort statement?

And here I was trying so hard to forget everything I ever learned about COBOL programming...

http://www.cs.niu.edu/~abyrnes/csci465/notes/465sort.htm

-- Talin

From greg.ewing at canterbury.ac.nz Sun Jun 15 01:48:50 2008
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 15 Jun 2008 11:48:50 +1200
Subject: [Python-ideas] Sort statement
In-Reply-To:
References:
Message-ID: <485458E2.8040907@canterbury.ac.nz>

David Pokorny wrote:
> In the spirit of going in the reverse direction of turning print into
> a function, what does python-ideas think of the sort statement?

I don't think that sorting is a frequent enough operation in general to justify having its own statement.

> The main advantage is that it is impossible to make this mistake:
>
> x = y.sort()

If you make that mistake, you find out about it very quickly, and you learn not to make it again.

Also, there are many other methods that have the same characteristic. Would you want to turn all of them into statements as well?
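For comparison, the existing API already covers each form of the proposed statement. This is a minimal sketch in present-day Python 3; the `Employee` record type and the sample data are assumptions made up for illustration.

```python
from collections import namedtuple
from operator import attrgetter

# Hypothetical record type, used only for this example.
Employee = namedtuple("Employee", "first_name last_name")

numlist = [2, 5, 4]

# The mistake from the thread: list.sort() sorts in place and returns None.
x = numlist.sort()

# "sort numlist desc" corresponds to sorting with reverse=True.
descending = sorted(numlist, reverse=True)

# "sort by employee.last_name for employee in employee_list" is a key sort.
employee_list = [
    Employee("Ada", "Lovelace"),
    Employee("Alan", "Turing"),
    Employee("Grace", "Hopper"),
]
by_last_name = sorted(employee_list, key=attrgetter("last_name"))
```

Here `x` ends up as `None` while `numlist` itself is sorted, which is the behaviour the thread is debating.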
-- Greg From bmintern at gmail.com Sun Jun 15 21:46:52 2008 From: bmintern at gmail.com (Brandon Mintern) Date: Sun, 15 Jun 2008 15:46:52 -0400 Subject: [Python-ideas] Sort statement In-Reply-To: <485458E2.8040907@canterbury.ac.nz> References: <485458E2.8040907@canterbury.ac.nz> Message-ID: <4c0fccce0806151246l6ff0e00fgac93de744ff59011@mail.gmail.com> On Sat, Jun 14, 2008 at 7:48 PM, Greg Ewing wrote: > I don't think that sorting is a frequent enough operation > in general to justify having its own statement. Agreed. >> The main advantage is that it is impossible to make this mistake: >> >> x = y.sort() > > If you make that mistake, you find out about it very > quickly, and you learn not to make it again. Yes, but having .sort() return self would also solve this problem without anything as radical as introducing a new keyword and syntax. I'm not saying that this should be done, but I think this would be a much better alternative than the proposed sort syntax. Brandon From cvrebert at gmail.com Mon Jun 16 00:19:26 2008 From: cvrebert at gmail.com (Chris Rebert) Date: Sun, 15 Jun 2008 15:19:26 -0700 Subject: [Python-ideas] Sort statement In-Reply-To: <4c0fccce0806151246l6ff0e00fgac93de744ff59011@mail.gmail.com> References: <485458E2.8040907@canterbury.ac.nz> <4c0fccce0806151246l6ff0e00fgac93de744ff59011@mail.gmail.com> Message-ID: <47c890dc0806151519p221df87u4e3b67b2f0010277@mail.gmail.com> On Sun, Jun 15, 2008 at 12:46 PM, Brandon Mintern wrote: > On Sat, Jun 14, 2008 at 7:48 PM, Greg Ewing wrote: >> I don't think that sorting is a frequent enough operation >> in general to justify having its own statement. > > Agreed. > >>> The main advantage is that it is impossible to make this mistake: >>> >>> x = y.sort() >> >> If you make that mistake, you find out about it very >> quickly, and you learn not to make it again. > > Yes, but having .sort() return self would also solve this problem > without anything as radical as introducing a new keyword and syntax. 
> I'm not saying that this should be done, but I think this would be a > much better alternative than the proposed sort syntax. But that would be more confusing and make it seem to the newbie that .sort() returns a *new* sorted list rather than sorting the list in-place. Returning None (or not returning anything, which has the same effect) is idiomatic in Python to indicate a method is a mutator. And they'll quickly get a "TypeError: unsubscriptable object" and learn this lesson if they use list.sort() incorrectly. Although I admit, that error message could be improved. At least including the object in question would be better, for instance: TypeError: unsubscriptable object "None" Or perhaps also changing "unsubscriptable" to something more comprehensible to newbies: TypeError: object "None" does not support the subscript operator - Chris R. > > Brandon > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From scott+python-ideas at scottdial.com Mon Jun 16 04:26:39 2008 From: scott+python-ideas at scottdial.com (Scott Dial) Date: Sun, 15 Jun 2008 22:26:39 -0400 Subject: [Python-ideas] Sort statement In-Reply-To: <47c890dc0806151519p221df87u4e3b67b2f0010277@mail.gmail.com> References: <485458E2.8040907@canterbury.ac.nz> <4c0fccce0806151246l6ff0e00fgac93de744ff59011@mail.gmail.com> <47c890dc0806151519p221df87u4e3b67b2f0010277@mail.gmail.com> Message-ID: <4855CF5F.40902@scottdial.com> Chris Rebert wrote: > Although I admit, that error message could be improved. At least > including the object in question would be better, for instance: > TypeError: unsubscriptable object "None" > Or perhaps also changing "unsubscriptable" to something more > comprehensible to newbies: > TypeError: object "None" does not support the subscript operator This error message was discussed on python-dev back in April, but I don't know that anything ever came from it. 
http://mail.python.org/pipermail/python-dev/2008-April/078744.html

It would be good if it was at least unified for all objects (which it was not at the time, maybe it is now..)

-Scott

--
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu

From pupeno at gmail.com Wed Jun 18 16:49:54 2008
From: pupeno at gmail.com (J. Pablo Fernández)
Date: Wed, 18 Jun 2008 15:49:54 +0100
Subject: [Python-ideas] PyUnit
Message-ID: <4287a62c0806180749m444aeb23oad3259419bcc8fb1@mail.gmail.com>

Hello,

I've made some improvements to PyUnit which add the concept of skipped tests (when, for instance, the setUp fails), and some other stuff. These changes are test-code-wise backward compatible, in the sense that old tests will just work. But since the output of running the tests is different, tools using that output, at any level (reading the text, or running the tests programmatically) will fail. I think it would be possible to have a switch somewhere to even then make them backward compatible.

Are there any chances of getting this code in Python 3k? I'd need to work on it, polish it, write more tests for it, etc, so I'd like to know if it'd be welcome or not before investing the time.

Thank you.

--
J. Pablo Fernández (http://pupeno.com)
Temporarily using pupeno at gmail.com.

From solipsis at pitrou.net Thu Jun 19 12:07:49 2008
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 19 Jun 2008 10:07:49 +0000 (UTC)
Subject: [Python-ideas] PyUnit
References: <4287a62c0806180749m444aeb23oad3259419bcc8fb1@mail.gmail.com>
Message-ID:

Hi,

> I've made some improvements to PyUnit which add the concept of
> skipped tests (when, for instance, the setUp fails), and some other
> stuff. These changes are test-code-wise backward compatible, in the
> sense that old tests will just work. But since the output of running
> the tests is different, tools using that output, at any level (reading
> the text, or running the tests programmatically) will fail.
I think it
> would be possible to have a switch somewhere to even then make them
> backward compatible.
>
> Are there any chances of getting this code in Python 3k?

It would probably be welcome. There are even open issues for unittest improvement in the bug tracker:

http://bugs.python.org/issue2578
http://bugs.python.org/issue2153
http://bugs.python.org/issue1034053

Unfortunately it seems nobody has had the time or motivation to finally solve them.

Regards

Antoine.

From rnd at onego.ru Tue Jun 24 19:41:54 2008
From: rnd at onego.ru (Roman Susi)
Date: Tue, 24 Jun 2008 20:41:54 +0300
Subject: [Python-ideas] Small nice addition to rlcompleter
Message-ID: <486131E2.3000801@onego.ru>

hi!

I had that proposal (stated back in 2001 here http://bugs.python.org/issue449227 ), with these main points, to quote myself:

<<<
I use rlcompleter extensively in interactive Python mode. I think it could be cool if callable objects were added "(" when completed. This way it will be much faster to program, without looking-up __doc__.

For example:

>>> f.fil

will give:

>>> f.fileno(_

("_" is to mark cursor position)

and:

>>> f.so

will (as before) give:

>>> f.softspace _

One more illustration:

>>> f = open("myfile", "w")
>>> f.
f.__class__(         f.__repr__(          f.next(
f.__delattr__(       f.__setattr__(       f.read(
f.__doc__            f.__str__(           f.readinto(
f.__enter__(         f.close(             f.readline(
f.__exit__(          f.closed             f.readlines(
f.__getattribute__(  f.encoding           f.seek(
f.__hash__(          f.fileno(            f.softspace
f.__init__(          f.flush(             f.tell(
f.__iter__(          f.isatty(            f.truncate(
f.__new__(           f.mode               f.write(
f.__reduce__(        f.name               f.writelines(
f.__reduce_ex__(     f.newlines           f.xreadlines(
>>> f.

- nice to remember which attributes are methods and which aren't.
>>>

There are patches recently made by Manuel Muradás and Facundo Batista for 2.6 (so I assume some people are interested in the patch, not only me), but I have no idea how (and if) it ever gets into Python? 2.6 and 3k.

It's a very small feature to make a PEP, but how then?
I hope rlcompleter will not get obsoleted before the patch is accepted ;-)

Thanks!

Regards,
Roman

From facundobatista at gmail.com Tue Jun 24 19:58:46 2008
From: facundobatista at gmail.com (Facundo Batista)
Date: Tue, 24 Jun 2008 14:58:46 -0300
Subject: [Python-ideas] Small nice addition to rlcompleter
In-Reply-To: <486131E2.3000801@onego.ru>
References: <486131E2.3000801@onego.ru>
Message-ID:

2008/6/24 Roman Susi :
> There are patches recently made by Manuel Muradás and Facundo Batista
> for 2.6 (so I assume some people are interested in the patch, not only
> me), but I have no idea how (and if) it ever gets into Python? 2.6 and 3k.

Manuel did all the work, not me; I'm just handling this because of the last Python Bug Day. If everything is ok, I should be working on this this week.

Regards,

--
. Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From santagada at gmail.com Tue Jun 24 20:12:04 2008
From: santagada at gmail.com (Leonardo Santagada)
Date: Tue, 24 Jun 2008 15:12:04 -0300
Subject: [Python-ideas] Small nice addition to rlcompleter
In-Reply-To: <486131E2.3000801@onego.ru>
References: <486131E2.3000801@onego.ru>
Message-ID:

On 24/06/2008, at 14:41, Roman Susi wrote:

> It's a very small feature to make a PEP, but how then? I hope
> rlcompleter
> will not get obsoleted before the patch is accepted ;-)

Why not make a PEP about something bigger then? I think python needs a more complete interactive interpreter... something that would work right after installing python. I think the language strives to be easy and with a smooth learning curve, this could probably help.
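The completion rule Roman describes at the start of this thread, appending "(" when the completed attribute is callable, can be sketched without readline. This is a minimal illustration in present-day Python 3, assuming a hypothetical helper `complete_attributes` and a made-up `Sample` class; it is not the code of the rlcompleter patch.

```python
def complete_attributes(obj, prefix):
    """Return attribute names of obj starting with prefix, adding "(" to callables."""
    matches = []
    for name in dir(obj):  # dir() returns the names in sorted order
        if not name.startswith(prefix):
            continue
        try:
            value = getattr(obj, name)
        except Exception:
            # Some attributes raise on access; offer the bare name.
            matches.append(name)
            continue
        matches.append(name + "(" if callable(value) else name)
    return matches


class Sample:
    """Hypothetical class used only to exercise the helper."""

    softspace = 0

    def fileno(self):
        return 3


method_matches = complete_attributes(Sample(), "f")
data_matches = complete_attributes(Sample(), "so")
```

The method name comes back with a trailing "(", while the plain data attribute is returned unchanged, mirroring the `f.fileno(` and `f.softspace` examples in the proposal.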
--
Leonardo Santagada


From facundobatista at gmail.com Tue Jun 24 20:30:22 2008
From: facundobatista at gmail.com (Facundo Batista)
Date: Tue, 24 Jun 2008 15:30:22 -0300
Subject: [Python-ideas] Small nice addition to rlcompleter
In-Reply-To:
References: <486131E2.3000801@onego.ru>
Message-ID:

2008/6/24 Leonardo Santagada:

> Why not make a pep about something bigger then? I think python needs a more
> complete interactive interpreter... something that would work right after
> installing python. I think the language strives to be easy and with a smooth
> learning curve, this could probably help.

+1.

Note, though, that it's not as easy as it sounds. For example, note
that the very useful and simple behaviour of doing up-arrow and
bringing the last line, is not handled by Python code, but by the
external library readline.

My point is: you can propose a lot of things (I surely will love
autocompletion and better block management), but how would you achieve
that in a multiplatform way?

Thanks!

--
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/


From rnd at onego.ru Tue Jun 24 21:04:49 2008
From: rnd at onego.ru (Roman Susi)
Date: Tue, 24 Jun 2008 22:04:49 +0300
Subject: [Python-ideas] Small nice addition to rlcompleter
In-Reply-To:
References: <486131E2.3000801@onego.ru>
Message-ID: <48614551.3040407@onego.ru>

Facundo Batista wrote:
> 2008/6/24 Leonardo Santagada:
>
>> Why not make a pep about something bigger then? I think python needs a more
>> complete interactive interpreter... something that would work right after
>> installing python. I think the language strives to be easy and with a smooth
>> learning curve, this could probably help.
>
> +1.
>
> Note, though, that it's not as easy as it sounds. For example, note
> that the very useful and simple behaviour of doing up-arrow and
> bringing the last line, is not handled by Python code, but by the
> external library readline.

pyreadline?
Pure Python shell with capabilities of plugins and kind of WSGI but for
shell functions, bringing Python shell to anywhere (Unix CLI,
smartphone, IDE, webconsole, IRC, ...) with the same abstraction level?
Over XML?

Cool idea, isn't it? (security is a little bit of a concern though)

It may redefine the whole idea of what a programming language's
interactive shell is.

Then if a Python ShAPI is established, readline can be reused in the
form of a plugin...

> My point is: you can propose a lot of things (I surely will love
> autocompletion and better block management), but how would you achieve
> that in a multiplatform way?
>
> Thanks!
>

Regards,
Roman


From taleinat at gmail.com Wed Jun 25 06:55:20 2008
From: taleinat at gmail.com (Tal Einat)
Date: Wed, 25 Jun 2008 07:55:20 +0300
Subject: [Python-ideas] Small nice addition to rlcompleter
In-Reply-To:
References: <486131E2.3000801@onego.ru>
Message-ID: <7afdee2f0806242155r1f54dbe9oca96cb38962c6649@mail.gmail.com>

Roman Susi wrote:

> I think it could be cool if callable objects were added "(" when
> completed.

Facundo Batista wrote:

> Leonardo Santagada:
>
> > Why not make a pep about something bigger then? I think python needs a
> > more complete interactive interpreter... something that would work
> > right after installing python. I think the language strives to be easy
> > and with a smooth learning curve, this could probably help.
>
> +1.
>
> Note, though, that it's not as easy as it sounds. For example, note
> that the very useful and simple behaviour of doing up-arrow and
> bringing the last line, is not handled by Python code, but by the
> external library readline.
>
> My point is: you can propose a lot of things (I surely will love
> autocompletion and better block management), but how would you achieve
> that in a multiplatform way?

IDLE?
Which has auto-completion, BTW, and for which I wrote a patch two years
ago which adds () after a completed callable, placing the cursor in
between these parentheses, and bringing up the callable's call-tip while
it's at it (without obscuring the current line - yay GUI!). The patch
was never posted to the Python issue tracker because I thought there was
no interest, but it would be easy to do so.

(more rambling ahead...)

The annoying bit about my implementation was that I had to use the right
arrow key in order to move past the closing parenthesis. This could be
overcome by just adding the opening '(' as suggested above, or perhaps
by making closing the parenthesis by typing ')' simply "overwrite" the
existing ')' character (with good recognition of when you're just typing
a ')' in a string or closing an inner pair of parentheses, of course).

While I was at it, I also made it complete dict keys (only complete-able
keys like strings and numbers) and auto-magically add [] after completed
dicts (with the cursor placed in between), which I found to be
surprisingly useful in interactive work.

- Tal

P.S. Thanks to Shai Geva for suggesting that I implement the above
mentioned features.


From pupeno at gmail.com Wed Jun 25 11:16:28 2008
From: pupeno at gmail.com (J. Pablo Fernández)
Date: Wed, 25 Jun 2008 10:16:28 +0100
Subject: [Python-ideas] Small nice addition to rlcompleter
In-Reply-To: <7afdee2f0806242155r1f54dbe9oca96cb38962c6649@mail.gmail.com>
References: <486131E2.3000801@onego.ru>
 <7afdee2f0806242155r1f54dbe9oca96cb38962c6649@mail.gmail.com>
Message-ID: <4287a62c0806250216l7cc19936id4b5d7c234a1c5de@mail.gmail.com>

Something else you could add is that pressing backspace just after the
completion, deleting the open paren, will also automatically delete the
closing paren.
I would also add tab as a key to jump beyond the closing paren.
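
The core of the idea discussed in this thread (append "(" to a
completion when the completed name is callable) can be sketched without
readline at all. This is only an illustration under assumptions: the
helper name complete_attr and the Demo class are hypothetical, and this
is not the code of any of the patches mentioned above.

```python
def complete_attr(obj, prefix):
    """Return attribute names of obj that start with prefix.

    Names that resolve to callables get a trailing "(", as proposed
    in the thread. Hypothetical helper, not part of rlcompleter.
    """
    matches = []
    for name in sorted(dir(obj)):
        if not name.startswith(prefix):
            continue
        try:
            value = getattr(obj, name)
        except Exception:
            # Some attributes raise on access; offer the bare name.
            matches.append(name)
            continue
        matches.append(name + "(" if callable(value) else name)
    return matches


class Demo:
    closed = False      # plain data attribute: completes as "closed"

    def close(self):    # method: completes as "close("
        pass


print(complete_attr(Demo(), "clo"))
```

Keeping the callable check in one small function makes it easy to vary
the suffix, for example "()" with the cursor placed in between, as in
the IDLE variant described above.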
--
Sent from Gmail for mobile | mobile.google.com

J. Pablo Fernández (http://pupeno.com)
Temporarily using pupeno at gmail.com.


From andrew at atoulou.se Wed Jun 25 18:53:37 2008
From: andrew at atoulou.se (Andrew Toulouse)
Date: Wed, 25 Jun 2008 09:53:37 -0700
Subject: [Python-ideas] Small nice addition to rlcompleter
In-Reply-To: <4287a62c0806250216l7cc19936id4b5d7c234a1c5de@mail.gmail.com>
References: <486131E2.3000801@onego.ru>
 <7afdee2f0806242155r1f54dbe9oca96cb38962c6649@mail.gmail.com>
 <4287a62c0806250216l7cc19936id4b5d7c234a1c5de@mail.gmail.com>
Message-ID:

Yeah, I think tab-to-complete as well as closeparen-to-complete are
fairly standard, and I'd appreciate those.

Incidentally, would this be a generalized framework so we could, say,
implement a DSL and a shell within python, or would it specifically be a
python shell? Obviously, Python itself already has a pretty interactive
interpreter in the form of IPython...

--Andy

On Wed, Jun 25, 2008 at 2:16 AM, J. Pablo Fernández wrote:

> Something else you could add is that pressing backspace just after the
> completion, deleting the open paren will also automatically delete the
> closing paren.
> I would also add tab as a key to jump beyond the closing paren.


From chriscederstrom at gmail.com Wed Jun 25 23:55:08 2008
From: chriscederstrom at gmail.com (Christopher Cederstrom)
Date: Wed, 25 Jun 2008 14:55:08 -0700
Subject: [Python-ideas] (no subject)
Message-ID: <86c08fc80806251455g59e34cf0la4e7db5c00a1a55a@mail.gmail.com>

remove chriscederstrom at gmail.com


From brett at python.org Thu Jun 26 00:29:12 2008
From: brett at python.org (Brett Cannon)
Date: Wed, 25 Jun 2008 15:29:12 -0700
Subject: [Python-ideas] (no subject)
In-Reply-To: <86c08fc80806251455g59e34cf0la4e7db5c00a1a55a@mail.gmail.com>
References: <86c08fc80806251455g59e34cf0la4e7db5c00a1a55a@mail.gmail.com>
Message-ID:

On Wed, Jun 25, 2008 at 2:55 PM, Christopher Cederstrom wrote:
> remove chriscederstrom at gmail.com
>

You need to go to http://mail.python.org/mailman/listinfo/python-ideas
to unsubscribe.
-Brett


From dbpokorny at gmail.com Thu Jun 26 06:36:25 2008
From: dbpokorny at gmail.com (David Pokorny)
Date: Wed, 25 Jun 2008 21:36:25 -0700
Subject: [Python-ideas] Starred expression in right-hand side
Message-ID:

Since PEP 3132 gives us:

>>> x = [1,2,3]
>>> a, *b = x
>>> a
1
>>> b
[2, 3]

it seems natural that we should be able to do it the other way too:

(doesn't actually work)
>>> a, b = 1, [2,3]
>>> x = [a,*b]
>>> x
[1, 2, 3]

This is essentially itertools.chain, but of course it isn't nearly as
much fun:

>>> [n for n in itertools.chain([1],[2,3])]
[1, 2, 3]

Now you might be thinking, "yeah, that's cool, but you don't really
need it" but this actually came up in practice: I have a function that
has a certain behavior for standard types, but when it sees a type it
doesn't recognize, it calls a protocol function (like __iter__ or
__next__) and expects to receive an iterable whose 0th element is a
string. Now I'm probably not going to use itertools.chain because the
only members of itertools I can reliably remember are count() and
izip(), so my next alternative (which is perfectly acceptable) is

>>> a, b = 1, [2,3]
>>> x = [a] + b

This isn't too bad, but it is slightly less clear than x = [a,*b] and
much less efficient when b is long.

It seems so simple...makes me think it came up before and I missed it.
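
For readers on a later interpreter: the starred spelling inside a list
display was eventually accepted (PEP 448, Python 3.5). A minimal
side-by-side of the three spellings from this message, assuming Python
3.5 or later; the variable names are only for illustration.

```python
import itertools

a, b = 1, [2, 3]

starred = [a, *b]                        # unpacking inside a list display
concatenated = [a] + b                   # builds the temporary list [a] first
chained = list(itertools.chain([a], b))  # iterator-based equivalent

print(starred, concatenated, chained)
```

All three build a new list of the same length, so the difference is
mostly one of spelling rather than of asymptotic cost.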
David


From arnodel at googlemail.com Thu Jun 26 08:06:38 2008
From: arnodel at googlemail.com (Arnaud Delobelle)
Date: Thu, 26 Jun 2008 07:06:38 +0100
Subject: [Python-ideas] Starred expression in right-hand side
In-Reply-To:
References:
Message-ID: <33A8FEFC-7700-43A0-BA3E-865A09B10F01@googlemail.com>

On 26 Jun 2008, at 05:36, David Pokorny wrote:

[snip]
>>>> a, b = 1, [2,3]
>>>> x = [a] + b
>
> This isn't too bad, but it is slightly less clear than x = [a,*b] and
> much less efficient when b is long.

What makes you think it is much less efficient?

--
Arnaud


From guido at python.org Thu Jun 26 15:49:21 2008
From: guido at python.org (Guido van Rossum)
Date: Thu, 26 Jun 2008 06:49:21 -0700
Subject: [Python-ideas] Starred expression in right-hand side
In-Reply-To:
References:
Message-ID:

There is even a patch for this. http://bugs.python.org/issue2292

It is a very complex piece of code though and due to lack of time will
not make it into 3.0.
(And no, if someone picks up the work now I still won't let it into
3.0 -- we need to stabilize the release. There will always be 3.1.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


From dbpokorny at gmail.com Thu Jun 26 20:42:21 2008
From: dbpokorny at gmail.com (dbpokorny at gmail.com)
Date: Thu, 26 Jun 2008 11:42:21 -0700 (PDT)
Subject: [Python-ideas] Starred expression in right-hand side
In-Reply-To: <33A8FEFC-7700-43A0-BA3E-865A09B10F01@googlemail.com>
References: <33A8FEFC-7700-43A0-BA3E-865A09B10F01@googlemail.com>
Message-ID:

On Jun 25, 11:06 pm, Arnaud Delobelle wrote:
> What makes you think it is much less efficient?

AFAIK the intermediate list is discarded immediately after being
constructed.

I think the main thing the generalized star expression has going for it
is aesthetic value, "cool" factor, and less keyboard typing.

David


From andre.roberge at gmail.com Mon Jun 30 03:41:39 2008
From: andre.roberge at gmail.com (Andre Roberge)
Date: Sun, 29 Jun 2008 22:41:39 -0300
Subject: [Python-ideas] Reducing colon uses to increase readability
Message-ID: <7528bcdd0806291841y7dc202f3y79176461a5d07165@mail.gmail.com>

Hi everyone,

In Python, the humble colon (:) has multiple uses:

1. as a signal to indentation increase, signaling a block of code, such as
1a) for function or class definitions
1b) for while/for/if/elif/else blocks
1c) for try/except/finally blocks

In these cases, the majority opinion (to which I subscribe) is that
using a colon increases readability. I am NOT suggesting removing the
colon in those instances.

However, the colon also has some other uses.

2. in slices [a:b:c]
3. in dict assignments {a:b}
4. in lambda assignments (lambda x: x+1)

I would argue that, in these last three examples, there might be better
choices.
(some of these choices have been inspired by reading
http://www.resolversystems.com/documentation/index.php/Differences_Between_Resolver_One%27s_Formula_Language_and_Python_Expressions)

I don't expect the following suggestions to immediately convince
everyone (or anyone!) ... but, at least they will be on record.

Slices:
---------

I would argue that the usual slicing notation would be more readable
if it were as follows:

[a -> b; c]

Thus
[1:10:2] would become [1 -> 10; 2]
[1:10] would become [1 -> 10]

The "shorter" combinations would not gain in terms of readability;
they would be as follows:

[ :10 : 2] would become [10; 2]
[ :10] would become [10;]
[:: -1] would become [; -1]
[:] would become [;]

If such a change were to be made, a second slicing notation, *with a
different meaning*, could be introduced:

[a => b; c]

This would be an inclusive range, i.e.

[a => b] is equivalent to [a -> b+1]

dict assignments
------------------------

Here again, I would argue that using "->" instead of ":" would make
the code more readable - at least for beginners.

numbers = {'one' -> 1, 'two' -> 2} instead of
numbers = {'one': 1, 'two': 2}

lambda assignments
---------------------------

Once again, same choice.

lambda x -> x+1
is, I think, more readable than
lambda x: x+1

(but perhaps the last two [dicts and lambda] largely depend on the
font choice...)

======
Other considerations:

If "->" were to be adopted for dict or lambda assignments, then the
"naturalness" of their choice for slices would be reduced. An
alternative might be inspired from the mathematical notation

[a, ..., b; c]

I realize that this is "much" longer than [a: b: c].

Final comment:

I have seen other alternatives for simple slices suggested in the past
such as [a..b] and [a...b] which would be the equivalent of [a->b] and
[a=>b]; however, the extra "." might sometimes be difficult to read,
whereas the difference between "->" and "=>" is much easier to see.

Cheers,
André
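
The notations proposed above all have equivalents in today's slice
objects, which makes the intended meaning easy to check. A small sketch,
assuming non-negative indices and a positive step; the helper name
inclusive is hypothetical and merely stands in for the proposed
[a => b; c].

```python
def inclusive(start, stop, step=None):
    """Slice that also covers index `stop`, like the proposed [a => b; c].

    Hypothetical helper; assumes non-negative indices and a positive step.
    """
    return slice(start, stop + 1, step)


data = list(range(20))

half_open = data[1:10:3]             # proposed [1 -> 10; 3]
closed = data[inclusive(1, 10, 3)]   # proposed [1 => 10; 3]

print(half_open)
print(closed)

# The "shorter" forms correspond to omitted slice fields:
assert data[:10:2] == data[slice(None, 10, 2)]     # proposed [10; 2]
assert data[::-1] == data[slice(None, None, -1)]   # proposed [; -1]
```

With a step of 3 the two forms differ, because index 10 is reachable
from index 1 and only the inclusive form keeps it.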


From tjreedy at udel.edu Mon Jun 30 06:01:01 2008
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 30 Jun 2008 00:01:01 -0400
Subject: [Python-ideas] Reducing colon uses to increase readability
In-Reply-To: <7528bcdd0806291841y7dc202f3y79176461a5d07165@mail.gmail.com>
References: <7528bcdd0806291841y7dc202f3y79176461a5d07165@mail.gmail.com>
Message-ID:

Andre Roberge wrote:

> In Python, the humble colon (:) has multiples uses:
>
> 1. as a signal to indentation increase, signaling a block of code,
[snip]
> 2. in slices [a:b:c]
> 3. in dict assignments {a:b}
> 4. in lambda assignments (lambda x: x+1)
>
> I would argue that, in these last three examples, there might be better choices.

It is a bit late in Python's career to make such changes, which would
break nearly all substantial programs for at best a small visual gain.

-> is slightly harder to type than : and to me uglier.

Any new use of ';' has to neither conflict with its current use nor
introduce ambiguities that would push Python out of its current LL(1) (I
believe it is) grammar class.

'key: item' comports with 'keyword-or-phrase: explanation' constructions
in English.

lambda expressions abbreviate def statements:
def name(args): return expression  =>  lambda args: expression
The ':' separates header and body in both.

I agree that slices could have used something else, but....

don't hold your breath for a code-breaking change now.

Terry Jan Reedy


From andre.roberge at gmail.com Mon Jun 30 14:59:31 2008
From: andre.roberge at gmail.com (Andre Roberge)
Date: Mon, 30 Jun 2008 09:59:31 -0300
Subject: [Python-ideas] Reducing colon uses to increase readability
In-Reply-To:
References: <7528bcdd0806291841y7dc202f3y79176461a5d07165@mail.gmail.com>
Message-ID: <7528bcdd0806300559o257a3b27nded986dfd358a5d8@mail.gmail.com>

On Mon, Jun 30, 2008 at 1:01 AM, Terry Reedy wrote:
> Andre Roberge wrote:
[snip]
> Any new use of ';' has to neither conflict with its current use nor
> introduce ambiguities that would push Python out of its current LL(1) (I
> believe it is) grammar class.
>

Thanks for the information; I've learned something new.

[snip]

> I agree that slices could have used something else, but....
>
> don't hold your breath for a code-breaking change now.
>

I wasn't... I was thinking that, if any of these ideas were to be seen
to have some merit, they would make their way in around Python 3.8 (10
to 12 years from now ;-)

André


From grosser.meister.morti at gmx.net Mon Jun 30 15:13:33 2008
From: grosser.meister.morti at gmx.net (Mathias Panzenböck)
Date: Mon, 30 Jun 2008 15:13:33 +0200
Subject: [Python-ideas] Reducing colon uses to increase readability
In-Reply-To: <7528bcdd0806300559o257a3b27nded986dfd358a5d8@mail.gmail.com>
References: <7528bcdd0806291841y7dc202f3y79176461a5d07165@mail.gmail.com>
 <7528bcdd0806300559o257a3b27nded986dfd358a5d8@mail.gmail.com>
Message-ID: <4868DBFD.3010706@gmx.net>

The only place where I think : could be problematic is slicing. Other
than that I don't see any problem.

maybe this syntax would be better?
sequence[start..end,step]
sequence[start..,step]
sequence[..end,step]
sequence[start..end]
...

Well, but the , would be problematic.
How to distinguish between the tuple ((start..end),step) and the slice
object (start..end,step)? So this syntax isn't a good idea either.
However, I think ".." is much better than ":". But changing this syntax
is way too problematic. This should have been done/thought about before
the syntax was introduced. Now it's too late anyway. (And the current
syntax isn't *that* bad.)

-panzi


From qrczak at knm.org.pl Mon Jun 30 15:22:16 2008
From: qrczak at knm.org.pl (Marcin ‘Qrczak’ Kowalczyk)
Date: Mon, 30 Jun 2008 15:22:16 +0200
Subject: [Python-ideas] Reducing colon uses to increase readability
In-Reply-To: <4868DBFD.3010706@gmx.net>
References: <7528bcdd0806291841y7dc202f3y79176461a5d07165@mail.gmail.com>
 <7528bcdd0806300559o257a3b27nded986dfd358a5d8@mail.gmail.com>
 <4868DBFD.3010706@gmx.net>
Message-ID: <3f4107910806300622w2b1986a4l8620cfb6afa2f85c@mail.gmail.com>

2008/6/30 Mathias Panzenböck:

> Well, but the , would be problematic. How to distinguish between the tuple
> ((start..end),step) and the slice object (start..end,step)?

This is not problematic, as you just showed: ((start..end),step) is
a tuple, (start..end,step) is a slice.

--
Marcin Kowalczyk
qrczak at knm.org.pl
http://qrnik.knm.org.pl/~qrczak/


From rrr at ronadam.com Mon Jun 30 17:59:33 2008
From: rrr at ronadam.com (Ron Adam)
Date: Mon, 30 Jun 2008 10:59:33 -0500
Subject: [Python-ideas] Reducing colon uses to increase readability
In-Reply-To:
References: <7528bcdd0806291841y7dc202f3y79176461a5d07165@mail.gmail.com>
Message-ID: <486902E5.2020708@ronadam.com>

Terry Reedy wrote:

> I agree that slices could have used something else, but....

Well there is always the slice object.

slice(start, stop, step)

Maybe if slice was more interchangeable with range or xrange, or if
range objects could be used in place of slice objects?
A few results from Python 2.5:

>>> s = slice(10, 20, 3)

>>> range(s)
Traceback (most recent call last):
  File "", line 1, in
TypeError: range() integer end argument expected, got slice.

>>> range(100)[s]
[10, 13, 16, 19]

>>> xrange(s)
Traceback (most recent call last):
  File "", line 1, in
TypeError: an integer is required

>>> xrange(100)[s]
Traceback (most recent call last):
  File "", line 1, in
TypeError: sequence index must be integer, not 'slice'

>>> list(xrange(100))[s]
[10, 13, 16, 19]


From leif.walsh at gmail.com Mon Jun 30 23:34:07 2008
From: leif.walsh at gmail.com (Leif Walsh)
Date: Mon, 30 Jun 2008 14:34:07 -0700
Subject: [Python-ideas] Reducing colon uses to increase readability
In-Reply-To: <7528bcdd0806291841y7dc202f3y79176461a5d07165@mail.gmail.com>
References: <7528bcdd0806291841y7dc202f3y79176461a5d07165@mail.gmail.com>
Message-ID:

On Sun, Jun 29, 2008 at 6:41 PM, Andre Roberge wrote:
> In Python, the humble colon (:) has multiples uses:
>
> 1. as a signal to indentation increase, signaling a block of code, such as
[snip]
> However, the colon has also some other uses.
>
> 2. in slices [a:b:c]

I agree this isn't the clearest it could be.

> 3. in dict assignments {a:b}

This mirrors a number of existing languages, including English, and,
more or less (depending on your priorities) important, JSON. It's
always been comforting that valid Python structures (in fact, simply
printed) are instantly valid JSON (for the most part...with ints and
strings and such).

> 4. in lambda assignments (lambda x: x+1)

This (correctly) mirrors a standard function definition. The only way
it could be closer is with something like lambda(x): x+1, but then
this is not what's up for debate.

> I would argue that, in these last three examples, there might be better choices.
> (some of these choices have been inspired by reading
> http://www.resolversystems.com/documentation/index.php/Differences_Between_Resolver_One%27s_Formula_Language_and_Python_Expressions)

Are you suggesting this because you work with both languages? This
e-mail seems a bit self-serving, because of the inclusion of someone's
in-house language spec.

> Slices:
> ---------
>
> I would argue that, the usual slicing notation would be more readable
> if it were as follows:
>
> [a -> b; c]
>
> Thus
> [1:10:2] would become [1 -> 10; 2]
> [1:10] would become [1 -> 10]
>
> The "shorter" combinations would not gain in terms of readability;
> they would be as follows:
>
> [ :10 : 2] would become [10; 2]
> [ :10] would become [10;]
> [:: -1] would become [; -1]
> [:] would become [;]
>
>
> If such a change were to be made, an second slicing notation, *with a
> different meaning*, could be introduced:
>
> [a => b; c]
>
> This would be an inclusive range, i.e.
>
> [a => b] is equivalent to [a -> b+1]

I am always very wary of multiple-character symbols. They are harder
to type, harder to read, harder to parse (in a compiler or an editor),
and open the language up to an unbounded number of (dare I say it?)
Perlisms.

That said, I'm not sure 'arrows' are even the right approach for
slices. Slices should be thought of as ranges, which usually lend
themselves to ellipses. I remember (loosely, from a long time ago)
Ruby having '..' and '...' as exclusive and inclusive ranges, and I
really liked that.

With regard to the third item in a slice, the increment value, I
almost never use it, because it seems to make code a lot harder to
read clearly. If I feel the need to use it, it's usually a good
indicator that I need to restructure my code, and if it's absolutely
necessary, I'll typically just iterate over the list with a for loop
so that I can understand what I was doing when I come back.
If my half-suggestion of ellipses were taken up, I'd say that the
colon could stay as the separator between the second and third
arguments (and, as someone said already, the semi-colon introduces
some weird parsing problems and possible ambiguities).

> dict assignments
> ------------------------
>
> Here again, I would argue that using "->" instead of ":" would make
> the code more readable - at least for beginners.
>
> numbers = {'one' -> 1, 'two' -> 2} instead of
> numbers = {'one': 1, 'two': 2}

Like I said before, the colon is a widely-accepted way to separate
keys and values in a dict. The only strange case I can see with this
is something like:

functions = {'plus': lambda x, y: x+y, 'minus': lambda x, y: x-y}

In fact, I'm not sure if this _is_ legal python, so before running it,
I'd just parenthesize out the lambda expressions to be sure anyway,
and this clears everything up nicely:

functions = {'plus': (lambda x, y: x+y), 'minus': (lambda x, y: x-y)}

> lambda assignments
> ---------------------------
>
> Once again, same choice.
>
> lambda x -> x+1
> is, I think, more readable than
> lambda x: x+1
>
> (but perhaps the last two [dicts and lambda] largely depends on the
> font choice...)

As a pseudo-mathematician (and a recent student of Erlang), this is
quite appealing, for a few reasons. First, let me say that the
obvious "f(x) -> x**2 shows up all over math" is not the correct
reason to say this is correct notation for functions. Python
functions are procedures, not expressions (as they are in Erlang and
Haskell, where the arrow-notation is commonplace). As such, a colon
separating the function's name from its definition makes perfect
sense, as this is the way we write English all the time, and I've seen
more than one professor write pseudocode just like this.

However, lambda functions _are_ single-expressions, not blocks. This
leads me to believe that the arrow could be a good delimiter (except
for my above statement that multiple-character symbols suck).
Unfortunately for the arrow, it seems that priority in Python syntax
is given to consistency within itself, rather than consistency with
the outside world, so the fact that "lambda x: x**2" is consistent
with "def sq(x): x**2" probably pulls more weight. Let me just say
that putting something like the arrow (especially if we ever allow
non-ASCII characters into the syntax) in lambda expressions would not
be totally distasteful to me.

> ======
> Other considerations:
>
> If "->" were to be adopted for dict or lambda assignments, then the
> "naturalness" of their choice for slices would be reduced. An
> alternative might be inspired from the mathematical notation
>
> [a, ..., b; c]
>
> I realize that this is "much" longer than [a: b: c].
>
> Final comment:
>
> I have seen other alternatives for simple slices suggested in the past such as
> [a..b] and [a...b] which would be the equivalent of [a->b] and [a=>b];
> however, the extra "." might sometimes be difficult to read, whereas
> the difference between "->" and "=>" is much easier to see.

You're right. This is one of the reasons I hate Ruby. Yet another
reason to ignore your suggestion for slices :-).

--
Cheers,
Leif
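
Two points from this message can be checked directly. The dict of
unparenthesized lambdas is legal, and a def whose body is a bare
expression is not quite the counterpart of the lambda, since it returns
None unless it says return. A quick check, written for Python 3; the
function names are only for illustration.

```python
# Unparenthesized lambdas as dict values parse fine: each lambda's
# parameter list ends at its own colon, and its body ends at the
# comma (or closing brace) of the enclosing dict display.
functions = {'plus': lambda x, y: x + y, 'minus': lambda x, y: x - y}


def sq_no_return(x):
    x ** 2              # expression statement; the value is discarded


def sq(x):
    return x ** 2       # the def that matches "lambda x: x ** 2"


print(functions['plus'](2, 3), functions['minus'](2, 3))
print(sq_no_return(4), sq(4), (lambda x: x ** 2)(4))
```

The parenthesized form shown earlier in the message is equivalent and
may still be preferred purely for readability.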