From cf.natali at gmail.com Mon Apr 1 11:21:12 2013
From: cf.natali at gmail.com (Charles-François Natali)
Date: Mon, 1 Apr 2013 11:21:12 +0200
Subject: [Python-ideas] [Python-Dev] A bit about the GIL
In-Reply-To:
References:
Message-ID:

> I know this may be tiresome by now, so feel free to ignore, but I'd like to
> share with the list an idea about the GIL, more specifically the reference
> counting mechanism.
>
> Simply put, make the reference counter a sharded one. That is, separate it
> into several subcounters, in this case one for each thread.

Yeah, that's known as sloppy counters, see e.g.
http://pdos.csail.mit.edu/papers/linux:osdi10.pdf, section 4.3.

Actually you don't need a per-thread counter, only per-cpu (see
sched_getcpu()/getcpu()), although I'm not sure it'd be as fast as using
thread-register like Trent does.

> Unfortunately, in a crude test of mine there is already a severe performance
> degradation, and that is without rwlocks. I've used a basic linked list,
> and changed the INCREF/DECREF macros to functions to accommodate the extra
> logic, so it may not be the best approach (too many dereferences).

I think that's a dead end. The extra indirection and cache misses will kill
performance. Also, such counters are mostly useful for data structures which
are only seldom deallocated (because reconciling the counters is expensive),
and with garbage collection you tend to have many short-lived objects
(generational hypothesis). Finally, that'll increase the size of objects
considerably (since you need a counter per core/thread), which is bad for
cache.

> Does this make sense to anyone?

You can try ;-)

cf

P.S.: Starting a thread on the GIL on April fools' day is a great idea!
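[To make the sharding idea being discussed concrete, here is a toy Python
sketch of per-thread subcounters with reconciliation on zero. The class and
method names are illustrative only; CPython's real refcounting lives in C,
inline in each object, and the proposal assumes incref/decref pairs happen
in the same thread in the common case.]

```python
import threading

class ShardedRefcount:
    """Toy model of the sharded counter idea: each thread updates only
    its own shard, and the object is considered dead once all shards
    have been reconciled away. Illustrative only, not CPython's design.
    """
    def __init__(self):
        self._lock = threading.Lock()  # taken only on the slow path
        self._shards = {}              # thread id -> per-thread count
        self.dead = False              # stand-in for deallocation

    def incref(self):
        tid = threading.get_ident()
        # Uncontended fast path: a thread touches only its own slot,
        # creating it on first use.
        self._shards[tid] = self._shards.get(tid, 0) + 1

    def decref(self):
        tid = threading.get_ident()
        self._shards[tid] -= 1
        if self._shards[tid] == 0:
            # Slow path: this shard hit zero, so reconcile against the
            # other shards to see whether it was the last one.
            with self._lock:
                del self._shards[tid]
                if not self._shards:
                    self.dead = True  # no subcounters left: deallocate
```

[As the replies point out, the slow path and the extra per-object storage
are exactly where this scheme hurts for short-lived objects.]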
From cf.natali at gmail.com Mon Apr 1 11:35:34 2013
From: cf.natali at gmail.com (Charles-François Natali)
Date: Mon, 1 Apr 2013 11:35:34 +0200
Subject: [Python-ideas] [Python-Dev] A bit about the GIL
In-Reply-To:
References:
Message-ID:

> Actually you don't need a per-thread counter, only per-cpu (see
> sched_getcpu()/getcpu()), although I'm not sure it'd be as fast as
> using thread-register like Trent does.

Of course, per-cpu only holds for kernel-space; in userland you could get
preempted... So you'd need O(number of threads) counters per object (and
allocating them on demand will probably kill performance).

cf

From trent at snakebite.org Mon Apr 1 19:46:20 2013
From: trent at snakebite.org (Trent Nelson)
Date: Mon, 1 Apr 2013 13:46:20 -0400
Subject: [Python-ideas] [Python-Dev] A bit about the GIL
In-Reply-To:
References:
Message-ID: <20130401174620.GA3878@snakebite.org>

On Sun, Mar 31, 2013 at 04:14:11PM -0700, Alfredo Solano Martínez wrote:
> Hi,
>
> I know this may be tiresome by now, so feel free to ignore, but I'd like to
> share with the list an idea about the GIL, more specifically the reference
> counting mechanism.

I've been making pretty good progress with my pyparallel work.
See the initial slides here: http://speakerdeck.com/trent/parallelizing-the-python-interpreter-an-alternate-approach-to-async And follow-up thread starting here: http://mail.python.org/pipermail/python-dev/2013-March/124690.html I've since set up a separate mailing list for it here: http://lists.snakebite.net/mailman/listinfo/pyparallel/ And switched to bitbucket.org for the primary repo (I still commit to hg.python.org/sandbox/trent too, though): https://bitbucket.org/tpn/pyparallel TL;DR version: I've come up with a way to exploit multiple cores without impeding the performance of "single-threaded execution", and also coupled it with a host of async facilities that allow you to write Twisted/Tulip style protocols but have callbacks automatically execute across all cores. (In the case of client/server facilities, one neat feature is that it'll automatically switch between sync and async socket methods based on concurrent clients and available cores; there is a non-negligible overhead to doing async IO versus blocking IO -- if you have 64 cores and only 32 clients, there's no reason not to attempt sync send and recvs; this will maximize throughput. As soon as the client count exceeds available cores, it'll automatically do async sends/recvs for everything, improving concurrency (at the expense of throughput). The chargen example in the slides is a perfect example of this in action.) > Simply put, make the reference counter a sharded one. That is, separate it > into several subcounters, in this case one for each thread. > > The logic would then be something like this: > - when increasing the refcount, a thread writes only to its own subcounter, > creating one first if necessary. > - similarly, when decreasing the refcount, there is no need to access other > subcounters until that subcounter reaches zero. > - when a subcounter gets to zero, delete it, and read the other subcounters > to check if it was the last one. 
> - delete the object only if there are no more subcounters. > > Contention could then be reduced to a minimum, since a thread only needs to > read other subcounters when its own reaches zero or wants the total value. > Depending on the implementation it might help with false sharing too, as > subcounters may or may not be in the same cache-line. > > Unfortunately, in a crude test of mine there is already a severe performance > degradation, and that is without rwlocks. I've used a basic linked list, > and changed the INCREF/DECREF macros to functions to accommodate the extra > logic so it may not be the best approach (too many dereferences). > > Does this makes sense to anyone? My personal (and extremely biased now that I've gained some momentum with pyparallel) opinion is that trying to solve the free threading problem is wrong. That is, allowing existing Python code written to use threading.Threads() to execute concurrently across all cores. The only way that could ever be achieved is with the introduction of fine grained locking or STM-type facilities; both of which seriously impede single-threaded performance. You might find this interesting, too: http://hg.python.org/sandbox/trent.peps/file/6de5ed566af9/pep-async.txt I wrote that the weekend before I started actually coding on pyparallel. I think the only person that's seen/read it is Guido (and maybe Antoine). It goes into detail regarding the rationale for the design decisions I took (especially re: avoiding fine grained locks/STM etc). Trent. From ericsnowcurrently at gmail.com Mon Apr 1 20:17:15 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Mon, 1 Apr 2013 12:17:15 -0600 Subject: [Python-ideas] ProtoPEP: A Standardized Marker for Mirroring the TTY to Other Devices Message-ID: I received this proposal from an interested party and am passing it along for consideration. 
I am concerned about a conflict of interest, as the author has ties to the
paper industry and thus could stand to benefit financially from the
proposal, but am willing to give them the benefit of the doubt.

-eric

####################################

PEP: 4XX
Title: A Standardized Marker for Mirroring the TTY to Other Devices
Version: $Revision$
Last-Modified: $Date$
Author: Michael G. Scott
BDFL-Delegate: Barry Warsaw
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 01-Apr-2013
Post-History:

Abstract

    The history of computing offers a progression of text I/O devices,
    most notably the Teletype machine [1]. This device had such an impact
    that we still identify our terminals by "TTY". Presently the terminal
    text I/O in Python is facilitated through sys.stdin/stdout/stderr.
    This is a proposal to facilitate mirroring the data passing across
    these three objects to another device on the system, such as a
    printer. The API to do so would be through a new builtins special
    name: __mifflin__.

Rationale

    Dumping the TTY to a printer is a natural desire for anyone who knows
    anything about computer history, and an unrecognized longing for
    everyone else.

    Though perhaps no one remembers it this way, there was a strong
    outcry at the introduction of monitors and keyboards as a replacement
    for the teletype machine. Everyone loved the teletype. The lamentable
    fact that monitors and keyboards won out is due to the efforts of the
    monitor-and-keyboard lobby, and particularly to the pathological (but
    successfully concealed) fear of teletypes of one Thomas Watson,
    Jr. [2]

    Why did people love the teletype? Hard copies are much more
    endearing. Consider how lame e-books are. A monitor will never give
    you that fresh ink smell, that textured feel of paper in your hands.
    You can't make an airplane out of a Kindle or wad up your monitor
    into a ball and throw it into a trash can.
People try to accomplish the teletype's satisfying audible response to each keystroke with fancy "retro" keyboards, but they will never be satisfied until they hear the original. Furthermore, look to the airline industry, a paragon of stability and consistency. In critical situations they mirror their TTYs to printers. This is because the hard copies are perfect for chronological review, documentation, and reliable backup. If it's good enough for them, it should be good enough for us. Usage Much as you would with __import__, you will bind builtins.__mifflin__ to a Printer object (see below). By default __mifflin__ will be set to sys.printer (see below). io.Printer and sys.printer A new type will be added to the io module called Printer. Printer may be initialized with a file-like object that exposes an underlying printer device (by default the system printer). Printer will have a write() method that takes a string and writes it out to the underlying system printer. A new attribute will be added to the sys module named "printer". sys.printer will be a Printer object, with a default of a wrapper around the system printer. References [1] http://en.wikipedia.org/wiki/Tty_(Unix) [2] https://en.wikipedia.org/wiki/Thomas_Watson,_Jr. Copyright This document has been placed in the public domain. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 From rosuav at gmail.com Mon Apr 1 21:11:35 2013 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 2 Apr 2013 06:11:35 +1100 Subject: [Python-ideas] ProtoPEP: A Standardized Marker for Mirroring the TTY to Other Devices In-Reply-To: References: Message-ID: On Tue, Apr 2, 2013 at 5:17 AM, Eric Snow wrote: > The history of computing offers a progression of text I/O devices, > most notably the Teletype machine [1]. This device had such an impact > that we still identify our terminals by "TTY". 
Presently the terminal > text I/O in Python is facilitated through sys.stdin/stdout/stderr. > This is a proposal to facilitate mirroring the data passing across > these three objects to another device on the system such as a printer. > The API to the do so would be through a new builtins special name: > __mifflin__. This would vastly improve the grokkability of Python. Currently, the most fundamental operation in programming is sadly misnamed: print("Hello, world!") When this PEP is accepted, as it most surely should be, the print function will actually send content to a printer. I have seen a number of people[1] extremely confused and even put off programming by the way in which the "print" command does not print, just as in REXX the "say" command does not produce sound. Finally this terrible lack will be cured, once and for all. [1] The fine print down the bottom reminds you that zero is a number. ChrisA From trent at snakebite.org Tue Apr 2 08:09:42 2013 From: trent at snakebite.org (Trent Nelson) Date: Tue, 2 Apr 2013 02:09:42 -0400 Subject: [Python-ideas] [Python-Dev] A bit about the GIL In-Reply-To: References: <20130401174620.GA3878@snakebite.org> Message-ID: <20130402060942.GB3753@snakebite.org> On Mon, Apr 01, 2013 at 04:29:59PM -0700, Alfredo Solano Mart?nez wrote: > > I've been making pretty good progress with my pyparallel work. See > > the initial slides here: > > > > http://speakerdeck.com/trent/parallelizing-the-python-interpreter-an-alternate-approach-to-async > > Really interesting stuff, thanks for the link. Is there a video of the > presentation available? Just the slides for now, unfortunately. > > I've since set up a separate mailing list for it here: > > > > http://lists.snakebite.net/mailman/listinfo/pyparallel/ > > > > And switched to bitbucket.org for the primary repo (I still commit > > to hg.python.org/sandbox/trent too, though): > > > > https://bitbucket.org/tpn/pyparallel > > Will take a look at all that. Any ETA for the Linux version? 
Nothing formal, no. The plan is to work out all the kinks on Windows, then step back, figure out the best way to abstract the API, then attack the POSIX implementation. (There are two aspects to the work; the parallel stuff, which is the changes to the interpreter to allow multiple threads to run CPython internals concurrently, and the async stuff, which will be heavily tied to the best IO multiplex option on the underlying platform (IOCP on AIX, event ports on Solaris, kqueue on *BSD, epoll on Linux, poll on everything else). The parallel stuff is pretty platform agnostic, which is nice. (Aside from the thread/register trick; but it appears as though most contemporary ISAs have some way of doing the same thing.)) > > You might find this interesting, too: > > > > http://hg.python.org/sandbox/trent.peps/file/6de5ed566af9/pep-async.txt > > There's a lot of nice ideas there. It reminded me of the typical MPI > workflow, with the main thread as the master process (the GIL acting > as the barrier) doing the scatter and gather to the other processes. > I really liked the part about not doing any reference counting and > just nuking everything after its done (it's the only way to be sure) You know, I re-read that PEP last night for the first time since I wrote it. I found it quite amusing -- some things are completely wacky, but quite a lot of it is pretty close to how everything is now. I had zero experience with CPython nitty-gritty internals when I wrote that, which is pretty evident from some of the things I'm suggesting. The "no refcounting and nuke everything when done" aspect has worked surprisingly well. Shared-nothing code executing in a parallel thread absolutely flies. Mallocs are basically free, frees are no-ops, no reference counting and no garbage collection; everything gets released in a single call when we're done. Factor in the heap snapshot/reset/rollback stuff in the IO loop and it's extremely cache friendly too. 
Definitely very pleased with how all of that stuff is shaping up.

> Alfredo

    Regards,

        Trent.

From ethan at stoneleaf.us Tue Apr 2 21:25:17 2013
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 02 Apr 2013 12:25:17 -0700
Subject: [Python-ideas] str-type enumerations
Message-ID: <515B309D.8070301@stoneleaf.us>

I'm not trying to beat this dead horse (really!), but back when the
enumeration discussions were going fast and furious, one of the things
asked for was enumerations of type str, to which others replied that
strings named themselves.

Well, I now have a good case where I don't want to use the string that
names itself: 'Mn$(1,6)'. I would much rather write item_code! :)

--
~Ethan~

From abarnert at yahoo.com Wed Apr 3 05:21:55 2013
From: abarnert at yahoo.com (Andrew Barnert)
Date: Tue, 2 Apr 2013 20:21:55 -0700
Subject: [Python-ideas] ProtoPEP: A Standardized Marker for Mirroring the TTY to Other Devices
In-Reply-To:
References:
Message-ID:

In AppleScript, "print" is part of the Standard Suite of methods that
every object is supposed to handle, and it actually means "print".

Sent from a random iPhone

On Apr 1, 2013, at 12:11, Chris Angelico wrote:

> On Tue, Apr 2, 2013 at 5:17 AM, Eric Snow wrote:
>> The history of computing offers a progression of text I/O devices,
>> most notably the Teletype machine [1]. This device had such an impact
>> that we still identify our terminals by "TTY". Presently the terminal
>> text I/O in Python is facilitated through sys.stdin/stdout/stderr.
>> This is a proposal to facilitate mirroring the data passing across
>> these three objects to another device on the system such as a printer.
>> The API to do so would be through a new builtins special name:
>> __mifflin__.
>
> This would vastly improve the grokkability of Python.
Currently, the > most fundamental operation in programming is sadly misnamed: > > print("Hello, world!") > > When this PEP is accepted, as it most surely should be, the print > function will actually send content to a printer. I have seen a number > of people[1] extremely confused and even put off programming by the > way in which the "print" command does not print, just as in REXX the > "say" command does not produce sound. Finally this terrible lack will > be cured, once and for all. > > [1] The fine print down the bottom reminds you that zero is a number. > > ChrisA > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From trent at snakebite.org Wed Apr 3 11:49:43 2013 From: trent at snakebite.org (Trent Nelson) Date: Wed, 3 Apr 2013 05:49:43 -0400 Subject: [Python-ideas] [Python-Dev] A bit about the GIL In-Reply-To: References: <20130401174620.GA3878@snakebite.org> <20130402060942.GB3753@snakebite.org> Message-ID: <20130403094942.GB6519@snakebite.org> On Tue, Apr 02, 2013 at 05:40:07PM -0700, Alfredo Solano Mart?nez wrote: > > (There are two aspects to the work; the parallel stuff, which is the > > changes to the interpreter to allow multiple threads to run CPython > > internals concurrently, and the async stuff, which will be heavily > > tied to the best IO multiplex option on the underlying platform > > (IOCP on AIX, event ports on Solaris, kqueue on *BSD, epoll on > > Linux, poll on everything else). The parallel stuff is pretty > > platform agnostic, which is nice. (Aside from the thread/register > > trick; but it appears as though most contemporary ISAs have some > > way of doing the same thing.)) > > That's a lot of things to do. Do you have a work breakdown structure or > are you still putting the pieces together? Work breakdown structure? 
That's far too organized ;-) I have an end goal in mind and I'm just slowly hacking my way towards it (at least for the Windows work). > > The "no refcounting and nuke everything when done" aspect has > > worked surprisingly well. Shared-nothing code executing in a > > parallel thread absolutely flies. Mallocs are basically free, > > frees are no-ops, no reference counting and no garbage > > collection; everything gets released in a single call when we're > > done. > > Glad to hear it, it's hard to make things simple. Actually, I have to > say the GPU analogy is very good, with all but the main core acting as > vector processors -and thus providing a sort of programmable pipeline > for it- while the main core becomes the CPU. I would go definitely > for that in future slides. The GPU analogy seemed like a good idea when I was writing the PEP, but the implementation has taken a slightly different path. There is far less emphasis on the notion of vectorized/SIMD-style work; in fact, I haven't implemented any of the 'parallel' type functions yet (like a parallel map/reduce, or equivalents to the parallel stuff exposed by multiprocessing). That'll be all stuff to tackle down the track. > In the case of the GPUs the copying of data from memory to card is > usually a bottleneck, is there a big hit in performance here too? Well, as the current implementation doesn't really have anything that reflects the GPU vector analogy in that draft PEP, no, not really ;-) (I should probably clarify again that the PEP I cited was hacked out in a weekend before I started a lick of coding. The requirements section is definitely useful, as it elicits the constraints I used to drive my design decisions, but all of the sections that allude to implementation details (like binding a thread to each core via thread affinity, not having access to globals, introducing new op- codes to achieve the parallel functionality) don't necessarily map to how I've implemented things now. 
Once I've finished the work on Windows I'll do an updated PEP.) Trent. From asolano at icai.es Thu Apr 4 03:01:20 2013 From: asolano at icai.es (=?UTF-8?Q?Alfredo_Solano_Mart=C3=ADnez?=) Date: Thu, 4 Apr 2013 03:01:20 +0200 Subject: [Python-ideas] [Python-Dev] A bit about the GIL In-Reply-To: <20130403094942.GB6519@snakebite.org> References: <20130401174620.GA3878@snakebite.org> <20130402060942.GB3753@snakebite.org> <20130403094942.GB6519@snakebite.org> Message-ID: > Work breakdown structure? That's far too organized ;-) I have an > end goal in mind and I'm just slowly hacking my way towards it (at > least for the Windows work). That's called pioneering! > The GPU analogy seemed like a good idea when I was writing the PEP, > but the implementation has taken a slightly different path. There > is far less emphasis on the notion of vectorized/SIMD-style work; in > fact, I haven't implemented any of the 'parallel' type functions yet > (like a parallel map/reduce, or equivalents to the parallel stuff > exposed by multiprocessing). > > That'll be all stuff to tackle down the track. My mistake, I guess I got carried away. > Well, as the current implementation doesn't really have anything > that reflects the GPU vector analogy in that draft PEP, no, not > really ;-) Point taken :) > (I should probably clarify again that the PEP I cited was hacked out > in a weekend before I started a lick of coding. The requirements > section is definitely useful, as it elicits the constraints I used > to drive my design decisions, but all of the sections that allude > to implementation details (like binding a thread to each core via > thread affinity, not having access to globals, introducing new op- > codes to achieve the parallel functionality) don't necessarily map > to how I've implemented things now. Once I've finished the work on > Windows I'll do an updated PEP.) > > Trent. Thanks for the clarification. No need to hurry, indeed. 
Alfredo

From eliben at gmail.com Thu Apr 4 05:23:48 2013
From: eliben at gmail.com (Eli Bendersky)
Date: Wed, 3 Apr 2013 20:23:48 -0700
Subject: [Python-ideas] str-type enumerations
In-Reply-To: <515B309D.8070301@stoneleaf.us>
References: <515B309D.8070301@stoneleaf.us>
Message-ID:

On Tue, Apr 2, 2013 at 12:25 PM, Ethan Furman wrote:

> I'm not trying to beat this dead horse (really!) but back when the
> enumeration discussions were going fast and furious one of the things
> asked for was enumerations of type str, to which others replied that
> strings named themselves.
>
> Well, I now have a good case where I don't want to use the string that
> names itself: 'Mn$(1,6)'. I would much rather write item_code! :)

Hi Ethan,

The latest incarnation of flufl.enum that went through the round of
discussions during PyCon allows string values in enumerations, if you
want them:

>>> from flufl.enum import Enum
>>> class WeirdNames(Enum):
...     item_code = 'Mn$(1,6)'
...     other_code = '#$%#$^'
...
>>> WeirdNames.item_code
>>> i = WeirdNames.item_code
>>> i
>>> i.value
'Mn$(1,6)'
>>> print(i)
WeirdNames.item_code
>>>

I haven't updated PEP 435 to reflect this yet, hope to do so in the next
day or two.

Eli
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From wolfgang.maier at biologie.uni-freiburg.de Thu Apr 4 12:33:31 2013
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Thu, 4 Apr 2013 10:33:31 +0000 (UTC)
Subject: [Python-ideas] zip_strict() or similar in itertools ?
Message-ID:

Dear all,
the itertools documentation has the grouper() recipe, which returns
consecutive tuples of a specified length n from an iterable. To do this,
it uses zip_longest(). While this is an elegant and fast solution, my
problem is that I sometimes don't want my tuples to be filled with a
fillvalue (which happens if len(iterable) % n != 0), but I would prefer
an error instead.
This is important, for example, when iterating over the contents of a
file and you want to make sure that it's not truncated.
I was wondering whether itertools, in addition to the built-in zip() and
zip_longest(), shouldn't provide something like zip_strict(), which would
raise an Error if its arguments aren't of equal length.
zip_strict() could then be used in an alternative grouper() recipe.

By the way, right now, I am using the following workaround for this
problem:

def iblock(iterable, bsize, strict=False):
    """Return consecutive lists of bsize items from an iterable.

    If strict is True, raises a ValueError if the size of the last block
    in iterable is smaller than bsize. If strict is False, it returns the
    truncated list instead."""

    it = iter(iterable)
    i = [it]*(bsize-1)
    while True:
        try:
            result = [next(it)]
        except StopIteration:
            # iterator exhausted, end the generator
            break
        for e in i:
            try:
                result.append(next(e))
            except StopIteration:
                # iterator exhausted after returning at least one item,
                # but before returning bsize items
                if strict:
                    raise ValueError("only %d value(s) left in iterator, "
                                     "expected %d" % (len(result), bsize))
                else:
                    pass
        yield result

, which works well, but is about 3-4 times slower than the grouper()
recipe. If you have alternative, faster solutions that I wasn't thinking
of, I'd be very interested to hear about them.

Best,
Wolfgang

From wolfgang.maier at biologie.uni-freiburg.de Thu Apr 4 12:42:21 2013
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Thu, 4 Apr 2013 10:42:21 +0000 (UTC)
Subject: [Python-ideas] zip_strict() or similar in itertools ?
References:
Message-ID:

Wolfgang Maier writes:

> , which works well, but is about 3-4 times slower than the grouper()
> recipe. If you have alternative, faster solutions that I wasn't thinking
> of, I'd be very interested to hear about them.
>
> Best,
> Wolfgang

ok, I wasn't remembering the timing results correctly: it's about 8 times
slower than grouper.

From asolano at icai.es Thu Apr 4 13:49:54 2013
From: asolano at icai.es (Alfredo Solano Martínez)
Date: Thu, 4 Apr 2013 13:49:54 +0200
Subject: [Python-ideas] zip_strict() or similar in itertools ?
In-Reply-To:
References:
Message-ID:

Hi,

Have you tried using a marker as fill value and then looking for it to
raise the exception? The membership operator is quite decent, IIRC.

Alfredo

On Thu, Apr 4, 2013 at 12:42 PM, Wolfgang Maier wrote:
> Wolfgang Maier writes:
>
>> , which works well, but is about 3-4 times slower than the grouper()
>> recipe. If you have alternative, faster solutions that I wasn't
>> thinking of, I'd be very interested to hear about them.
>>
>> Best,
>> Wolfgang
>
> ok, I wasn't remembering the timing results correctly: it's about 8
> times slower than grouper.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

From wolfgang.maier at biologie.uni-freiburg.de Thu Apr 4 14:04:42 2013
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Thu, 4 Apr 2013 12:04:42 +0000 (UTC)
Subject: [Python-ideas] zip_strict() or similar in itertools ?
References:
Message-ID:

Alfredo Solano Martínez writes:
>
> Hi,
>
> Have you tried using a marker as fill value and then look for it to
> raise the exception? The membership operator is quite decent, IIRC.
>
> Alfredo

Sure, that would be the alternative, but it's not a very general solution,
since you would have to figure out a fill marker that can never be part of
the specific iterable. What's worse is that you're retrieving several
elements per iteration, and those different elements may have different
properties requiring different markers.
For example, in a file every first line might be an arbitrary string,
every second a number, every third could optionally be blank, and so on.
So I guess catching the problem early, and raising an error right then,
is a simpler and clearer solution.

Wolfgang

From __peter__ at web.de Thu Apr 4 14:24:54 2013
From: __peter__ at web.de (Peter Otten)
Date: Thu, 04 Apr 2013 14:24:54 +0200
Subject: [Python-ideas] zip_strict() or similar in itertools ?
References:
Message-ID:

Wolfgang Maier wrote:

> Dear all,
> the itertools documentation has the grouper() recipe, which returns
> consecutive tuples of a specified length n from an iterable. To do this,
> it uses zip_longest(). While this is an elegant and fast solution, my
> problem is that I sometimes don't want my tuples to be filled with a
> fillvalue (which happens if len(iterable) % n != 0), but I would prefer
> an error instead. This is important, for example, when iterating over
> the contents of a file and you want to make sure that it's not truncated.
> I was wondering whether itertools, in addition to the built-in zip() and
> zip_longest(), shouldn't provide something like zip_strict(), which would
> raise an Error, if its arguments aren't of equal length.
> zip_strict() could then be used in an alternative grouper() recipe.
>
> By the way, right now, I am using the following workaround for this
> problem:
>
> def iblock(iterable, bsize, strict=False):
>     """Return consecutive lists of bsize items from an iterable.
>
>     If strict is True, raises a ValueError if the size of the last block
>     in iterable is smaller than bsize.
>     If strict is False, it returns the
>     truncated list instead."""
>
>     it = iter(iterable)
>     i = [it]*(bsize-1)
>     while True:
>         try:
>             result = [next(it)]
>         except StopIteration:
>             # iterator exhausted, end the generator
>             break
>         for e in i:
>             try:
>                 result.append(next(e))
>             except StopIteration:
>                 # iterator exhausted after returning at least one item,
>                 # but before returning bsize items
>                 if strict:
>                     raise ValueError("only %d value(s) left in iterator, "
>                                      "expected %d" % (len(result), bsize))
>                 else:
>                     pass
>         yield result
>
> , which works well, but is about 3-4 times slower than the grouper()
> recipe. If you have alternative, faster solutions that I wasn't thinking
> of, I'd be very interested to hear about them.
>
> Best,
> Wolfgang

A simple approach is

def strict_grouper(items, size, strict):
    fillvalue = object()
    args = [iter(items)]*size
    chunks = zip_longest(*args, fillvalue=fillvalue)
    prev = next(chunks)
    for chunk in chunks:
        yield prev
        prev = chunk
    if prev[-1] is fillvalue:
        if strict:
            raise ValueError
        else:
            prev = prev[:prev.index(fillvalue)]
    yield prev

If that's fast enough it might be a candidate for the recipes section.

A partial solution I wrote a while ago is
http://code.activestate.com/recipes/497006-zip_exc-a-lazy-zip-that-ensures-that-all-iterables/

From __peter__ at web.de Thu Apr 4 14:32:28 2013
From: __peter__ at web.de (Peter Otten)
Date: Thu, 04 Apr 2013 14:32:28 +0200
Subject: [Python-ideas] zip_strict() or similar in itertools ?
In-Reply-To:
References:
Message-ID:

> Sure, that would be the alternative, but it's not a very general solution,
> since you would have to figure out a fill marker that can never be part
> of the specific iterable. What's worse is that you're retrieving several
> elements per iteration, and those different elements may have different
> properties requiring different markers. For example, in a file every
> first line might be an arbitrary string, every second a number, every
> third could optionally be blank, and so on. So I guess catching the
> problem early, and raising an error right then, is a simpler and clearer
> solution.
>
> Wolfgang

Indeed, the question is still open; I was talking about the speed penalty
of your interim solution. About the selection of a marker, what about a
custom class?

# None of your data will be this
class Marker(): pass

# Same as the docs recipes
def grouper(n, iterable, fillvalue=None):
    args = [iter(iterable)] * n
    return itertools.zip_longest(*args, fillvalue=fillvalue)

# And then do something like
for t in grouper(3, 'ABCDEFG', Marker):
    if Marker in t: print('Marker')  # or raise ValueError, ...

Alfredo

From wolfgang.maier at biologie.uni-freiburg.de Thu Apr 4 14:49:18 2013
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Thu, 4 Apr 2013 12:49:18 +0000 (UTC)
Subject: [Python-ideas] zip_strict() or similar in itertools ?
References:
Message-ID:

Peter Otten <__peter__ at ...> writes:
>
> Peter Otten wrote:
>
> >     prev = prev[:prev.index(fillvalue)]
>
> To be bullet-proof that needs to check object identity instead of
> equality:
>
>     while prev[-1] is fillvalue:
>         prev = prev[:-1]

That's a clever way!! Thanks, I'll try that.
Wolfgang

From wolfgang.maier at biologie.uni-freiburg.de  Thu Apr  4 15:07:28 2013
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Thu, 4 Apr 2013 13:07:28 +0000 (UTC)
Subject: [Python-ideas] =?utf-8?q?zip=5Fstrict=28=29_or_similar_in_itertoo?=
	=?utf-8?q?ls_=3F?=
References:
Message-ID:

Alfredo Solano Mart?nez writes:

> # None of your data will be this
> class Marker(): pass
>
> # Same as the docs recipes
> def grouper(n, iterable, fillvalue=None):
>     args = [iter(iterable)] * n
>     return itertools.zip_longest(*args, fillvalue=fillvalue)
>
> # And then do something like
> for t in grouper(3, 'ABCDEFG', Marker):
>     if Marker in t: print('Marker')  # or raise ValueError, ...
>
> Alfredo
>

Thanks for sharing this! It's the same basic idea as in Peter's
strict_grouper solution, which integrates the whole thing in one function.

Wolfgang

From wolfgang.maier at biologie.uni-freiburg.de  Thu Apr  4 15:15:55 2013
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Thu, 4 Apr 2013 13:15:55 +0000 (UTC)
Subject: [Python-ideas] =?utf-8?q?zip=5Fstrict=28=29_or_similar_in_itertoo?=
	=?utf-8?q?ls_=3F?=
References:
Message-ID:

Wolfgang Maier writes:

Turns out that Peter's solution (using a class instance as the marker, and
managing to get away with a test for it only once after exhaustion of the
iterator) is impressively fast indeed:

def strict_grouper(items, size, strict):
    fillvalue = object()
    args = [iter(items)]*size
    chunks = zip_longest(*args, fillvalue=fillvalue)
    prev = next(chunks)
    for chunk in chunks:
        yield prev
        prev = chunk
    if prev[-1] is fillvalue:
        if strict:
            raise ValueError
        else:
            while prev[-1] is fillvalue:
                prev = prev[:-1]
    yield prev

beats my old, clumsy approach by a speed factor of ~5, i.e., it's less than
a factor 2 slower than the grouper() recipe, but raises the error I wanted!
Certainly good enough for me, and, yes, I think it would make a nice
itertools recipe.
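For readers trying this at home, the recipe's behaviour can be checked directly with a quick sketch:

```python
from itertools import zip_longest

def strict_grouper(items, size, strict):
    # Same recipe as discussed in the thread.
    fillvalue = object()
    args = [iter(items)] * size
    chunks = zip_longest(*args, fillvalue=fillvalue)
    prev = next(chunks)
    for chunk in chunks:
        yield prev
        prev = chunk
    if prev[-1] is fillvalue:
        if strict:
            raise ValueError
        while prev[-1] is fillvalue:
            prev = prev[:-1]
    yield prev

# Non-strict: the short final group is truncated.
print(list(strict_grouper('ABCDEFG', 3, strict=False)))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G',)]

# Strict: a short final group raises ValueError.
try:
    list(strict_grouper('ABCDEFG', 3, strict=True))
except ValueError:
    print('ValueError raised for the short final group')
```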
Thanks for your help, Wolfgang From eliben at gmail.com Thu Apr 4 15:56:54 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 4 Apr 2013 06:56:54 -0700 Subject: [Python-ideas] str-type enumerations In-Reply-To: References: <515B309D.8070301@stoneleaf.us> Message-ID: On Wed, Apr 3, 2013 at 8:23 PM, Eli Bendersky wrote: > On Tue, Apr 2, 2013 at 12:25 PM, Ethan Furman wrote: > >> I'm not trying to beat this dead horse (really!) but back when the >> enumeration discussions were going fast and furious one of the things asked >> for was enumerations of type str, to which others replied that strings >> named themselves. >> >> Well, I now have a good case where I don't want to use the string that >> names itself: 'Mn$(1,6)' I would match rather write item_code! :) >> > > Hi Ethan, > > The latest incarnation of flufl.enum that went through the round of > discussions during PyCon allows string values in enumerations, if you want > them: > > >>> from flufl.enum import Enum > >>> class WeirdNames(Enum): > ... item_code = 'Mn$(1,6)' > ... other_code = '#$%#$^' > ... > >>> WeirdNames.item_code > > >>> i = WeirdNames.item_code > >>> i > > >>> i.value > 'Mn$(1,6)' > >>> print(i) > WeirdNames.item_code > >>> > > I haven't updated PEP 435 to reflect this yet, hope to do so in the next > day or two. > The PEP is up-to-date now, and mentions string-valued enums as well. Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Thu Apr 4 17:11:52 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 04 Apr 2013 08:11:52 -0700 Subject: [Python-ideas] str-type enumerations In-Reply-To: References: <515B309D.8070301@stoneleaf.us> Message-ID: <515D9838.90809@stoneleaf.us> On 04/04/2013 06:56 AM, Eli Bendersky wrote: > On Wed, Apr 3, 2013 at 8:23 PM, Eli Bendersky wrote: >> On Tue, Apr 2, 2013 at 12:25 PM, Ethan Furman wrote: >> >>> I'm not trying to beat this dead horse (really!) 
but back when the enumeration discussions were going fast and
>>> furious one of the things asked for was enumerations of type str, to
>>> which others replied that strings named themselves.
>>>
>>> Well, I now have a good case where I don't want to use the string that
>>> names itself: 'Mn$(1,6)' I would much rather write item_code! :)
>>
>> The latest incarnation of flufl.enum that went through the round of
>> discussions during PyCon allows string values in enumerations, if you
>> want them:
>>
>> >>> from flufl.enum import Enum
>> >>> class WeirdNames(Enum):
>> ...     item_code = 'Mn$(1,6)'
>> ...     other_code = '#$%#$^'
>> ...
>> >>> WeirdNames.item_code
>>
>> >>> i = WeirdNames.item_code
>> >>> i
>>
>> >>> i.value
>> 'Mn$(1,6)'
>> >>> print(i)
>> WeirdNames.item_code
>>
>> I haven't updated PEP 435 to reflect this yet, hope to do so in the next
>> day or two.
>
> The PEP is up-to-date now, and mentions string-valued enums as well.

Wow -- looks like flufl.enum has come a long way! Cool.

My use case for the str enum is to use it as a dict key for a custom
mapping to a Business Basic file; this means that the str value will be
pulled apart and dissected to see exactly where in a fixed-length field it
needs to pull data from (in the example above it would be the first six
characters as BB is 1-based).

Will a str-based enum handle that, or will the custom mapping have to be
updated to check for a str or an enum, and if an enum use the .value?
-- ~Ethan~ From eliben at gmail.com Thu Apr 4 18:06:13 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 4 Apr 2013 09:06:13 -0700 Subject: [Python-ideas] str-type enumerations In-Reply-To: <515D9838.90809@stoneleaf.us> References: <515B309D.8070301@stoneleaf.us> <515D9838.90809@stoneleaf.us> Message-ID: On Thu, Apr 4, 2013 at 8:11 AM, Ethan Furman wrote: > On 04/04/2013 06:56 AM, Eli Bendersky wrote: > >> On Wed, Apr 3, 2013 at 8:23 PM, Eli Bendersky wrote: >> >>> On Tue, Apr 2, 2013 at 12:25 PM, Ethan Furman wrote: >>> >>> I'm not trying to beat this dead horse (really!) but back when the >>>> enumeration discussions were going fast and >>>> furious one of the things asked for was enumerations of type str, to >>>> which others replied that strings named >>>> themselves. >>>> >>>> Well, I now have a good case where I don't want to use the string that >>>> names itself: 'Mn$(1,6)' I would match >>>> rather write item_code! :) >>>> >>> >>> The latest incarnation of flufl.enum that went through the round of >>> discussions during PyCon allows string values in >>> enumerations, if you want them: >>> >>> >>> from flufl.enum import Enum >>> >>> class WeirdNames(Enum): >>> ... item_code = 'Mn$(1,6)' >>> ... other_code = '#$%#$^' >>> ... >>> >>> WeirdNames.item_code >>> >>> >>> i = WeirdNames.item_code >>> >>> i >>> >>> >>> i.value >>> 'Mn$(1,6)' >>> >>> print(i) >>> WeirdNames.item_code >>> >>> I haven't updated PEP 435 to reflect this yet, hope to do so in the next >>> day or two. >>> >> >> The PEP is up-to-date now, and mentions string-valued enums as well. >> > > Wow -- looks like flufl.enum has come a long way! Cool. > > My use case for the str enum is to use it as a dict key for a custom > mapping to a Business Basic file; this means that the str value will be > pulled apart and disected to see exactly where in a fixed-length field it > needs to pull data from (in the example above it would be the first six > characters as BB is 1-based). 
> > Will a str-based enum handle that, or will the custom mapping have to be > updated to check for a str or an enum, and if an enum use the .value? I'm not entirely sure what you mean here, Ethan. What I do know is that enumeration values are hashable, so they can be used as keys in dictionaries. Actually, __hash__ is object.__hash__ for enum values. You can look at the full code here: http://bazaar.launchpad.net/~barry/flufl.enum/trunk/view/head:/flufl/enum/_enum.py Eli -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Thu Apr 4 18:23:34 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 04 Apr 2013 09:23:34 -0700 Subject: [Python-ideas] str-type enumerations In-Reply-To: References: <515B309D.8070301@stoneleaf.us> <515D9838.90809@stoneleaf.us> Message-ID: <515DA906.9070509@stoneleaf.us> On 04/04/2013 09:06 AM, Eli Bendersky wrote: > > On Thu, Apr 4, 2013 at 8:11 AM, Ethan Furman wrote: > > On 04/04/2013 06:56 AM, Eli Bendersky wrote: > > On Wed, Apr 3, 2013 at 8:23 PM, Eli Bendersky wrote: > > On Tue, Apr 2, 2013 at 12:25 PM, Ethan Furman wrote: > > I'm not trying to beat this dead horse (really!) but back when the enumeration discussions were going > fast and > furious one of the things asked for was enumerations of type str, to which others replied that strings named > themselves. > > Well, I now have a good case where I don't want to use the string that names itself: 'Mn$(1,6)' I would > match > rather write item_code! :) > > > The latest incarnation of flufl.enum that went through the round of discussions during PyCon allows string > values in > enumerations, if you want them: > > >>> from flufl.enum import Enum > >>> class WeirdNames(Enum): > ... item_code = 'Mn$(1,6)' > ... other_code = '#$%#$^' > ... 
> >>> WeirdNames.item_code
>
> >>> i = WeirdNames.item_code
> >>> i
>
> >>> i.value
> 'Mn$(1,6)'
> >>> print(i)
> WeirdNames.item_code
>
> I haven't updated PEP 435 to reflect this yet, hope to do so in the next
> day or two.
>
> The PEP is up-to-date now, and mentions string-valued enums as well.
>
> Wow -- looks like flufl.enum has come a long way! Cool.
>
> My use case for the str enum is to use it as a dict key for a custom
> mapping to a Business Basic file; this means that the str value will be
> pulled apart and dissected to see exactly where in a fixed-length field
> it needs to pull data from (in the example above it would be the first
> six characters as BB is 1-based).
>
> Will a str-based enum handle that, or will the custom mapping have to be
> updated to check for a str or an enum, and if an enum use the .value?
>
> I'm not entirely sure what you mean here, Ethan. What I do know is that
> enumeration values are hashable, so they can be used as keys in
> dictionaries.

In the example above 'Mn$' is the field, and '(1,6)' are the first six
characters in the field. So the custom mapping has to parse the key passed
to it in order to return the proper value; it looks something like this:

def __getitem__(self, key):
    real_key = key[:3]
    field = self.dict[real_key]
    first, length = key[3:][1:-1].split(',')
    first = first - 1
    last = first + length
    data = field[first:last]
    return data

If key is a str-based enum, will this work?
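As written, the sketch above won't quite run: str.split() returns strings, so first and length need converting before the arithmetic. A self-contained version — the BBMapping class name and the record contents are invented for the illustration — might look like:

```python
class BBMapping:
    """Hypothetical stand-in for the custom mapping; parses keys like 'Mn$(1,6)'."""

    def __init__(self, fields):
        self.dict = fields

    def __getitem__(self, key):
        real_key = key[:3]              # field name, e.g. 'Mn$'
        field = self.dict[real_key]
        first, length = (int(n) for n in key[3:][1:-1].split(','))
        first = first - 1               # BB positions are 1-based
        last = first + length
        return field[first:last]

record = BBMapping({'Mn$': 'WIDGET-42-BLUE'})
print(record['Mn$(1,6)'])  # WIDGET
```

With a flufl.enum member as the key, one would presumably index with key.value (a plain str) rather than the member itself.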
-- ~Ethan~ From random832 at fastmail.us Thu Apr 4 19:14:24 2013 From: random832 at fastmail.us (random832 at fastmail.us) Date: Thu, 04 Apr 2013 13:14:24 -0400 Subject: [Python-ideas] str-type enumerations In-Reply-To: <515DA906.9070509@stoneleaf.us> References: <515B309D.8070301@stoneleaf.us> <515D9838.90809@stoneleaf.us> <515DA906.9070509@stoneleaf.us> Message-ID: <1365095664.4762.140661213395081.045D8620@webmail.messagingengine.com> On Thu, Apr 4, 2013, at 12:23, Ethan Furman wrote: > In the example above 'Mn$' is the field, and '(1,6)' are the first six > charecters in the field. So the custom mapping > has to parse the key passed to in in order to return the proper value; it > looks something like this: Why is this a string instead of a tuple ('Mn$',1,6) or a class specifically designed for this? From eliben at gmail.com Thu Apr 4 19:22:35 2013 From: eliben at gmail.com (Eli Bendersky) Date: Thu, 4 Apr 2013 10:22:35 -0700 Subject: [Python-ideas] str-type enumerations In-Reply-To: <515DA906.9070509@stoneleaf.us> References: <515B309D.8070301@stoneleaf.us> <515D9838.90809@stoneleaf.us> <515DA906.9070509@stoneleaf.us> Message-ID: On Thu, Apr 4, 2013 at 9:23 AM, Ethan Furman wrote: > On 04/04/2013 09:06 AM, Eli Bendersky wrote: > > >> On Thu, Apr 4, 2013 at 8:11 AM, Ethan Furman wrote: >> >> On 04/04/2013 06:56 AM, Eli Bendersky wrote: >> >> On Wed, Apr 3, 2013 at 8:23 PM, Eli Bendersky wrote: >> >> On Tue, Apr 2, 2013 at 12:25 PM, Ethan Furman wrote: >> >> I'm not trying to beat this dead horse (really!) but back >> when the enumeration discussions were going >> fast and >> furious one of the things asked for was enumerations of >> type str, to which others replied that strings named >> themselves. >> >> Well, I now have a good case where I don't want to use >> the string that names itself: 'Mn$(1,6)' I would >> match >> rather write item_code! 
:) >> >> >> The latest incarnation of flufl.enum that went through the >> round of discussions during PyCon allows string >> values in >> enumerations, if you want them: >> >> >>> from flufl.enum import Enum >> >>> class WeirdNames(Enum): >> ... item_code = 'Mn$(1,6)' >> ... other_code = '#$%#$^' >> ... >> >>> WeirdNames.item_code >> >> >>> i = WeirdNames.item_code >> >>> i >> >> >>> i.value >> 'Mn$(1,6)' >> >>> print(i) >> WeirdNames.item_code >> >> I haven't updated PEP 435 to reflect this yet, hope to do so >> in the next day or two. >> >> >> The PEP is up-to-date now, and mentions string-valued enums as >> well. >> >> >> Wow -- looks like flufl.enum has come a long way! Cool. >> >> My use case for the str enum is to use it as a dict key for a custom >> mapping to a Business Basic file; this means >> that the str value will be pulled apart and disected to see exactly >> where in a fixed-length field it needs to pull >> data from (in the example above it would be the first six characters >> as BB is 1-based). >> >> Will a str-based enum handle that, or will the custom mapping have to >> be updated to check for a str or an enum, and >> if an enum use the .value? >> >> >> I'm not entirely sure what you mean here, Ethan. What I do know is that >> enumeration values are hashable, so they can be >> used as keys in dictionaries. >> > > In the example above 'Mn$' is the field, and '(1,6)' are the first six > charecters in the field. So the custom mapping has to parse the key passed > to in in order to return the proper value; it looks something like this: > > def __getitem__(self, key): > real_key = key[:3] > field = self.dict[real_key] > first, length = key[3:][1:-1].split(',') > first = first - 1 > last = first + length > data = field[first:last] > return data > > If key is a str-based enum, will this work? No. Enum allows having string values, that's it. The object itself is a EnumValue, though, it's not "isinstance(str)". 
You can easily use "key.value" there, though, as key.value *is* the actual
value assigned to the enum value.

That's a strange use case, though :-)

Eli
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org  Thu Apr  4 19:24:34 2013
From: guido at python.org (Guido van Rossum)
Date: Thu, 4 Apr 2013 10:24:34 -0700
Subject: [Python-ideas] [Python-Dev] relative import circular problem
In-Reply-To:
References:
Message-ID:

Redirecting to python-ideas.

On Thu, Apr 4, 2013 at 9:26 AM, Richard Oudkerk wrote:
> On 04/04/2013 4:17pm, Guido van Rossum wrote:
>>
>> I don't really see what we could change to avoid breaking code in any
>> particular case -- the burden is up to the library to do it right. I
>> don't see a reason to forbid any of this either.
>
> How about having a form of relative import which only works for submodules.
> For instance, instead of
>
>     from . import moduleX
>
> write
>
>     import .moduleX
>
> which is currently a SyntaxError. I think this could be implemented as
>
>     moduleX = importlib.import_module('.moduleX', __package__)

We considered that when relative import was designed and rejected it,
because it violates the expectation that after "import <module>"
you can use exactly "<module>" in your code.
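Richard's proposed desugaring can in fact be exercised today with importlib; a throwaway sketch that builds a scratch package on disk (the pkg/moduleX names and the scaffolding are made up for the demonstration):

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk: pkg/__init__.py and pkg/moduleX.py
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'pkg')
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, '__init__.py'), 'w').close()
with open(os.path.join(pkg_dir, 'moduleX.py'), 'w') as f:
    f.write('VALUE = 42\n')

sys.path.insert(0, root)
importlib.invalidate_caches()  # the package was created after startup

# The moral equivalent of "from . import moduleX" executed inside 'pkg':
moduleX = importlib.import_module('.moduleX', 'pkg')
print(moduleX.VALUE)  # 42
```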
-- --Guido van Rossum (python.org/~guido) From ethan at stoneleaf.us Thu Apr 4 19:51:46 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 04 Apr 2013 10:51:46 -0700 Subject: [Python-ideas] str-type enumerations In-Reply-To: <1365095664.4762.140661213395081.045D8620@webmail.messagingengine.com> References: <515B309D.8070301@stoneleaf.us> <515D9838.90809@stoneleaf.us> <515DA906.9070509@stoneleaf.us> <1365095664.4762.140661213395081.045D8620@webmail.messagingengine.com> Message-ID: <515DBDB2.8080109@stoneleaf.us> On 04/04/2013 10:14 AM, random832 at fastmail.us wrote: > On Thu, Apr 4, 2013, at 12:23, Ethan Furman wrote: >> In the example above 'Mn$' is the field, and '(1,6)' are the first six >> charecters in the field. So the custom mapping >> has to parse the key passed to in in order to return the proper value; it >> looks something like this: > > Why is this a string instead of a tuple ('Mn$',1,6) or a class > specifically designed for this? A tuple would be no less painful to write, and it /is/ in a class specifically designed for this: the custom mapping object. -- ~Ethan~ From ethan at stoneleaf.us Thu Apr 4 21:52:40 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 04 Apr 2013 12:52:40 -0700 Subject: [Python-ideas] str-type enumerations In-Reply-To: References: <515B309D.8070301@stoneleaf.us> <515D9838.90809@stoneleaf.us> <515DA906.9070509@stoneleaf.us> Message-ID: <515DDA08.80001@stoneleaf.us> On 04/04/2013 10:22 AM, Eli Bendersky wrote: > > On Thu, Apr 4, 2013 at 9:23 AM, Ethan Furman wrote: > > In the example above 'Mn$' is the field, and '(1,6)' are the first six characters in the field. 
So the custom > mapping has to parse the key passed to in in order to return the proper value; it looks something like this: > > def __getitem__(self, key): > real_key = key[:3] > field = self.dict[real_key] > first, length = key[3:][1:-1].split(',') > first = first - 1 > last = first + length > data = field[first:last] > return data > > If key is a str-based enum, will this work? > > No. Enum allows having string values, that's it. The object itself is a EnumValue, though, it's not "isinstance(str)". > > That's a strange use case, though :-) Yeah, well, Python is a glue language, and sometimes glue is messy. ;) -- ~Ethan~ From p at google-groups-2013.dobrogost.net Thu Apr 4 23:57:38 2013 From: p at google-groups-2013.dobrogost.net (Piotr Dobrogost) Date: Thu, 4 Apr 2013 23:57:38 +0200 Subject: [Python-ideas] Reviving PEP 3140 - "str(container) should call str(item), not repr(item)" Message-ID: Hi! Having str(container) calling str(item) and not repr(item) sounds like the right thing to do. However, PEP 3140 was rejected on the basis of the following statement of Guido: "Let me just save everyone a lot of time and say that I'm opposed to this change, and that I believe that it would cause way too much disturbance to be accepted this close to beta." Thu, 29 May 2008 12:32:04 -0700( http://www.mail-archive.com/python-3000 at python.org/msg13686.html) Does anyone know what's the reason Guido was opposed to this change? Is there any chance to revive this PEP? Regards, Piotr Dobrogost -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Fri Apr 5 00:36:33 2013 From: guido at python.org (Guido van Rossum) Date: Thu, 4 Apr 2013 15:36:33 -0700 Subject: [Python-ideas] Reviving PEP 3140 - "str(container) should call str(item), not repr(item)" In-Reply-To: References: Message-ID: Lots of reasons. E.g. would you really like this outcome? 
>>> a = 'foo, bar'
>>> b = [a]
>>> print(b)
[foo, bar]
>>>

Plus of course there really would be tons of backwards compatibility
issues.

On Thu, Apr 4, 2013 at 2:57 PM, Piotr Dobrogost wrote:
> Hi!
>
> Having str(container) calling str(item) and not repr(item) sounds like the
> right thing to do. However, PEP 3140 was rejected on the basis of the
> following statement of Guido:
>
> "Let me just save everyone a lot of time and say that I'm opposed to
> this change, and that I believe that it would cause way too much
> disturbance to be accepted this close to beta."
>
> Thu, 29 May 2008 12:32:04 -0700
> (http://www.mail-archive.com/python-3000 at python.org/msg13686.html)
>
> Does anyone know what's the reason Guido was opposed to this change?
> Is there any chance to revive this PEP?
>
> Regards,
> Piotr Dobrogost
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

--
--Guido van Rossum (python.org/~guido)

From ned at nedbatchelder.com  Fri Apr  5 02:26:59 2013
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Thu, 04 Apr 2013 20:26:59 -0400
Subject: [Python-ideas] Reviving PEP 3140 - "str(container) should call
	str(item), not repr(item)"
In-Reply-To:
References:
Message-ID: <515E1A53.8070801@nedbatchelder.com>

On 4/4/2013 5:57 PM, Piotr Dobrogost wrote:
> Hi!
>
> Having str(container) calling str(item) and not repr(item) sounds like
> the right thing to do. However, PEP 3140 was rejected on the basis of
> the following statement of Guido:
>
> "Let me just save everyone a lot of time and say that I'm opposed to
> this change, and that I believe that it would cause way too much
> disturbance to be accepted this close to beta."
>
> Thu, 29 May 2008 12:32:04 -0700
> (http://www.mail-archive.com/python-3000 at python.org/msg13686.html)
>
> Does anyone know what's the reason Guido was opposed to this change?
> Is there any chance to revive this PEP?
>
repr() is for geeks, str() is for civilians. Since str(list) prints square
brackets and commas, it's for geeks anyway, so it prints the repr() of its
contents.
If you want a for-civilian output of a list, you have to construct it
yourself.

--Ned.

> Regards,
> Piotr Dobrogost
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From phong at phong.org  Fri Apr  5 03:21:11 2013
From: phong at phong.org (Tom Schumm)
Date: Thu, 04 Apr 2013 21:21:11 -0400
Subject: [Python-ideas] An iterable version of find/index for strings?
Message-ID: <7653229.S9PYj1kSdO@orcus>

Should Python strings (and byte arrays, and other iterables for that matter)
have an iterator form of find/rfind (or index/rindex)? I've found myself
wanting one on occasion, and having more iterable things seems to be the
direction the language is moving.

Currently, looping over the instances of a substring in a larger string is a
bit awkward. You have to keep track of where you are, and you either have
to watch for the -1 sentinel value or catch the ValueError. A "for idx in ..."
construction would just be cleaner. You could use re.finditer, but a string
method seems more lightweight/efficient/obvious.

The best name I can think of would be "finditer()" like re.finditer(). Using
"ifind" (like izip) would be confusing, because it could be mistaken for
case-insensitive find. I thought of "iterfind" like the old dict.iteritems,
and ElementTree.iterfind but "iterrfind" (iterable rfind) is unattractive.
I also think "find" is a more obvious verb than "index".

I've got a simple Python implementation on gist:
https://gist.github.com/fwiffo/5233377

It includes an option to include overlapping instances, which may not be
necessary (it's not present in e.g. re.finditer).

I could imagine it as a method on str/unicode/bytes/list/tuple objects, or
maybe as a function in itertools.
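A minimal generator along the lines described — non-overlapping by default, built on str.find — might look like this (a sketch only, not the gist implementation; rejecting an empty substring is a design choice, since it would otherwise loop forever):

```python
def finditer(haystack, needle, overlapping=False):
    """Yield each index at which needle occurs in haystack."""
    if not needle:
        raise ValueError('empty substring')  # would otherwise loop forever
    step = 1 if overlapping else len(needle)
    i = haystack.find(needle)
    while i != -1:
        yield i
        i = haystack.find(needle, i + step)

print(list(finditer('banana', 'an')))                  # [1, 3]
print(list(finditer('aaaa', 'aa')))                    # [0, 2]
print(list(finditer('aaaa', 'aa', overlapping=True)))  # [0, 1, 2]
```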
-- Tom Schumm http://www.fwiffo.com/ From python at mrabarnett.plus.com Fri Apr 5 04:37:05 2013 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 05 Apr 2013 03:37:05 +0100 Subject: [Python-ideas] An iterable version of find/index for strings? In-Reply-To: <7653229.S9PYj1kSdO@orcus> References: <7653229.S9PYj1kSdO@orcus> Message-ID: <515E38D1.9010609@mrabarnett.plus.com> On 05/04/2013 02:21, Tom Schumm wrote: > Should Python strings (and byte arrays, and other iterables for that matter) > have an iterator form of find/rfind (or index/rindex)? I've found myself > wanting one on occasion, and having more iterable things seems to be the > direction the language is moving. > > Currently, looping over the instances of a substring in a larger string is a > bit awkward. You have to keep track of where you are, and you either have have > to watch for the -1 sentinel value or catch the ValueError. A "for idx in ..." > construction would just be cleaner. You could use re.finditer, but a string > method seems a more lightweight/efficient/obvious. > > The best name I can think of would be "finditer()" like re.finditer(). Using > "ifind" (like izip) would be confusing, because it could be mistaken for case- > insensitive find. I thought of "iterfind" like the old dict.iteritems, and > ElementTree.iterfind but "iterrfind" (iterable rfind) is unattractive. I also > think "find" is a more obvious verb than "index". > As you say, there's iteritems in Python 2. The os module has listdir, which returns a list; it has been suggested that an (non-list) iterable version should be added, and the obvious name in that case would be iterdir. The trend appears to be towards iterfind. > I've got a simple Python implementation on gist: > https://gist.github.com/fwiffo/5233377 > > It includes an option to include overlapping instences, which may not be > necessary (it's not present in e.g. re.finditer). > ... but it _is_ present in the regex module! 
:-) > I could imagine it as a method on str/unicode/bytes/list/tuple objects, or > maybe as a function in itertools. > An alternative would be to write a generator for it. You say "on occasion", but is that often enough to justify adding it to the language? From dreamingforward at gmail.com Fri Apr 5 05:08:04 2013 From: dreamingforward at gmail.com (Mark Janssen) Date: Thu, 4 Apr 2013 20:08:04 -0700 Subject: [Python-ideas] Reviving PEP 3140 - "str(container) should call str(item), not repr(item)" In-Reply-To: References: Message-ID: > Having str(container) calling str(item) and not repr(item) sounds like the > right thing to do. However, PEP 3140 was rejected on the basis of the > following statement of Guido: Strangely, this argument seems to get to a issue found in LISP regarding quoting a thing vs. the thing itself. Python doesn't recognize really that distinction in its object model -- though it does have a way to go the *other* direction: type(thing). I think Guido's intuition is correct -- there is no logically "correct" way to do this. The issue makes me think of a __deepstr__() method or something that one could implement if one wanted to print the contents of a container, but this doesn't really work well either. This is a problem with "everything is an object" model: no recognition is made between a distinction that is very high up the "object taxonomy": container vs. atomic elements. FWIW, MarkJ Tacoma, Wash From phong at phong.org Fri Apr 5 05:09:30 2013 From: phong at phong.org (Tom Schumm) Date: Thu, 04 Apr 2013 23:09:30 -0400 Subject: [Python-ideas] An iterable version of find/index for strings? In-Reply-To: <515E38D1.9010609@mrabarnett.plus.com> References: <7653229.S9PYj1kSdO@orcus> <515E38D1.9010609@mrabarnett.plus.com> Message-ID: <4670137.JVTsvP12m9@orcus> On Friday, April 05, 2013 03:37:05 AM MRAB wrote: > As you say, there's iteritems in Python 2. 
The os module has listdir, > which returns a list; it has been suggested that an (non-list) iterable > version should be added, and the obvious name in that case would be > iterdir. The trend appears to be towards iterfind. I agree that consistency would be best, and I'm not precious about the name. :) > You say "on occasion", but is that often enough to justify adding it to > the language? And that's why I ask; does anybody else want it? I've used it a few times, but there are some string methods I've never used even once. -- Tom Schumm http://www.fwiffo.com/ From raymond.hettinger at gmail.com Fri Apr 5 09:42:43 2013 From: raymond.hettinger at gmail.com (Raymond Hettinger) Date: Fri, 5 Apr 2013 00:42:43 -0700 Subject: [Python-ideas] An iterable version of find/index for strings? In-Reply-To: <7653229.S9PYj1kSdO@orcus> References: <7653229.S9PYj1kSdO@orcus> Message-ID: <44DD9060-CD77-4BBA-8748-8AA026AF1E02@gmail.com> On Apr 4, 2013, at 6:21 PM, Tom Schumm wrote: > Should Python strings (and byte arrays, and other iterables for that matter) > have an iterator form of find/rfind (or index/rindex)? I've found myself > wanting one on occasion, +1 from me. As you say, the current pattern is awkward. Iterators are much more natural for this task and would lead to cleaner, faster code. Raymond -------------- next part -------------- An HTML attachment was scrubbed... URL: From g.rodola at gmail.com Fri Apr 5 12:11:45 2013 From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=) Date: Fri, 5 Apr 2013 12:11:45 +0200 Subject: [Python-ideas] An iterable version of find/index for strings? In-Reply-To: <7653229.S9PYj1kSdO@orcus> References: <7653229.S9PYj1kSdO@orcus> Message-ID: 2013/4/5 Tom Schumm : > Should Python strings (and byte arrays, and other iterables for that matter) > have an iterator form of find/rfind (or index/rindex)? I've found myself > wanting one on occasion, and having more iterable things seems to be the > direction the language is moving. 
> Currently, looping over the instances of a substring in a larger string is a
> bit awkward. You have to keep track of where you are, and you either have
> to watch for the -1 sentinel value or catch the ValueError. A "for idx in ..."
> construction would just be cleaner. You could use re.finditer, but a string
> method seems more lightweight/efficient/obvious.
>
> The best name I can think of would be "finditer()" like re.finditer(). Using
> "ifind" (like izip) would be confusing, because it could be mistaken for
> case-insensitive find. I thought of "iterfind" like the old dict.iteritems,
> and ElementTree.iterfind but "iterrfind" (iterable rfind) is unattractive.
> I also think "find" is a more obvious verb than "index".
>
> I've got a simple Python implementation on gist:
> https://gist.github.com/fwiffo/5233377
>
> It includes an option to include overlapping instances, which may not be
> necessary (it's not present in e.g. re.finditer).

+1.

> I could imagine it as a method on str/unicode/bytes/list/tuple objects, or
> maybe as a function in itertools.

I would personally prefer the former.

--- Giampaolo
https://code.google.com/p/pyftpdlib/
https://code.google.com/p/psutil/
https://code.google.com/p/pysendfile/

From brett at python.org  Fri Apr  5 12:24:53 2013
From: brett at python.org (Brett Cannon)
Date: Fri, 5 Apr 2013 06:24:53 -0400
Subject: [Python-ideas] An iterable version of find/index for strings?
In-Reply-To: <7653229.S9PYj1kSdO@orcus>
References: <7653229.S9PYj1kSdO@orcus>
Message-ID:

FYI there is already a proposal for split:
http://bugs.python.org/issue17343. Getting that approved would help move
towards getting iterators for other relevant methods such as find and index.

On Thu, Apr 4, 2013 at 9:21 PM, Tom Schumm wrote:

> Should Python strings (and byte arrays, and other iterables for that
> matter) have an iterator form of find/rfind (or index/rindex)?
I've found myself > wanting one on occasion, and having more iterable things seems to be the > direction the language is moving. > > Currently, looping over the instances of a substring in a larger string is > a > bit awkward. You have to keep track of where you are, and you either have > have > to watch for the -1 sentinel value or catch the ValueError. A "for idx in > ..." > construction would just be cleaner. You could use re.finditer, but a string > method seems a more lightweight/efficient/obvious. > > The best name I can think of would be "finditer()" like re.finditer(). > Using > "ifind" (like izip) would be confusing, because it could be mistaken for > case- > insensitive find. I thought of "iterfind" like the old dict.iteritems, and > ElementTree.iterfind but "iterrfind" (iterable rfind) is unattractive. I > also > think "find" is a more obvious verb than "index". > > I've got a simple Python implementation on gist: > https://gist.github.com/fwiffo/5233377 > > It includes an option to include overlapping instences, which may not be > necessary (it's not present in e.g. re.finditer). > > I could imagine it as a method on str/unicode/bytes/list/tuple objects, or > maybe as a function in itertools. > > -- > Tom Schumm > http://www.fwiffo.com/ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From greg.ewing at canterbury.ac.nz  Sat Apr  6 01:46:34 2013
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 06 Apr 2013 12:46:34 +1300
Subject: [Python-ideas] [Python-Dev] relative import circular problem
In-Reply-To:
References:
Message-ID: <515F625A.5060104@canterbury.ac.nz>

Guido van Rossum wrote:
>
> On Thu, Apr 4, 2013 at 9:26 AM, Richard Oudkerk wrote:
>
>> import .moduleX
>
> We considered that when relative import was designed and rejected it,
> because it violates the expectation that after "import "
> you can use exactly "" in your code.

How about requiring an "as" clause, then?

--
Greg

From g.rodola at gmail.com  Sat Apr  6 14:50:16 2013
From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=)
Date: Sat, 6 Apr 2013 14:50:16 +0200
Subject: [Python-ideas] itertools.chunks()
Message-ID:

def chunks(total, step):
    assert total >= step
    while total > step:
        yield step
        total -= step
    if total:
        yield total

>>> list(chunks(12, 4))
[4, 4, 4]
>>> list(chunks(13, 4))
[4, 4, 4, 1]

I'm not sure how appropriate "chunks" is as a name for such a function. Anyway, I wrote that because in a unit test I had to create a file of a precise size, like this:

FILESIZE = (10 * 1024 * 1024) + 423  # 10MB and 423 bytes
with open(TESTFN, 'wb') as f:
    for csize in chunks(FILESIZE, 262144):
        f.write(b'x' * csize)

Now I wonder, would it make sense to have something like this in the itertools module?

--- Giampaolo
https://code.google.com/p/pyftpdlib/
https://code.google.com/p/psutil/
https://code.google.com/p/pysendfile/

From carlopires at gmail.com  Sat Apr  6 16:21:59 2013
From: carlopires at gmail.com (Carlo Pires)
Date: Sat, 6 Apr 2013 11:21:59 -0300
Subject: [Python-ideas] itertools.chunks()
In-Reply-To:
References:
Message-ID:

+1 Very useful function.

2013/4/6 Giampaolo Rodolà
> def chunks(total, step):
>     assert total >= step
>     while total > step:
>         yield step
>         total -= step
>     if total:
>         yield total
>
> >>> list(chunks(12, 4))
> [4, 4, 4]
> >>> list(chunks(13, 4))
> [4, 4, 4, 1]
>
> I'm not sure how appropriate "chunks" is as a name for such a function.
> Anyway, I wrote that because in a unit test I had to create a file of
> a precise size, like this:
>
> FILESIZE = (10 * 1024 * 1024) + 423  # 10MB and 423 bytes
> with open(TESTFN, 'wb') as f:
>     for csize in chunks(FILESIZE, 262144):
>         f.write(b'x' * csize)
>
> Now I wonder, would it make sense to have something like this into
> itertools module?
>
> --- Giampaolo
> https://code.google.com/p/pyftpdlib/
> https://code.google.com/p/psutil/
> https://code.google.com/p/pysendfile/
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

--
Carlo Pires
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From hsoft at hardcoded.net  Sat Apr  6 17:21:21 2013
From: hsoft at hardcoded.net (Virgil Dupras)
Date: Sat, 06 Apr 2013 11:21:21 -0400
Subject: [Python-ideas] shutil.trash()
Message-ID: <51603D71.6070706@hardcoded.net>

Hi all,

A while ago, I've developed this library, send2trash ( https://bitbucket.org/hsoft/send2trash ), which can send files to trash on Mac OS X, Windows, and any platform that conforms to FreeDesktop.

The current version uses ctypes, but earlier versions were straight C modules.

I was wondering if you think this has a place in the stdlib, maybe as "shutil.trash()"?

Virgil Dupras

From python at mrabarnett.plus.com  Sat Apr  6 19:50:23 2013
From: python at mrabarnett.plus.com (MRAB)
Date: Sat, 06 Apr 2013 18:50:23 +0100
Subject: [Python-ideas] itertools.chunks()
In-Reply-To:
References:
Message-ID: <5160605F.7030108@mrabarnett.plus.com>

On 06/04/2013 13:50, Giampaolo Rodolà
wrote:
> def chunks(total, step):
>     assert total >= step
>     while total > step:
>         yield step
>         total -= step
>     if total:
>         yield total
>
Why shouldn't total be less than step?

def chunks(total, step):
    while total >= step:
        yield step
        total -= step
    if total > 0:
        yield total

> >>> list(chunks(12, 4))
> [4, 4, 4]
> >>> list(chunks(13, 4))
> [4, 4, 4, 1]
>
> I'm not sure how appropriate "chunks" is as a name for such a function.
> Anyway, I wrote that because in a unit test I had to create a file of
> a precise size, like this:
>
> FILESIZE = (10 * 1024 * 1024) + 423  # 10MB and 423 bytes
> with open(TESTFN, 'wb') as f:
>     for csize in chunks(FILESIZE, 262144):
>         f.write(b'x' * csize)
>
> Now I wonder, would it make sense to have something like this into
> itertools module?
>

From g.rodola at gmail.com  Sat Apr  6 20:02:30 2013
From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=)
Date: Sat, 6 Apr 2013 20:02:30 +0200
Subject: [Python-ideas] itertools.chunks()
In-Reply-To: <5160605F.7030108@mrabarnett.plus.com>
References: <5160605F.7030108@mrabarnett.plus.com>
Message-ID:

2013/4/6 MRAB :
> On 06/04/2013 13:50, Giampaolo Rodolà wrote:
>>
>> def chunks(total, step):
>>     assert total >= step
>>     while total > step:
>>         yield step
>>         total -= step
>>     if total:
>>         yield total
>>
> Why shouldn't total be less than step?
>
> def chunks(total, step):
>     while total >= step:
>         yield step
>         total -= step
>     if total > 0:
>         yield total

I agree the assert statement can be removed.

--- Giampaolo
https://code.google.com/p/pyftpdlib/
https://code.google.com/p/psutil/
https://code.google.com/p/pysendfile/

From jbvsmo at gmail.com  Sat Apr  6 20:30:07 2013
From: jbvsmo at gmail.com (=?ISO-8859-1?Q?Jo=E3o_Bernardo?=)
Date: Sat, 6 Apr 2013 15:30:07 -0300
Subject: [Python-ideas] itertools.chunks()
In-Reply-To:
References: <5160605F.7030108@mrabarnett.plus.com>
Message-ID:

Isn't it just

divmod(total, step)

João Bernardo

2013/4/6 Giampaolo Rodolà
> 2013/4/6 MRAB :
> > On 06/04/2013 13:50, Giampaolo Rodolà wrote:
> >>
> >> def chunks(total, step):
> >>     assert total >= step
> >>     while total > step:
> >>         yield step
> >>         total -= step
> >>     if total:
> >>         yield total
> >>
> > Why shouldn't total be less than step?
> >
> > def chunks(total, step):
> >     while total >= step:
> >         yield step
> >         total -= step
> >     if total > 0:
> >         yield total
>
> I agree the assert statement can be removed.
>
> --- Giampaolo
> https://code.google.com/p/pyftpdlib/
> https://code.google.com/p/psutil/
> https://code.google.com/p/pysendfile/
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From g.rodola at gmail.com  Sat Apr  6 20:53:37 2013
From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=)
Date: Sat, 6 Apr 2013 20:53:37 +0200
Subject: [Python-ideas] itertools.chunks()
In-Reply-To:
References: <5160605F.7030108@mrabarnett.plus.com>
Message-ID:

2013/4/6 João Bernardo :
> Isn't it just
>
> divmod(total, step)
>
> João Bernardo

Not really:

>>> list(chunks(13, 4))
[4, 4, 4, 1]
>>> divmod(13, 4)
(3, 1)

Literally chunks() keeps yielding 'step' until 'total' is reached and makes sure the last yielded item has the correct remainder.

--- Giampaolo
https://code.google.com/p/pyftpdlib/
https://code.google.com/p/psutil/
https://code.google.com/p/pysendfile/

From nathan at cmu.edu  Sat Apr  6 21:06:22 2013
From: nathan at cmu.edu (Nathan Schneider)
Date: Sat, 6 Apr 2013 15:06:22 -0400
Subject: [Python-ideas] itertools.chunks()
In-Reply-To:
References: <5160605F.7030108@mrabarnett.plus.com>
Message-ID:

On Sat, Apr 6, 2013 at 2:53 PM, Giampaolo Rodolà
wrote:
> 2013/4/6 João Bernardo :
> > Isn't it just
> >
> > divmod(total, step)
> >
> > João Bernardo
>
> Not really:
>
> >>> list(chunks(13, 4))
> [4, 4, 4, 1]
> >>> divmod(13, 4)
> (3, 1)
>

I think what João means is you can do:

import itertools

def chunks(total, step):
    a, b = divmod(total, step)
    # guard the remainder so step-divisible totals don't end with a 0
    return [step] * a + ([b] if b else [])

>>> chunks(13, 4)
[4, 4, 4, 1]

Or, to avoid necessarily constructing the list all at once:

def chunks(total, step):
    a, b = divmod(total, step)
    return itertools.chain(itertools.repeat(step, a), [b] if b else [])

>>> list(chunks(13, 4))
[4, 4, 4, 1]

Nathan

> Literally chunks() keeps yielding 'step' until 'total' is reached and
> makes sure the last yielded item has the correct remainder.
>
> --- Giampaolo
> https://code.google.com/p/pyftpdlib/
> https://code.google.com/p/psutil/
> https://code.google.com/p/pysendfile/
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From wolfgang.maier at biologie.uni-freiburg.de  Fri Apr  5 11:11:18 2013
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Fri, 5 Apr 2013 09:11:18 +0000 (UTC)
Subject: [Python-ideas] An iterable version of find/index for strings?
References: <7653229.S9PYj1kSdO@orcus>
Message-ID:

Tom Schumm writes:
>
> Should Python strings (and byte arrays, and other iterables for that
> matter) have an iterator form of find/rfind (or index/rindex)?

+1 as well. As you say, it's a logical thing to have, and there don't seem to be any disadvantages to it.
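[Editor's note: for readers following along, here is a minimal sketch of the behaviour being proposed. The name finditer() and the overlap option mirror the thread's discussion and the linked gist; neither is an actual str API, and this is an illustration rather than the gist's exact implementation.]

```python
def finditer(s, sub, overlap=False):
    """Yield the index of each occurrence of sub in s, left to right."""
    if not sub:
        raise ValueError("empty substring")
    i = s.find(sub)
    while i != -1:
        yield i
        # step past the whole match, or just one character if
        # overlapping occurrences should be reported too
        i = s.find(sub, i + (1 if overlap else len(sub)))
```

For example, list(finditer("abcabcabc", "abc")) gives [0, 3, 6], while list(finditer("aaaa", "aa", overlap=True)) gives [0, 1, 2] instead of [0, 2].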
Wolfgang

From tjreedy at udel.edu  Fri Apr  5 19:46:06 2013
From: tjreedy at udel.edu (Terry Jan Reedy)
Date: Fri, 05 Apr 2013 13:46:06 -0400
Subject: [Python-ideas] Reviving PEP 3140 - "str(container) should call str(item), not repr(item)"
In-Reply-To:
References:
Message-ID:

On 4/4/2013 11:08 PM, Mark Janssen wrote:
>> Having str(container) calling str(item) and not repr(item) sounds like the
>> right thing to do. However, PEP 3140 was rejected on the basis of the
>> following statement of Guido:
>
> Strangely, this argument seems to get to an issue found in LISP
> regarding quoting a thing vs. the thing itself. Python doesn't
> really recognize that distinction in its object model

Python quotes code by switching from expression to statement syntax. (Lambda also quotes expressions, but not, to the chagrin of some, statements.)

> -- though it does have a way to go the *other* direction: type(thing).
> I think Guido's intuition is correct -- there is no logically
> "correct" way to do this. The issue makes me think of a
> __deepstr__() method or something that one could implement if one
> wanted to print the contents of a container, but this doesn't really
> work well either.
>
> This is a problem with "everything is an object" model: no
> recognition is made between a distinction that is very high up the
> "object taxonomy": container vs. atomic elements.

Python does not have non-object atomic data elements. len(ob) returning a value instead of raising an exception distinguishes concrete collections from everything else. iter(ob) returning instead of raising distinguishes collections (concrete + abstract or virtual) from everything else.

--
Terry Jan Reedy

From storchaka at gmail.com  Sat Apr  6 22:13:05 2013
From: storchaka at gmail.com (Serhiy Storchaka)
Date: Sat, 06 Apr 2013 23:13:05 +0300
Subject: [Python-ideas] itertools.chunks()
In-Reply-To:
References:
Message-ID:

On 06.04.13 15:50, Giampaolo Rodolà
wrote:
> def chunks(total, step):
>     assert total >= step
>     while total > step:
>         yield step
>         total -= step
>     if total:
>         yield total

For integers this is:

(min(step, total - i) for i in range(0, total, step))

From greg at krypto.org  Sun Apr  7 00:25:54 2013
From: greg at krypto.org (Gregory P. Smith)
Date: Sat, 6 Apr 2013 15:25:54 -0700
Subject: [Python-ideas] shutil.trash()
In-Reply-To: <51603D71.6070706@hardcoded.net>
References: <51603D71.6070706@hardcoded.net>
Message-ID:

Is it widely used?

I think it sounds useful for someone but is the kind of thing that should be fine as an extension module on PyPI for most people's needs. It seems like the kind of functionality that would go along with a GUI library. Other software is unlikely to care about an OS's concept of trash and simply rm/del/unlink things.

otherwise, yes, shutil is a reasonable place if it were to be added.

On Sat, Apr 6, 2013 at 8:21 AM, Virgil Dupras wrote:

> Hi all,
>
> A while ago, I've developed this library, send2trash (
> https://bitbucket.org/hsoft/send2trash ), which can send files to trash
> on Mac OS X, Windows, and any platform that conforms to FreeDesktop.
>
> The current version uses ctypes, but earlier versions were straight C
> modules.
>
> I was wondering if you think this has a place in the stdlib, maybe as
> "shutil.trash()"?
>
> Virgil Dupras
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From dreamingforward at gmail.com  Sun Apr  7 02:09:42 2013
From: dreamingforward at gmail.com (Mark Janssen)
Date: Sat, 6 Apr 2013 17:09:42 -0700
Subject: [Python-ideas] Reviving PEP 3140 - "str(container) should call str(item), not repr(item)"
In-Reply-To:
References:
Message-ID:

>> This is a problem with "everything is an object" model: no
>> recognition is made between a distinction that is very high up the
>> "object taxonomy": container vs. atomic elements.
>
> Python does not have non-object atomic data elements.

Yeah. I think this is where we made a mistake -- we went too far into "purity" and away from Tim's Zen wisdom of "practicality". We don't need ints as objects. Once Python did this (Python 2.6?), we had to gain an obscure system of __new__ in addition to __init__, and the nice clean conceptual model got obfuscated. (Why do some objects need it and others don't?) The only nice thing about it is big-nums and the complete abstraction that python provides of it, but I believe, now, that it's a premature optimization.

The only reason we do it is to make the very nice illusion of arbitrarily large arithmetic. But in practice, this is only used to impress others, not for much real programming. Frankly, I have to admit that longs did just fine, with their automatic conversion from ints when things got too big.

Let's update the OOP paradigm and accept we can't *totally* get away from the machine and differentiate between atomic types like integers and let containers be the "first" Object.
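[Editor's note: as a concrete check on the claims in this thread, ints really are ordinary objects in Python, and __new__ is precisely what makes subclassing an immutable type like int workable. A small Python 3 sketch; the Positive class is a made-up example, not anything from the thread.]

```python
# Arbitrary precision comes from the one unified int type, and an int
# carries methods and a type like any other object.
n = 2 ** 100
assert isinstance(n, int) and isinstance(n, object)
assert (255).bit_length() == 8

# Immutable values are fixed in __new__, before __init__ ever runs,
# which is why subclassing int needs __new__ rather than __init__.
class Positive(int):
    def __new__(cls, value):
        if value <= 0:
            raise ValueError("must be positive")
        return super().__new__(cls, value)

assert Positive(3) + 1 == 4
```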
MarkJ
Tacoma, Washington

From steve at pearwood.info  Sun Apr  7 02:54:07 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 07 Apr 2013 10:54:07 +1000
Subject: [Python-ideas] Reviving PEP 3140 - "str(container) should call str(item), not repr(item)"
In-Reply-To:
References:
Message-ID: <5160C3AF.7070003@pearwood.info>

On 07/04/13 10:09, Mark Janssen wrote:
>>> This is a problem with "everything is an object" model: no
>>> recognition is made between a distinction that is very high up the
>>> "object taxonomy": container vs. atomic elements.
>>
>> Python does not have non-object atomic data elements.
>
> Yeah. I think this is where we made a mistake -- we went too far into
> "purity" and away from Tim's Zen wisdom of "practicality". We don't
> need ints as objects. Once Python did this (Python 2.6?),

More like Python 0.1, or at least 0.9 which is the oldest version I have. Data values have always been implemented as objects in Python, even in the early days before the class/type integration.

> we had to
> gain an obscure system of __new__ in addition to __init__, and the
> nice clean conceptual model got obfuscated.

On the contrary, the "everything is an object" system is nice and clean. Having some values be objects and some values not be means that you have two distinct models, boxed and unboxed values, like in Java.

> (Why do some objects need it and others don't?)

Why do some objects need __len__ and some objects don't? It depends on the object and what it is supposed to do.

__new__ is the constructor that actually builds the object. __init__ is the initializer. Yes, we could drop the initializer, and only have a constructor. But because the constructor does more, it is trickier to use, especially for beginners. E.g. there is no self, because the instance doesn't exist yet! That makes it harder to use correctly. Dropping __init__ and keeping only __new__ will make the language worse.

On the other hand, we cannot drop __new__, and just keep __init__.
That's what the old-style "classic classes" did, and not having __new__ is a major pain. 90% of the time you don't need __new__ and __init__ will do the job, but when you need it, you really need it, and __init__ is no substitute. So having both __new__ and __init__ available is a big win for usability.

> The only nice thing about it is big-nums and
> the complete abstraction that python provides of it, but I believe,
> now, that it's a premature optimization.
>
> The only reason we do it is to make the very nice illusion of
> arbitrarily large arithmetic. But in practice, this is only used to
> impress others, not for much real programming. Frankly, I have to
> admit that longs did just fine, with their automatic conversion from
> ints when things got too big.

Do you realise that those longs that you call "fine" are precisely the same as the ints you're complaining about?

> Let's update the OOP paradigm and accept we can't *totally* get away
> from the machine and differentiate between atomic types like integers
> and let containers be the "first" Object.

Let's not.

--
Steven

From ned at nedbatchelder.com  Sun Apr  7 04:35:25 2013
From: ned at nedbatchelder.com (Ned Batchelder)
Date: Sat, 06 Apr 2013 22:35:25 -0400
Subject: [Python-ideas] Reviving PEP 3140 - "str(container) should call str(item), not repr(item)"
In-Reply-To:
References:
Message-ID: <5160DB6D.3030007@nedbatchelder.com>

On 4/6/2013 8:09 PM, Mark Janssen wrote:
> Let's update the OOP paradigm and accept we can't *totally* get away
> from the machine and differentiate between atomic types like integers
> and let containers be the "first" Object.

Mark, so I can understand your mindset better, what do you mean by "let's update the OOP paradigm"? Do you mean, 1) "let's change Python in the next release," or 2) "let's see if we can imagine a different way of doing things, even though it won't ever change the Python language in actuality," or 3) something in between?
This change you mention here is fundamental enough that realistically, #2 is the only interpretation I can see.

--Ned.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From grosser.meister.morti at gmx.net  Sun Apr  7 06:19:41 2013
From: grosser.meister.morti at gmx.net (=?ISO-8859-1?Q?Mathias_Panzenb=F6ck?=)
Date: Sun, 07 Apr 2013 06:19:41 +0200
Subject: [Python-ideas] itertools.chunks()
In-Reply-To:
References:
Message-ID: <5160F3DD.6000606@gmx.net>

A function like this is useful, but I don't agree with the name. This name implies to me that it actually yields chunks and not chunk sizes. Maybe call it chunk_sizes? I don't know.

Also I find myself often writing helper functions like these:

def chunked(sequence, size):
    i = 0
    while True:
        j = i
        i += size
        chunk = sequence[j:i]
        if not chunk:
            return
        yield chunk

def chunked_stream(stream, size):
    while True:
        chunk = stream.read(size)
        if not chunk:
            return
        yield chunk

Maybe these functions should be in the stdlib? Too trivial?

On 04/06/2013 02:50 PM, Giampaolo Rodolà wrote:
> def chunks(total, step):
>     assert total >= step
>     while total > step:
>         yield step
>         total -= step
>     if total:
>         yield total
>
> >>> list(chunks(12, 4))
> [4, 4, 4]
> >>> list(chunks(13, 4))
> [4, 4, 4, 1]
>
> I'm not sure how appropriate "chunks" is as a name for such a function.
> Anyway, I wrote that because in a unit test I had to create a file of
> a precise size, like this:
>
> FILESIZE = (10 * 1024 * 1024) + 423  # 10MB and 423 bytes
> with open(TESTFN, 'wb') as f:
>     for csize in chunks(FILESIZE, 262144):
>         f.write(b'x' * csize)
>
> Now I wonder, would it make sense to have something like this into
> itertools module?
>
>
> --- Giampaolo
> https://code.google.com/p/pyftpdlib/
> https://code.google.com/p/psutil/
> https://code.google.com/p/pysendfile/

From steve at pearwood.info  Sun Apr  7 08:12:49 2013
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 07 Apr 2013 16:12:49 +1000
Subject: [Python-ideas] itertools.chunks()
In-Reply-To:
References:
Message-ID: <51610E61.1000706@pearwood.info>

On 06/04/13 23:50, Giampaolo Rodolà wrote:
> def chunks(total, step):
>     assert total >= step
>     while total > step:
>         yield step
>         total -= step
>     if total:
>         yield total
[...]
> Now I wonder, would it make sense to have something like this into
> itertools module?

Since it doesn't operate on iterators, I don't think it belongs in itertools. It can also be implemented like this:

def chunks(total, step):
    a, b = divmod(total, step)
    for i in range(a):
        yield step
    if b:
        yield b

which is probably also less likely to go wrong if you pass float arguments.

--
Steven

From abarnert at yahoo.com  Sun Apr  7 11:15:16 2013
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sun, 7 Apr 2013 02:15:16 -0700 (PDT)
Subject: [Python-ideas] Reviving PEP 3140 - "str(container) should call str(item), not repr(item)"
In-Reply-To:
References:
Message-ID: <1365326116.92717.YahooMailNeo@web184706.mail.ne1.yahoo.com>

From: Mark Janssen
> Sent: Saturday, April 6, 2013 5:09 PM
>
>>>> This is a problem with "everything is an object" model: no
>>>> recognition is made between a distinction that is very high up the
>>>> "object taxonomy": container vs. atomic elements.
>>>
>>> Python does not have non-object atomic data elements.
>
>> Yeah. I think this is where we made a mistake -- we went too far into
>> "purity" and away from Tim's Zen wisdom of "practicality". We don't
>> need ints as objects. Once Python did this (Python 2.6?), we had to
>> gain an obscure system of __new__ in addition to __init__, and the
>> nice clean conceptual model got obfuscated. (Why do some objects need
>> it and others don't?)
>> The only nice thing about it is big-nums and
>> the complete abstraction that python provides of it, but I believe,
>> now, that it's a premature optimization.
>>
>> The only reason we do it is to make the very nice illusion of
>> arbitrarily large arithmetic. But in practice, this is only used to
>> impress others, not for much real programming. Frankly, I have to
>> admit that longs did just fine, with their automatic conversion from
>> ints when things got too big.

Before the int/long unification, Python had two different integral types, which were both objects. The int object used a C long for internal storage, but it was an immutable, garbage-collected object, with methods, represented in CPython as a PyObject* with the appropriate slots and so on and so on.

The only practical cost of that merger was a very minor performance cost. (And one that's unlikely to matter in any real program -- if you need to do lots of arithmetic very quickly, Python ints were far too slow without numpy or similar; for anything else, longs were more than fast enough.)

>> Let's update the OOP paradigm and accept we can't *totally* get away
>> from the machine and differentiate between atomic types like integers
>> and let containers be the "first" Object.

How would that even help solve the problem? You'd still need to be able to make the exact same container-vs.-non-container distinction for all of the non-numeric objects that can go into containers. And whatever solution you come up with for those objects will work just as well for numbers as objects. So, adding non-object numbers just means you have to solve two problems instead of one. (Actually, three, because now you either need to design explicit boxing, or come up with a way to let containers, and variables for that matter, hold both objects and non-objects.)
To put it another way, if containers are the "first" object, then what are dates, quaternions, slices, files, functions, GUI windows, client connections, context managers, exceptions, regexes, zlib compression states, threads, processes, ...? And even if you _could_ implement all of those as non-object atomic types, what about user-defined application objects like cars, cats, and employees?

There are only four answers to this problem: (1) nothing is an object (Haskell, Scheme); (2) everything is an object (Python, Ruby); (3) only some things are objects, and only objects can go into containers (Java); (4) only some things are objects, and there are completely different kinds of containers for objects and non-objects (ObjC). It's hard to see how either (3) or (4) is simpler than (2), or more practical (other than for low-level performance).

From abarnert at yahoo.com  Sun Apr  7 11:51:17 2013
From: abarnert at yahoo.com (Andrew Barnert)
Date: Sun, 7 Apr 2013 02:51:17 -0700 (PDT)
Subject: [Python-ideas] itertools.chunks()
In-Reply-To: <5160F3DD.6000606@gmx.net>
References: <5160F3DD.6000606@gmx.net>
Message-ID: <1365328277.91357.YahooMailNeo@web184705.mail.ne1.yahoo.com>

> From: Mathias Panzenböck
> Sent: Saturday, April 6, 2013 9:19 PM
>
> Also I find myself often writing helper functions like these:
>
> def chunked(sequence, size):
>     i = 0
>     while True:
>         j = i
>         i += size
>         chunk = sequence[j:i]
>         if not chunk:
>             return
>         yield chunk

The grouper function in the itertools recipes does the same thing, except that it works for any iterable, not just sequences (and it fills out the last group with an optional fillvalue).

> def chunked_stream(stream, size):
>     while True:
>         chunk = stream.read(size)
>         if not chunk:
>             return
>         yield chunk

This is just iter(partial(stream.read, size), '').

> Maybe these functions should be in the stdlib? Too trivial?
I personally agree that grouper, and some of the other itertools recipes, should actually be included in the module, so you could just import itertools and call grouper instead of having to copy the 3 lines of code into dozens of different programs. But I personally deal with that by just installing more-itertools off PyPI.

As for the original suggestion:

> On 04/06/2013 02:50 PM, Giampaolo Rodolà wrote:
>> def chunks(total, step):
>>     assert total >= step
>>     while total > step:
>>         yield step
>>         total -= step
>>     if total:
>>         yield total

I honestly don't think this is very useful. For one thing, if you really need it, it's equivalent to a trivial genexp:

min(step, total - chunkstart) for chunkstart in range(0, total, step)

For another, I think most obvious uses for it would be better done at a higher level. For example:

>> FILESIZE = (10 * 1024 * 1024) + 423  # 10MB and 423 bytes
>> with open(TESTFN, 'wb') as f:
>>     for csize in chunks(FILESIZE, 262144):
>>         f.write(b'x' * csize)

First, is the memory cost of f.write(b'x' * FILESIZE) really an issue for your program? If so, aren't you better off creating an mmap and filling it with x? And if you want to do it with itertools, can't you just chunk repeat(b'x') instead of explicitly generating the lengths and multiplying them?

Besides, the logic here is actually a bit hidden. You create a FILESIZE which is 10MB and 423 bytes, and then you use a function that writes the 10MB in groups of 256KB and then writes the 423 bytes. Why not just keep it simple?

with open(TESTFN, 'wb') as f:
    for _ in range(10 * 1024 // 256):
        f.write(b'x' * (256 * 1024))
    f.write(b'x' * 423)

From wolfgang.maier at biologie.uni-freiburg.de  Sun Apr  7 11:37:50 2013
From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier)
Date: Sun, 7 Apr 2013 11:37:50 +0200
Subject: [Python-ideas] itertools.chunks()
Message-ID: <004301ce3373$9215c940$b6415bc0$@biologie.uni-freiburg.de>

> Also I find myself often writing helper functions like these:
>
> def chunked(sequence, size):
>     i = 0
>     while True:
>         j = i
>         i += size
>         chunk = sequence[j:i]
>         if not chunk:
>             return
>         yield chunk

This is just an alternate version of the grouper recipe from the itertools documentation, just that grouper should be way faster and will also work with iterators. We just had a thread on variations of the grouper recipe under *zip_strict() or similar in itertools*.

Wolfgang

From ubershmekel at gmail.com  Sun Apr  7 13:07:03 2013
From: ubershmekel at gmail.com (Yuval Greenfield)
Date: Sun, 7 Apr 2013 14:07:03 +0300
Subject: [Python-ideas] shutil.trash()
In-Reply-To:
References: <51603D71.6070706@hardcoded.net>
Message-ID:

On Sun, Apr 7, 2013 at 1:25 AM, Gregory P. Smith wrote:

> I think it sounds useful for someone but is the kind of thing that should
> be fine as an extension module on PyPI for most people's needs. It seems
> like the kind of functionality that would go along with a GUI library.
> Other software is unlikely to care about an OS's concept of trash and
> simply rm/del/unlink things.
>

I agree as well. It'd be a wonderful addition to PyPI but not the stdlib.

Yuval
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From breamoreboy at yahoo.co.uk  Sun Apr  7 14:24:01 2013
From: breamoreboy at yahoo.co.uk (Mark Lawrence)
Date: Sun, 07 Apr 2013 13:24:01 +0100
Subject: [Python-ideas] itertools.chunks()
In-Reply-To: <1365328277.91357.YahooMailNeo@web184705.mail.ne1.yahoo.com>
References: <5160F3DD.6000606@gmx.net> <1365328277.91357.YahooMailNeo@web184705.mail.ne1.yahoo.com>
Message-ID:

On 07/04/2013 10:51, Andrew Barnert wrote:
>
> I personally agree that grouper, and some of the other itertools recipes, should actually be included in the module, so you could just import itertools and call grouper instead of having to copy the 3 lines of code into dozens of different programs. But I personally deal with that by just installing more-itertools off PyPI.
>

For those who aren't aware there's always https://pypi.python.org/pypi/itertools_recipes/0.1 or even https://pypi.python.org/pypi/more-itertools/2.2

--
If you're using GoogleCrap™ please read this http://wiki.python.org/moin/GoogleGroupsPython.

Mark Lawrence

From oscar.j.benjamin at gmail.com  Sun Apr  7 16:35:41 2013
From: oscar.j.benjamin at gmail.com (Oscar Benjamin)
Date: Sun, 7 Apr 2013 15:35:41 +0100
Subject: [Python-ideas] itertools.chunks()
In-Reply-To: <004301ce3373$9215c940$b6415bc0$@biologie.uni-freiburg.de>
References: <004301ce3373$9215c940$b6415bc0$@biologie.uni-freiburg.de>
Message-ID:

On 7 April 2013 10:37, Wolfgang Maier wrote:
>> Also I find myself often writing helper functions like these:
>>
>> def chunked(sequence, size):
>>     i = 0
>>     while True:
>>         j = i
>>         i += size
>>         chunk = sequence[j:i]
>>         if not chunk:
>>             return
>>         yield chunk
>
> This is just an alternate version of the grouper recipe from the itertools
> documentation, just that grouper should be way faster and will also work
> with iterators.

It's not quite the same as grouper as it doesn't use fill values; I've never found that I wanted fill values in this situation. Also I'm not sure why you think that grouper would be "way faster".
If sequence is a concrete sequence with efficient random access (e.g. a list or tuple) then grouper will just be extracting slices from it. If it can do that faster than the sequence.__getslice__ method then there's probably something wrong with the implementation of sequence.

I've written a generator function like the above before and it was intended for numpy ndarrays. Since ndarray slices are views into the original array, using a slice is a zero-copy operation. This means that using slices has time complexity of O(number of chunks) rather than O(number of elements) for grouper. It also has a constant memory requirement rather than O(chunk size) for grouper.

Oscar

From guido at python.org  Sun Apr  7 17:41:44 2013
From: guido at python.org (Guido van Rossum)
Date: Sun, 7 Apr 2013 08:41:44 -0700
Subject: [Python-ideas] Reviving PEP 3140 - "str(container) should call str(item), not repr(item)"
In-Reply-To: <1365326116.92717.YahooMailNeo@web184706.mail.ne1.yahoo.com>
References: <1365326116.92717.YahooMailNeo@web184706.mail.ne1.yahoo.com>
Message-ID:

Everyone, please stop trying to reason with Mark. This is a pointless distraction. For reference: http://mail.python.org/pipermail/python-ideas/2013-March/020034.html

--
--Guido van Rossum (python.org/~guido)

From ndbecker2 at gmail.com  Sun Apr  7 18:56:35 2013
From: ndbecker2 at gmail.com (Neal Becker)
Date: Sun, 07 Apr 2013 12:56:35 -0400
Subject: [Python-ideas] mixing tabs and spaces
Message-ID:

I was noticing the complaint about mixing tabs and spaces on the python.general list. It occurs to me that a simple solution would be to allow a comment at the top of the file to specify a tab-space equivalency.
something similar to the coding magic comment: # -*- coding: -*- Perhaps: # -*- tab: 8 -*- From guido at python.org Sun Apr 7 19:43:13 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 7 Apr 2013 10:43:13 -0700 Subject: [Python-ideas] mixing tabs and spaces In-Reply-To: References: Message-ID: There is a simpler solution. Just don't use tabs. On Sun, Apr 7, 2013 at 9:56 AM, Neal Becker wrote: > I was noticing the complaint about mixing tabs and spaces on python.general > list. It occurs to me, that a simple solution would be to allow a comment at > the top of the file to specify a tab-space equivalency. > > something similar to the coding magic comment: > # -*- coding: -*- > > Perhaps: > > # -*- tab: 8 -*- > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- --Guido van Rossum (python.org/~guido) From ethan at stoneleaf.us Sun Apr 7 20:07:26 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 07 Apr 2013 11:07:26 -0700 Subject: [Python-ideas] Reviving PEP 3140 - "str(container) should call str(item), not repr(item)" In-Reply-To: References: <1365326116.92717.YahooMailNeo@web184706.mail.ne1.yahoo.com> Message-ID: <5161B5DE.9060607@stoneleaf.us> On 04/07/2013 08:41 AM, Guido van Rossum wrote: > Everyone, please stop trying to reason with Mark. This is a pointless > distraction. For reference: > http://mail.python.org/pipermail/python-ideas/2013-March/020034.html I think you had an off-by-one error. 
;) Here's the message from Mark: http://mail.python.org/pipermail/python-ideas/2013-March/020032.html -- ~Ethan~ From guido at python.org Sun Apr 7 21:21:02 2013 From: guido at python.org (Guido van Rossum) Date: Sun, 7 Apr 2013 12:21:02 -0700 Subject: [Python-ideas] Reviving PEP 3140 - "str(container) should call str(item), not repr(item)" In-Reply-To: <5161B5DE.9060607@stoneleaf.us> References: <1365326116.92717.YahooMailNeo@web184706.mail.ne1.yahoo.com> <5161B5DE.9060607@stoneleaf.us> Message-ID: No. Read the last sentence. --Guido van Rossum (sent from Android phone) On Apr 7, 2013 11:17 AM, "Ethan Furman" wrote: > On 04/07/2013 08:41 AM, Guido van Rossum wrote: > >> Everyone, please stop trying to reason with Mark. This is a pointless >> distraction. For reference: >> http://mail.python.org/pipermail/python-ideas/2013-March/020034.html >> > > I think you had an off-by-one error. ;) > > Here's the message from Mark: > > http://mail.python.org/pipermail/python-ideas/2013-March/020032.html > > -- > ~Ethan~ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Mon Apr 8 02:02:11 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 08 Apr 2013 09:02:11 +0900 Subject: [Python-ideas] mixing tabs and spaces In-Reply-To: References: Message-ID: <8761zx99fg.fsf@uwakimon.sk.tsukuba.ac.jp> Guido van Rossum writes: > There is a simpler solution. Just don't use tabs. Is there a reason tabs used for "syntactically significant" indentation can't be made a warning, or even a syntax error? As far as I can see they're just an attractive nuisance. I guess that ship sailed with Python 3.0, though.
From phong at phong.org Mon Apr 8 02:07:52 2013 From: phong at phong.org (Tom Schumm) Date: Sun, 07 Apr 2013 20:07:52 -0400 Subject: [Python-ideas] mixing tabs and spaces In-Reply-To: <8761zx99fg.fsf@uwakimon.sk.tsukuba.ac.jp> References: <8761zx99fg.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <2603022.rvORr4v052@orcus> On Monday, April 08, 2013 09:02:11 AM Stephen J. Turnbull wrote: > Guido van Rossum writes: > > There is a simpler solution. Just don't use tabs. > > Is there a reason tabs used for "syntactically significant" > indentation can't be made a warning, or even a syntax error? As far > as I can see they're just an attractive nuisance. Invoke python with the -t or -tt switch. I use "#!/usr/bin/python -tt" out of habit. -- Tom Schumm http://www.fwiffo.com/ From ned at nedbatchelder.com Mon Apr 8 02:49:52 2013 From: ned at nedbatchelder.com (Ned Batchelder) Date: Sun, 07 Apr 2013 20:49:52 -0400 Subject: [Python-ideas] mixing tabs and spaces In-Reply-To: <8761zx99fg.fsf@uwakimon.sk.tsukuba.ac.jp> References: <8761zx99fg.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <51621430.7030403@nedbatchelder.com> On 4/7/2013 8:02 PM, Stephen J. Turnbull wrote: > Guido van Rossum writes: > > > There is a simpler solution. Just don't use tabs. > > Is there a reason tabs used for "syntactically significant" > indentation can't be made a warning, or even a syntax error? As far > as I can see they're just an attractive nuisance. > > I guess that ship sailed with Python 3.0, though. Luckily, before the ship sailed, they loaded a new feature onto it: mixed whitespace is always an error in Python 3, as if it were run with -tt. --Ned. 
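Ned's point is easy to check from a running interpreter. The following is an illustrative sketch (not part of the original thread) that feeds two small source strings to `compile()`; in Python 3, ambiguous mixing of tabs and spaces raises `TabError` at compile time, with no `-t`/`-tt` switch needed:

```python
# Mixed indentation: a tab on one line, eight spaces on the next.
# Python 3 cannot decide whether these are the same indent level,
# so it refuses to compile the source.
mixed = "if True:\n\tx = 1\n        y = 2\n"
try:
    compile(mixed, "<example>", "exec")
except TabError as exc:
    print("TabError:", exc)

# Consistent indentation (tabs only, or spaces only) still compiles fine.
tabs_only = "if True:\n\tx = 1\n\ty = 2\n"
compile(tabs_only, "<example>", "exec")  # no error
```

Note that `TabError` is a subclass of `IndentationError`, so code catching the latter still works.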
> _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From ubershmekel at gmail.com Mon Apr 8 08:43:38 2013 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Mon, 8 Apr 2013 09:43:38 +0300 Subject: [Python-ideas] An iterable version of find/index for strings? In-Reply-To: References: <7653229.S9PYj1kSdO@orcus> Message-ID: On Fri, Apr 5, 2013 at 12:11 PM, Wolfgang Maier < wolfgang.maier at biologie.uni-freiburg.de> wrote: > Tom Schumm writes: > > > > > Should Python strings (and byte arrays, and other iterables for that > > matter) have an iterator form of find/rfind (or index/rindex)? > > +1 as well. > As you say, it's a logical thing to have, and there don't seem to be any > disadvantages to it. > > Wolfgang > > > I think there is a disadvantage: * It adds complexity to the str/bytes API. * These features exist in the `re` module, TSBOOWTDI. * Strings are usually short and always entirely in memory - the iterator requirement isn't commonplace. Yuval -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Mon Apr 8 08:58:09 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Sun, 7 Apr 2013 23:58:09 -0700 Subject: [Python-ideas] An iterable version of find/index for strings? In-Reply-To: References: <7653229.S9PYj1kSdO@orcus> Message-ID: <34C76B89-B42A-4FC6-8795-72B53CD10009@yahoo.com> On Apr 7, 2013, at 23:43, Yuval Greenfield wrote: > On Fri, Apr 5, 2013 at 12:11 PM, Wolfgang Maier wrote: >> Tom Schumm writes: >> >> > >> > Should Python strings (and byte arrays, and other iterables for that >> > matter) have an iterator form of find/rfind (or index/rindex)? >> >> +1 as well. >> As you say, it's a logical thing to have, and there don't seem to be any >> disadvantages to it. >> >> Wolfgang > > > I think there is a disadvantage: > > * It adds complexity to the str/bytes API. 
> * These features exist in the `re` module, TSBOOWTDI. Yes, but regular expressions shouldn't be the one way to do a simple text search! > * Strings are usually short and always entirely in memory - the iterator requirement isn't commonplace. This, I think, is a better point. If you need iterfind, there's a good chance you're going to want to replace the string with an mmap, an iterator around read, something that generates the string on the fly, etc. There will be _some_ programs for which str.iterfind is more useful than a generic iterfind function, but maybe not that many... > > Yuval > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From wolfgang.maier at biologie.uni-freiburg.de Mon Apr 8 07:31:50 2013 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Mon, 8 Apr 2013 05:31:50 +0000 (UTC) Subject: [Python-ideas] itertools.chunks() References: <004301ce3373$9215c940$b6415bc0$@biologie.uni-freiburg.de> Message-ID: Oscar Benjamin writes: > > On 7 April 2013 10:37, Wolfgang Maier > wrote: > >>Also I find myself often writing helper functions like these: > >> > >>def chunked(sequence,size): > >> i = 0 > >> while True: > >> j = i > >> i += size > >> chunk = sequence[j:i] > >> if not chunk: > >> return > >> yield chunk > > > > This is just an alternate version of the grouper recipe from the itertools > > documentation, just that grouper should be way faster and will also work > > with iterators. > > It's not quite the same as grouper as it doesn't use fill values; I've > never found that I wanted fill values in this situation. > > Also I'm not sure why you think that grouper would be "way faster". If > sequence is a concrete sequence with efficient random access (e.g. a > list or tuple) then grouper will just be extracting slices from it. 
If > it can do that faster than the sequence.__getslice__ method then > there's probably something wrong with the implementation of sequence. > > I've written a generator function like the above before and it was > intended for numpy ndarrays. Since ndarray slices are views into the > original array, using a slice is a zero copy operation. This means > that using slices has time complexity of O(number of chunks) rather > than O(number of elements) for grouper. It also has a constant memory > requirement rather than O(chunk size) for grouper. > > Oscar > Hi, I didn't want to imply that slicing was faster/slower than iteration. Rather I thought that this particular example would run slower than the grouper recipe because of the rest of the python code (assign, increment, testing for False every time through the loop). I have not tried to time it, but all this should make things slower than grouper, which spends most of its time in C. For the special case of ndarrays your argument sounds convincing though! Regarding the differences between this code and grouper, I am well aware of them. It was for that reason that I was mentioning the earlier thread *zip_strict() or similar in itertools again, where Peter Otten shows an elegant alternative. Best, Wolfgang From ncoghlan at gmail.com Mon Apr 8 10:01:10 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Mon, 8 Apr 2013 18:01:10 +1000 Subject: [Python-ideas] An iterable version of find/index for strings? In-Reply-To: <34C76B89-B42A-4FC6-8795-72B53CD10009@yahoo.com> References: <7653229.S9PYj1kSdO@orcus> <34C76B89-B42A-4FC6-8795-72B53CD10009@yahoo.com> Message-ID: On Mon, Apr 8, 2013 at 4:58 PM, Andrew Barnert wrote: > This, I think, is a better point. If you need iterfind, there's a good > chance you're going to want to replace the string with an mmap, an iterator > around read, something that generates the string on the fly, etc. 
There will > be _some_ programs for which str.iterfind is more useful than a generic > iterfind function, but maybe not that many... As Tom's original post shows, the existing methods are also designed to make it relatively straightforward to implement an efficient iterator if you do need it. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From p at google-groups-2013.dobrogost.net Mon Apr 8 11:56:08 2013 From: p at google-groups-2013.dobrogost.net (Piotr Dobrogost) Date: Mon, 8 Apr 2013 11:56:08 +0200 Subject: [Python-ideas] Reviving PEP 3140 - "str(container) should call str(item), not repr(item)" In-Reply-To: References: Message-ID: On Fri, Apr 5, 2013 at 12:36 AM, Guido van Rossum wrote: > Lots of reasons. E.g. would you really like this outcome? > > >>> a = 'foo, bar' > >>> b = [a] > >>> print(b) > [foo, bar] > >>> > This is indeed unfortunate but could be mitigated in some extent by enclosing container's elements with let's say pair of <>. Yes, I know it doesn't solve the problem but anything is better than having str() duplicate what repr() does. If I want to get repr I'll call repr() but when I call str() I expect to get some sensible, middle-ground representation not the same thing repr() already provides. On Fri, Apr 5, 2013 at 12:36 AM, Guido van Rossum wrote: > Lots of reasons. E.g. would you really like this outcome? > > >>> a = 'foo, bar' > >>> b = [a] > >>> print(b) > [foo, bar] > >>> > > Plus of course there really would be tons of backwards compatibility > issues. > > On Thu, Apr 4, 2013 at 2:57 PM, Piotr Dobrogost >
wrote: > > Hi! > > > > Having str(container) calling str(item) and not repr(item) sounds like > the > > right thing to do. However, PEP 3140 was rejected on the basis of the > > following statement of Guido: > > > > "Let me just save everyone a lot of time and say that I'm opposed to > > this change, and that I believe that it would cause way too much > > disturbance to be accepted this close to beta." > > > > > > Thu, 29 May 2008 12:32:04 -0700 > > (http://www.mail-archive.com/python-3000 at python.org/msg13686.html) > > > > Does anyone know what's the reason Guido was opposed to this change? > > Is there any chance to revive this PEP? > > > > > > Regards, > > Piotr Dobrogost > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > > > > > > -- > --Guido van Rossum (python.org/~guido) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From oscar.j.benjamin at gmail.com Mon Apr 8 13:57:06 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Mon, 8 Apr 2013 12:57:06 +0100 Subject: [Python-ideas] itertools.chunks() In-Reply-To: References: <004301ce3373$9215c940$b6415bc0$@biologie.uni-freiburg.de> Message-ID: On 8 April 2013 06:31, Wolfgang Maier wrote: > Oscar Benjamin writes: >> >> On 7 April 2013 10:37, Wolfgang Maier >> wrote: >> >>Also I find myself often writing helper functions like these: >> >> >> >>def chunked(sequence,size): >> >> i = 0 >> >> while True: >> >> j = i >> >> i += size >> >> chunk = sequence[j:i] >> >> if not chunk: >> >> return >> >> yield chunk >> > >> > This is just an alternate version of the grouper recipe from the itertools >> > documentation, just that grouper should be way faster and will also work >> > with iterators. >> >> It's not quite the same as grouper as it doesn't use fill values; I've >> never found that I wanted fill values in this situation. 
>> >> Also I'm not sure why you think that grouper would be "way faster". [snip] > > I didn't want to imply that slicing was faster/slower than iteration. Rather > I thought that this particular example would run slower than the grouper > recipe because of the rest of the python code (assign, increment, > testing for False every time through the loop). I have not tried to time it, > but all this should make things slower than grouper, which spends most of > its time in C. For the special case of ndarrays your argument sounds > convincing though! Fair enough. I was making the assumption that the chunk size is large in which case the time is dominated by creating the slice. > Regarding the differences between this code and grouper, I am well aware of > them. It was for that reason that I was mentioning the earlier thread > *zip_strict() or similar in itertools again, where Peter Otten shows an > elegant alternative. Sorry, I didn't read that thread but I have now; I see that you raised precisely this issue. For what it's worth I agree that the fact a generator is needed here suggests that there is some kind of primitive missing from itertools. Also, here's a version of the same from my own code (modified a little) that uses islice instead of zip_longest. I haven't timed it but it was intended to be fast for large chunk sizes and I'd be interested to know how it compares: from itertools import islice def chunked(iterable, size, **kwargs): '''Breaks an iterable into chunks Usage: >>> list(chunked('qwertyuiop', 3)) [['q', 'w', 'e'], ['r', 't', 'y'], ['u', 'i', 'o'], ['p']] >>> list(chunked('qwertyuiop', 3, fillvalue=None)) [['q', 'w', 'e'], ['r', 't', 'y'], ['u', 'i', 'o'], ['p', None, None]] >>> list(chunked('qwertyuiop', 3, strict=True)) Traceback (most recent call last): ... 
ValueError: Invalid chunk size ''' list_, islice_ = list, islice iterator = iter(iterable) chunk = list_(islice_(iterator, size)) while len(chunk) == size: yield chunk chunk = list_(islice_(iterator, size)) if not chunk: return elif kwargs.get('strict', False): raise ValueError('Invalid chunk size') elif 'fillvalue' in kwargs: yield chunk + (size - len(chunk)) * [kwargs['fillvalue']] else: yield chunk Oscar From eliben at gmail.com Mon Apr 8 14:16:33 2013 From: eliben at gmail.com (Eli Bendersky) Date: Mon, 8 Apr 2013 05:16:33 -0700 Subject: [Python-ideas] mixing tabs and spaces In-Reply-To: References: Message-ID: On Sun, Apr 7, 2013 at 10:43 AM, Guido van Rossum wrote: > There is a simpler solution. Just don't use tabs. > Oh, Guido, where were you when then defined the syntax of Go? ;-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From songofacandy at gmail.com Mon Apr 8 15:16:25 2013 From: songofacandy at gmail.com (INADA Naoki) Date: Mon, 8 Apr 2013 22:16:25 +0900 Subject: [Python-ideas] mixing tabs and spaces In-Reply-To: References: Message-ID: Basically, Go is same to Python: DON'T MIX TABS AND SPACES On Mon, Apr 8, 2013 at 9:16 PM, Eli Bendersky wrote: > > > > On Sun, Apr 7, 2013 at 10:43 AM, Guido van Rossum wrote: > >> There is a simpler solution. Just don't use tabs. >> > > Oh, Guido, where were you when then defined the syntax of Go? > ;-) > > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -- INADA Naoki -------------- next part -------------- An HTML attachment was scrubbed... URL: From phong at phong.org Mon Apr 8 16:08:21 2013 From: phong at phong.org (Tom Schumm) Date: Mon, 08 Apr 2013 10:08:21 -0400 Subject: [Python-ideas] An iterable version of find/index for strings? 
In-Reply-To: <34C76B89-B42A-4FC6-8795-72B53CD10009@yahoo.com> References: <7653229.S9PYj1kSdO@orcus> <34C76B89-B42A-4FC6-8795-72B53CD10009@yahoo.com> Message-ID: <1684296.eBriSvhKYS@cygnus> On Sun, April 07, 2013 11:58:09 PM Andrew Barnert wrote: > > * Strings are usually short and always entirely in memory - the iterator > > requirement isn't commonplace. > This, I think, is a better point. If you need iterfind, there's a good > chance you're going to want to replace the string with an mmap, an iterator > around read, something that generates the string on the fly, etc. There > will be _some_ programs for which str.iterfind is more useful than a > generic iterfind function, but maybe not that many... The big advantage in my use is that the iterator makes code shorter and more readable, and I can use it in list comprehensions. Maybe it's not useful enough to be a str method; perhaps something for itertools or more-itertools? If it was rewritten to use index() instead of find() it could be generalized for lists and such. Or maybe I'll just leave it in my recipe box with my trees and memoization decorators. Then again, if I'm doing lots and lots of linear searches, I feel like I've not thought about the problem long enough... -- Tom Schumm http://www.fwiffo.com/ From stephen at xemacs.org Mon Apr 8 17:23:58 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 09 Apr 2013 00:23:58 +0900 Subject: [Python-ideas] An iterable version of find/index for strings? In-Reply-To: <34C76B89-B42A-4FC6-8795-72B53CD10009@yahoo.com> References: <7653229.S9PYj1kSdO@orcus> <34C76B89-B42A-4FC6-8795-72B53CD10009@yahoo.com> Message-ID: <87sj316o6p.fsf@uwakimon.sk.tsukuba.ac.jp> Andrew Barnert writes: > Yes, but regular expressions shouldn't be the one way to do a > simple text search! Why not? 
I don't see a real loss to "match('^start')" vs "startswith('start')" in terms of difficulty of learning, and a potential benefit in encouraging people to avail themselves of the power of regexp search and matching. As far as efficiency goes, XEmacs does the simple thing and checks each alleged regexp for metacharacters, and if there aren't any, falls back to Boyer-Moore. Whoosh! It would not be hard to add similar peephole optimizations for searches or matches that would be most efficiently implemented with startswith, endswith, find, index, etc. Of course, we would need to check how often people search for punctuation (or strings including punctuation). But my suspicion is that people who don't want to grok the most basic features of regexps probably don't search for "\.\*" or the like very often. They probably stick to alphanumerics anyway. From guido at python.org Mon Apr 8 18:01:30 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 8 Apr 2013 09:01:30 -0700 Subject: [Python-ideas] Reviving PEP 3140 - "str(container) should call str(item), not repr(item)" In-Reply-To: References: Message-ID: So call str() on the items yourself. The point of str() is not just to have a convenient notation. It also defines the default behavior of print(). --Guido van Rossum (sent from Android phone) On Apr 8, 2013 2:57 AM, "Piotr Dobrogost" < p at google-groups-2013.dobrogost.net> wrote: > On Fri, Apr 5, 2013 at 12:36 AM, Guido van Rossum wrote: > >> Lots of reasons. E.g. would you really like this outcome? >> >> >>> a = 'foo, bar' >> >>> b = [a] >> >>> print(b) >> [foo, bar] >> >>> >> > > This is indeed unfortunate but could be mitigated in some extent by > enclosing container's elements with let's say pair of <>. Yes, I know it > doesn't solve the problem but anything is better than having str() > duplicate what repr() does. 
If I want to get repr I'll call repr() but when > I call str() I expect to get some sensible, middle-ground representation > not the same thing repr() already provides. > > > On Fri, Apr 5, 2013 at 12:36 AM, Guido van Rossum wrote: > >> Lots of reasons. E.g. would you really like this outcome? >> >> >>> a = 'foo, bar' >> >>> b = [a] >> >>> print(b) >> [foo, bar] >> >>> >> >> Plus of course there really would be tons of backwards compatibility >> issues. >> >> On Thu, Apr 4, 2013 at 2:57 PM, Piotr Dobrogost >>
wrote: >> > Hi! >> > >> > Having str(container) calling str(item) and not repr(item) sounds like >> the >> > right thing to do. However, PEP 3140 was rejected on the basis of the >> > following statement of Guido: >> > >> > "Let me just save everyone a lot of time and say that I'm opposed to >> > this change, and that I believe that it would cause way too much >> > disturbance to be accepted this close to beta." >> > >> > >> > Thu, 29 May 2008 12:32:04 -0700 >> > (http://www.mail-archive.com/python-3000 at python.org/msg13686.html) >> > >> > Does anyone know what's the reason Guido was opposed to this change? >> > Is there any chance to revive this PEP? >> > >> > >> > Regards, >> > Piotr Dobrogost >> > >> > _______________________________________________ >> > Python-ideas mailing list >> > Python-ideas at python.org >> > http://mail.python.org/mailman/listinfo/python-ideas >> > >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) >> > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Apr 8 18:02:15 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 8 Apr 2013 09:02:15 -0700 Subject: [Python-ideas] mixing tabs and spaces In-Reply-To: References: Message-ID: Not in the room. :-) --Guido van Rossum (sent from Android phone) On Apr 8, 2013 5:17 AM, "Eli Bendersky" wrote: > > > > On Sun, Apr 7, 2013 at 10:43 AM, Guido van Rossum wrote: > >> There is a simpler solution. Just don't use tabs. >> > > Oh, Guido, where were you when then defined the syntax of Go? > ;-) > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wolfgang.maier at biologie.uni-freiburg.de Mon Apr 8 13:00:54 2013 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Mon, 8 Apr 2013 11:00:54 +0000 (UTC) Subject: [Python-ideas] itertools.chunks() References: <004301ce3373$9215c940$b6415bc0$@biologie.uni-freiburg.de> Message-ID: Wolfgang Maier writes: ok, I did time things now, comparing chunked and Peter Otten's strict_grouper (see the before-mentioned earlier thread). I used lists of integers as input sequences. strict_grouper has a bit more overhead at the start because it needs to obtain an iterable from the sequence, but then it's running fast. The break-even point is at about ten runs of the respective generator loops. For very long sequences strict_grouper is more than 2x faster than chunked. So, in the special case that one wants to group many short sequences chunked is better, but it's not the best general solution. Wolfgang From abarnert at yahoo.com Mon Apr 8 20:04:40 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Mon, 8 Apr 2013 11:04:40 -0700 Subject: [Python-ideas] An iterable version of find/index for strings? In-Reply-To: <87sj316o6p.fsf@uwakimon.sk.tsukuba.ac.jp> References: <7653229.S9PYj1kSdO@orcus> <34C76B89-B42A-4FC6-8795-72B53CD10009@yahoo.com> <87sj316o6p.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <0B1A6D10-82C6-43D6-9E87-CAD2187864B0@yahoo.com> On Apr 8, 2013, at 8:23, "Stephen J. Turnbull" wrote: > Andrew Barnert writes: > >> Yes, but regular expressions shouldn't be the one way to do a >> simple text search! > > Why not? I don't see a real loss to "match('^start')" vs > "startswith('start')" in terms of difficulty of learning, and a > potential benefit in encouraging people to avail themselves of the > power of regexp search and matching. I don't see how you could think these are equally easy to learn. You could show the latter to someone who's never written a line of code and they'd already understand it. But that's not the important part. 
The benefit of startswith is that it takes less effort to read and understand the code. Reading a regex, or s[:5]=='start' for that matter, isn't _hard_, but it still takes a bit of mental effort, which slows you down a little bit, which limits how much code you can understand in one scan. There's also the fact that there's literally nothing to get wrong with startswith, which means when you're debugging your code, you don't have to mentally check a regex or slice to make sure it's right. One of the great things about python is that you can often understand what a function does, and be sure it's correct, in just a glance. Also, the fact that new programmers can use python for serious work (even text processing work) before they've learned regex is a strength, not a weakness. If you have to tell people "before you can parse that log/csv/whatever you have to learn how to escape parentheses and create matching groups", you might as well teach them perl. From solipsis at pitrou.net Mon Apr 8 20:14:53 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 8 Apr 2013 20:14:53 +0200 Subject: [Python-ideas] An iterable version of find/index for strings? References: <7653229.S9PYj1kSdO@orcus> <44DD9060-CD77-4BBA-8748-8AA026AF1E02@gmail.com> Message-ID: <20130408201453.6f13793f@pitrou.net> Hello, On Fri, 5 Apr 2013 00:42:43 -0700 Raymond Hettinger wrote: > > On Apr 4, 2013, at 6:21 PM, Tom Schumm wrote: > > > Should Python strings (and byte arrays, and other iterables for that matter) > > have an iterator form of find/rfind (or index/rindex)? I've found myself > > wanting one on occasion, > > +1 from me. > > As you say, the current pattern is awkward. Iterators are much more > natural for this task and would lead to cleaner, faster code. I'm mildly positive as well. If iterfind() / finditer() is awkward, let's call it findall(): other search methods just return the first match. Regards Antoine. 
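A minimal sketch of what such a `findall()` could look like, written as a generator on top of `str.find()` (the name and the `overlap` keyword-only argument follow the thread's suggestions; this is not an actual stdlib API):

```python
def findall(haystack, needle, *, overlap=False):
    """Yield every index at which needle occurs in haystack."""
    if not needle:
        raise ValueError("empty search string")
    # Non-overlapping search resumes past the whole match;
    # overlapping search resumes one character later.
    step = 1 if overlap else len(needle)
    i = haystack.find(needle)
    while i != -1:
        yield i
        i = haystack.find(needle, i + step)

print(list(findall("abababa", "aba")))                # [0, 4]
print(list(findall("abababa", "aba", overlap=True)))  # [0, 2, 4]
```

Being a generator, it works equally well in list comprehensions and in code that only needs the first few matches.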
From _ at lvh.cc Mon Apr 8 20:22:23 2013 From: _ at lvh.cc (Laurens Van Houtven) Date: Mon, 8 Apr 2013 20:22:23 +0200 Subject: [Python-ideas] An iterable version of find/index for strings? In-Reply-To: <20130408201453.6f13793f@pitrou.net> References: <7653229.S9PYj1kSdO@orcus> <44DD9060-CD77-4BBA-8748-8AA026AF1E02@gmail.com> <20130408201453.6f13793f@pitrou.net> Message-ID: On Mon, Apr 8, 2013 at 8:14 PM, Antoine Pitrou wrote: > let's call it findall() > +1 cheers lvh -------------- next part -------------- An HTML attachment was scrubbed... URL: From phong at phong.org Mon Apr 8 20:59:15 2013 From: phong at phong.org (Tom Schumm) Date: Mon, 08 Apr 2013 14:59:15 -0400 Subject: [Python-ideas] An iterable version of find/index for strings? In-Reply-To: <0B1A6D10-82C6-43D6-9E87-CAD2187864B0@yahoo.com> References: <7653229.S9PYj1kSdO@orcus> <87sj316o6p.fsf@uwakimon.sk.tsukuba.ac.jp> <0B1A6D10-82C6-43D6-9E87-CAD2187864B0@yahoo.com> Message-ID: <2022380.nlKnPpMZqM@cygnus> On Mon, April 08, 2013 11:04:40 AM Andrew Barnert wrote: > But that's not the important part. The benefit of startswith is that it > takes less effort to read and understand the code. Reading a regex, or > s[:5]=='start' for that matter, isn't _hard_, but it still takes a bit of I was a big fan of regular expressions, going way back; I was a huge Perl fanatic. But over the years I've used them less and less. As Andrew says, if you have a simple string method that does the job, why endure the cognitive overhead of a regular expression? Even if you are using a great regex library that optimized out the computational overhead for simple cases, you still have to write a (potentially cryptic) regex, escape special characters, etc. It's a win if you can make code self-documenting by using a descriptive method like "startswith", "endswith", "if needle in haystack", "find", "strip", etc. All those have trivial regex solutions, but it's better to just say what you mean. 
-- Tom Schumm http://www.fwiffo.com/ From flying-sheep at web.de Mon Apr 8 22:12:30 2013 From: flying-sheep at web.de (Philipp A.) Date: Mon, 8 Apr 2013 22:12:30 +0200 Subject: [Python-ideas] mixing tabs and spaces In-Reply-To: References: Message-ID: Vala's sister language Genie uses something like those comments: per default, indentation is one tab per indentation level (imho the most logical and semantical way to indent), but if you really want, you can put a directive like "[indent=4]" on the first line to change that to spaces. i think python 3's handling is almost optimal like it is, because it disallows mixing tabs and spaces, but allows anyone to do what they want (as long as they are consistent). only almost optimal, because you can still mix indentation level depths (that you can see such a mistake is another argument for tabs ;)) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ncoghlan at gmail.com Tue Apr 9 01:09:46 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 9 Apr 2013 09:09:46 +1000 Subject: [Python-ideas] An iterable version of find/index for strings? In-Reply-To: <20130408201453.6f13793f@pitrou.net> References: <7653229.S9PYj1kSdO@orcus> <44DD9060-CD77-4BBA-8748-8AA026AF1E02@gmail.com> <20130408201453.6f13793f@pitrou.net> Message-ID: On 9 Apr 2013 04:20, "Antoine Pitrou" wrote: > > > Hello, > > On Fri, 5 Apr 2013 00:42:43 -0700 > Raymond Hettinger > wrote: > > > > On Apr 4, 2013, at 6:21 PM, Tom Schumm wrote: > > > > > Should Python strings (and byte arrays, and other iterables for that matter) > > > have an iterator form of find/rfind (or index/rindex)? I've found myself > > > wanting one on occasion, > > > > +1 from me. > > > > As you say, the current pattern is awkward. Iterators are much more > > natural for this task and would lead to cleaner, faster code. > > I'm mildly positive as well.
> If iterfind() / finditer() is awkward, let's call it findall(): other > search methods just return the first match. +0 from me for findall/rfindall. The overlap keyword-only arg seems like a reasonable approach to that part of the problem, too. Cheers, Nick. > > Regards > > Antoine. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From zarchaoz at gmail.com Tue Apr 9 01:11:18 2013 From: zarchaoz at gmail.com (Zahari Petkov) Date: Tue, 9 Apr 2013 02:11:18 +0300 Subject: [Python-ideas] Name mangling and code repetition (with cooperative inheritance use case) Message-ID: Hello everyone, In a certain implementation where I was doing some form of dispatch using composition, I decided to try out cooperative inheritance with super. However, I stumbled at something unexpected - there was nothing wrong with cooperative inheritance with super, but the implementation started to have repetative code, which could not be easily resolved, because of how name mangling of private attribute works. In short my proposal is for a language feature or mechanism, which will allow automatic redefinition of methods through the descendants of a superclass during compilation time - at the same time name unmangling is done. In the example below I use a decorator - just to illustrate the idea in a very simple way: class A: __val = 'a' @propagate_through_descendants def print_val(self): print(self.__val) class B(A): __val = 'b' b = B() b.print_val() Please, check the following gist for a bit more concrete example (short and working) with cooperative inheritance: https://gist.github.com/majorz/5341333 The code duplication is very obvious and unavoidable (the two __call__ methods have exactly the same code, which cannot be moved to the superclass). 
I tried some metaprogramming, but even then it was hard to resolve, since name unmangling seems to happen during compilation time as far as I understand. Thanks, Zahari From guido at python.org Tue Apr 9 01:15:04 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 8 Apr 2013 16:15:04 -0700 Subject: [Python-ideas] Name mangling and code repetition (with cooperative inheritance use case) In-Reply-To: References: Message-ID: You are misunderstanding __private. It is for use *within one class* only. For your purpose you should use single underscore (sometimes known as "protected"). On Mon, Apr 8, 2013 at 4:11 PM, Zahari Petkov wrote: > Hello everyone, > > In a certain implementation where I was doing some form of dispatch using > composition, I decided to try out cooperative inheritance with super. However, > I stumbled at something unexpected - there was nothing wrong with cooperative > inheritance with super, but the implementation started to have repetitive > code, which could not be easily resolved, because of how name mangling of > private attributes works. > > In short my proposal is for a language feature or mechanism, which will allow > automatic redefinition of methods through the descendants of a superclass > during compilation time - at the same time name unmangling is done. In the > example below I use a decorator - just to illustrate the idea in a very simple > way: > > > class A: > __val = 'a' > > @propagate_through_descendants > def print_val(self): > print(self.__val) > > > class B(A): > __val = 'b' > > > b = B() > b.print_val() > > > Please, check the following gist for a bit more concrete example (short and > working) with cooperative inheritance: > > https://gist.github.com/majorz/5341333 > > The code duplication is very obvious and unavoidable (the two __call__ > methods have exactly the same code, which cannot be moved to the superclass).
> I tried some metaprogramming, but even then it was hard to resolve, since > name unmangling seems to happen during compilation time as far as I > understand. > > Thanks, > Zahari > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -- --Guido van Rossum (python.org/~guido) From zarchaoz at gmail.com Tue Apr 9 01:32:35 2013 From: zarchaoz at gmail.com (Zahari Petkov) Date: Tue, 9 Apr 2013 02:32:35 +0300 Subject: [Python-ideas] Name mangling and code repetition (with cooperative inheritance use case) In-Reply-To: References: Message-ID: Thanks for the answer. Now I realize that I picked a wrong example in the email, and your argument is valid about it. However the code repetition in the gist is a valid example; it cannot be resolved by turning the variables/methods into single-underscore names. I actually almost never use the private convention and I am aware of its purpose and use cases (e.g. __update = update after definition). I will think a bit more about a different, less confusing example and write back. Best, Zahari On Tue, Apr 9, 2013 at 2:15 AM, Guido van Rossum wrote: > You are misunderstanding __private. It is for use *within one class* > only. For your purpose you should use single underscore (sometimes > known as "protected"). > > On Mon, Apr 8, 2013 at 4:11 PM, Zahari Petkov wrote: >> Hello everyone, >> >> In a certain implementation where I was doing some form of dispatch using >> composition, I decided to try out cooperative inheritance with super. However, >> I stumbled at something unexpected - there was nothing wrong with cooperative >> inheritance with super, but the implementation started to have repetitive >> code, which could not be easily resolved, because of how name mangling of >> private attributes works.
>> >> In short my proposal is for a language feature or mechanism, which will allow >> automatic redefinition of methods through the descendants of a superclass >> during compilation time - at the same time name unmangling is done. In the >> example below I use a decorator - just to illustrate the idea in a very simple >> way: >> >> >> class A: >> __val = 'a' >> >> @propagate_through_descendants >> def print_val(self): >> print(self.__val) >> >> >> class B(A): >> __val = 'b' >> >> >> b = B() >> b.print_val() >> >> >> Please, check the following gist for a bit more concrete example (short and >> working) with cooperative inheritance: >> >> https://gist.github.com/majorz/5341333 >> >> The code duplication is very obvious and unavoidable (the two __call__ >> methods have exactly the same code, which cannot be moved to the superclass). >> I tried some metaprogramming, but even then it was hard to resolve, since >> name unmangling seems to happen during complilation time as far as I >> understand. >> >> Thanks, >> Zahari >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas > > > > -- > --Guido van Rossum (python.org/~guido) From steve at pearwood.info Tue Apr 9 01:52:39 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 09 Apr 2013 09:52:39 +1000 Subject: [Python-ideas] An iterable version of find/index for strings? In-Reply-To: <87sj316o6p.fsf@uwakimon.sk.tsukuba.ac.jp> References: <7653229.S9PYj1kSdO@orcus> <34C76B89-B42A-4FC6-8795-72B53CD10009@yahoo.com> <87sj316o6p.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <51635847.9000104@pearwood.info> On 09/04/13 01:23, Stephen J. Turnbull wrote: > Andrew Barnert writes: > > > Yes, but regular expressions shouldn't be the one way to do a > > simple text search! > > Why not? 
I don't see a real loss to "match('^start')" vs > "startswith('start')" in terms of difficulty of learning, I'm not Dutch, but I cannot imagine that: import re prefix = re.escape(prefix) re.match(prefix, mystring) should be considered more obvious than mystring.startswith(prefix) Oh, and just to demonstrate the non-obviousness of re.match, you don't need to anchor the regex to the beginning of the string with ^ since match automatically matches only at the start. > and a > potential benefit in encouraging people to avail themselves of the > power of regexp search and matching. The difficulty is not encouraging people to use regexes when they need them. The difficulty is teaching people not to turn to regexes as the first and only tool for solving every string-based problem. -- Steven From stephen at xemacs.org Tue Apr 9 02:14:05 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Tue, 09 Apr 2013 09:14:05 +0900 Subject: [Python-ideas] An iterable version of find/index for strings? In-Reply-To: <0B1A6D10-82C6-43D6-9E87-CAD2187864B0@yahoo.com> References: <7653229.S9PYj1kSdO@orcus> <34C76B89-B42A-4FC6-8795-72B53CD10009@yahoo.com> <87sj316o6p.fsf@uwakimon.sk.tsukuba.ac.jp> <0B1A6D10-82C6-43D6-9E87-CAD2187864B0@yahoo.com> Message-ID: <87mwt87e7m.fsf@uwakimon.sk.tsukuba.ac.jp> Andrew Barnert writes: > I don't see how you could think these are equally easy to > learn. You could show the latter to someone who's never written a > line of code and they'd already understand it. I didn't say they are equally easy to learn. My point is simply: How many of these 3-line functions all alike do we need to have as builtins? There is a cost to having them all, which may counterbalance the ease of learning each one. > There's also the fact that there's literally nothing to get wrong > with startswith, Of course there is. It may be the wrong function for the purpose. .startswith also encourages embedding magic literals in the code. Both of these make maintenance harder. 
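The two spellings being compared in this subthread can be sketched in a few lines (the example strings are made up, not from the thread):

```python
import re

prefix = "start"

# The string method: direct, and safe for any prefix text.
assert "startled".startswith(prefix)

# The regex spelling: re.match() already anchors at the start of the
# string, so no '^' is needed -- but arbitrary text must be escaped.
assert re.match(re.escape(prefix), "startled")

# Skipping re.escape() is the trap: metacharacters quietly change the
# meaning of the "prefix".
assert not "aaab".startswith("a+")
assert re.match("a+", "aaab").group() == "aaa"  # 'a+' matched a run of 'a's
```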
> you might as well teach them perl. Now, now, let's not be invoking Godwin's Law here. The question "how many do we need" is an empirical question. It should be obvious I'm not seriously suggesting getting rid of .startswith; that would have to wait for Python4 in any case. From steve at pearwood.info Tue Apr 9 02:20:43 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Tue, 09 Apr 2013 10:20:43 +1000 Subject: [Python-ideas] An iterable version of find/index for strings? In-Reply-To: <20130408201453.6f13793f@pitrou.net> References: <7653229.S9PYj1kSdO@orcus> <44DD9060-CD77-4BBA-8748-8AA026AF1E02@gmail.com> <20130408201453.6f13793f@pitrou.net> Message-ID: <51635EDB.9020605@pearwood.info> On 09/04/13 04:14, Antoine Pitrou wrote: > If iterfind() / finditer() is awkward, let's call it findall(): other > search methods just return the first match. +1 on the name and the method. -- Steven From zarchaoz at gmail.com Tue Apr 9 03:54:02 2013 From: zarchaoz at gmail.com (Zahari Petkov) Date: Tue, 9 Apr 2013 04:54:02 +0300 Subject: [Python-ideas] Name mangling and code repetition (with cooperative inheritance use case) In-Reply-To: References: Message-ID: Hello again, Unfortunately, the idea cannot be expressed in the simplest way, since this leads to confusion, so I present three short gists and explain the case in this email in a few more words. I am writing it, since I am confident at least it will be an interesting read. All of the three gists present a different implementation of a simple case with 4 very simple classes involved. There are two classes which render instances of types `int` and `float` (IntRenderer and FloatRenderer). Imagine similar classes that may render any other given type - user classes, a list, etc., but for simplification of the case I use only two classes. Each of those classes is instantiated with a variable and is called to render this variable into a string.
There is an abstract base class (Renderer), which provides the common implementation for all the descendants and leaves `__call__` to be implemented by the concrete classes. The last, fourth, class is a dispatcher, whose sole purpose is to pick the right renderer, call it and return the result. We prefer many specialized classes instead of one big class that may render all types. The first gist uses composition to achieve that, which would be the usual way to do that. The dispatcher iterates, picks the right renderer, calls it and returns the value: https://gist.github.com/majorz/5342257 Cooperative inheritance is however also a great pick for such a use case. In this type of implementation if a subclass cannot do the job (render), it will delegate the work with super() to the next class in the mro chain. The usage of `__private` is justified here, since each subclass in the chain has to ensure that some of the used methods are its own internal methods: https://gist.github.com/majorz/5342262 And here in this second gist we can immediately spot that the __call__ method has exactly the same textual representation (although the bytecode and the runtime environment are different). If we have a hundred such classes, then we have to copy/paste this code a hundred times, which is not good (this is the problem I am trying to solve). Thus we need to provide more super() powers - the third gist. I illustrate the concept with importing from `__future__` a decorator called `teleport`, which will make sure that each subclass of a given superclass will have its own copy of a given method (which can access its own `__private` and `super()`): https://gist.github.com/majorz/5342271 Personally, I find such a solution beautiful. Even if it is not perfect, at least it is a valid attempt to solve a small but existing problem.
Of course, as an almost ten-year user of the language, I have to say such a case is something extremely rare to find, especially in Python 3, which is awesome :) Thanks, Zahari On Tue, Apr 9, 2013 at 2:32 AM, Zahari Petkov wrote: > Thanks for the answer. Now I realize that I picked a wrong example in > the email, and your argument is valid about it. However the code > repetition in the gist is a valid example; it cannot be resolved by > turning the variables/methods into single-underscore names. I actually > almost never use the private convention and I am aware of its purpose > and use cases (e.g. __update = update after definition). > > I will think a bit more about a different, less confusing example and write back. > > Best, > Zahari > > > On Tue, Apr 9, 2013 at 2:15 AM, Guido van Rossum wrote: >> You are misunderstanding __private. It is for use *within one class* >> only. For your purpose you should use single underscore (sometimes >> known as "protected"). >> >> On Mon, Apr 8, 2013 at 4:11 PM, Zahari Petkov wrote: >>> Hello everyone, >>> >>> In a certain implementation where I was doing some form of dispatch using >>> composition, I decided to try out cooperative inheritance with super. However, >>> I stumbled at something unexpected - there was nothing wrong with cooperative >>> inheritance with super, but the implementation started to have repetitive >>> code, which could not be easily resolved, because of how name mangling of >>> private attributes works. >>> >>> In short my proposal is for a language feature or mechanism, which will allow >>> automatic redefinition of methods through the descendants of a superclass >>> during compilation time - at the same time name unmangling is done.
In the >>> example below I use a decorator - just to illustrate the idea in a very simple >>> way: >>> >>> >>> class A: >>> __val = 'a' >>> >>> @propagate_through_descendants >>> def print_val(self): >>> print(self.__val) >>> >>> >>> class B(A): >>> __val = 'b' >>> >>> >>> b = B() >>> b.print_val() >>> >>> >>> Please, check the following gist for a bit more concrete example (short and >>> working) with cooperative inheritance: >>> >>> https://gist.github.com/majorz/5341333 >>> >>> The code duplication is very obvious and unavoidable (the two __call__ >>> methods have exactly the same code, which cannot be moved to the superclass). >>> I tried some metaprogramming, but even then it was hard to resolve, since >>> name unmangling seems to happen during complilation time as far as I >>> understand. >>> >>> Thanks, >>> Zahari >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> http://mail.python.org/mailman/listinfo/python-ideas >> >> >> >> -- >> --Guido van Rossum (python.org/~guido) From guido at python.org Tue Apr 9 04:13:02 2013 From: guido at python.org (Guido van Rossum) Date: Mon, 8 Apr 2013 19:13:02 -0700 Subject: [Python-ideas] Name mangling and code repetition (with cooperative inheritance use case) In-Reply-To: References: Message-ID: On Mon, Apr 8, 2013 at 6:54 PM, Zahari Petkov wrote: > Hello again, > > Unfortunatelly, the idea cannot be expressed in the simplest way, since this > leads to confusion, so I present three short gists and explain the case in this > email with a bit more words. I am writing it, since I am confident at least it > will be an interesting read. > > All of the three gists present a different implementation of a simple case > with 4 very simple classes involved. > > There are two classes which render instances of types `int` and `float` > (IntRenderer and FloatRenderer). 
Imagine similar classes that may render > any other given type - user classes, a list, etc., but for simplification > of the case I use only two classes. Each of those classes is instantiated with > a variable and is called to render this variable into a string. > > There is an abstract base class (Renderer), which provides the common > implementation for all the descendants and leaves the `__call__` to the be > implemented by the concrete classes. > > The last, fourth, class is a dispatcher, which has the sole purpose to pick > the right renderer, call it and return the result. We prefer many specialized > classes instead of one big class that may render all types. > > The first gist uses composition to achieve that, which would be the > usual way to do that. The dispatcher iterates, picks the right renderer, calls > it and returns the value: > > https://gist.github.com/majorz/5342257 > > Cooperative inheritance is however also a great pick for such a use case. In > this type of implementation if a subclass cannot do the job (render), it will > delegate the work with super() to the next class in the mro chain. The usage > of `__private` is justified here, No. You still misunderstand __private. I fear I cannot help you. Maybe someone on python-list can explain it better. > since each subclass in the chain has to > ensure that some of the used methods are his own internal methods: > > https://gist.github.com/majorz/5342262 > > And here in this second gist we can immediately spot that the __call__ method > has exactly the same textual representation (although the bytecode and the > runtime environment are different). If we have hundred such classes, then we > have to copy/paste this code hundred times, which is not good. (here I define > the problem trying to solve). > > Thus we need to provide more super() powers - the third gist. 
>>>> >>>> In short my proposal is for a language feature or mechanism, which will allow >>>> automatic redefinition of methods through the descendants of a superclass >>>> during compilation time - at the same time name unmangling is done. In the >>>> example below I use a decorator - just to illustrate the idea in a very simple >>>> way: >>>> >>>> >>>> class A: >>>> __val = 'a' >>>> >>>> @propagate_through_descendants >>>> def print_val(self): >>>> print(self.__val) >>>> >>>> >>>> class B(A): >>>> __val = 'b' >>>> >>>> >>>> b = B() >>>> b.print_val() >>>> >>>> >>>> Please, check the following gist for a bit more concrete example (short and >>>> working) with cooperative inheritance: >>>> >>>> https://gist.github.com/majorz/5341333 >>>> >>>> The code duplication is very obvious and unavoidable (the two __call__ >>>> methods have exactly the same code, which cannot be moved to the superclass). >>>> I tried some metaprogramming, but even then it was hard to resolve, since >>>> name unmangling seems to happen during complilation time as far as I >>>> understand. >>>> >>>> Thanks, >>>> Zahari >>>> _______________________________________________ >>>> Python-ideas mailing list >>>> Python-ideas at python.org >>>> http://mail.python.org/mailman/listinfo/python-ideas >>> >>> >>> >>> -- >>> --Guido van Rossum (python.org/~guido) -- --Guido van Rossum (python.org/~guido) From tjreedy at udel.edu Tue Apr 9 04:15:14 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Mon, 08 Apr 2013 22:15:14 -0400 Subject: [Python-ideas] Name mangling and code repetition (with cooperative inheritance use case) In-Reply-To: References: Message-ID: On 4/8/2013 9:54 PM, Zahari Petkov wrote: > Hello again, > > Unfortunatelly, the idea cannot be expressed in the simplest way, since this > leads to confusion, so I present three short gists and explain the case in this > email with a bit more words. I am writing it, since I am confident at least it > will be an interesting read. 
> > All of the three gists present a different implementation of a simple case > with 4 very simple classes involved. > > There are two classes which render instances of types `int` and `float` > (IntRenderer and FloatRenderer). Imagine similar classes that may render > any other given type - user classes, a list, etc., but for simplification > of the case I use only two classes. Each of those classes is instantiated with > a variable and is called to render this variable into a string. > > There is an abstract base class (Renderer), which provides the common > implementation for all the descendants and leaves the `__call__` to the be > implemented by the concrete classes. > > The last, fourth, class is a dispatcher, which has the sole purpose to pick > the right renderer, call it and return the result. We prefer many specialized > classes instead of one big class that may render all types. > > The first gist uses composition to achieve that, which would be the > usual way to do that. The dispatcher iterates, picks the right renderer, calls > it and returns the value: > > https://gist.github.com/majorz/5342257 > > Cooperative inheritance is however also a great pick for such a use case. In > this type of implementation if a subclass cannot do the job (render), it will > delegate the work with super() to the next class in the mro chain. If you inherit from 20 or 100 classes, each attribute access becomes a substantial linear search. I would forget inheritance and use a dict mapping class of the object being rendered to the data needed to render it. In your simple case, the dict would be {int:'INTEGER {}', float:'FLOAT {}'}. 
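A rough sketch of that dict-based dispatch (only the format strings come from the example above; the class name and error handling are illustrative assumptions):

```python
class DictDispatcher:
    # One flat mapping from type to format string replaces the chain of
    # renderer classes; lookup is a single dict access instead of an MRO walk.
    _formats = {int: 'INTEGER {}', float: 'FLOAT {}'}

    def render(self, value):
        try:
            fmt = self._formats[type(value)]
        except KeyError:
            raise TypeError('no renderer registered for %r' % type(value))
        return fmt.format(value)

d = DictDispatcher()
print(d.render(3))    # INTEGER 3
print(d.render(2.5))  # FLOAT 2.5
```

Note that the exact type(value) lookup deliberately ignores subclasses; a subclass that should render differently would simply get its own entry in the dict.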
The usage > of `__private` is justified here, since each subclass in the chain has to > ensure that some of the used methods are his own internal methods: > > https://gist.github.com/majorz/5342262 > > And here in this second gist we can immediately spot that the __call__ method > has exactly the same textual representation (although the bytecode and the > runtime environment are different). If we have hundred such classes, then we > have to copy/paste this code hundred times, which is not good. (here I define > the problem trying to solve). These two sentences suggest to me that this is the wrong approach. -- Terry Jan Reedy From jsbueno at python.org.br Tue Apr 9 04:32:17 2013 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Mon, 8 Apr 2013 23:32:17 -0300 Subject: [Python-ideas] Name mangling and code repetition (with cooperative inheritance use case) In-Reply-To: References: Message-ID: On 8 April 2013 23:15, Terry Jan Reedy wrote: > On 4/8/2013 9:54 PM, Zahari Petkov wrote: >> >> Hello again, >> >> Unfortunatelly, the idea cannot be expressed in the simplest way, since >> this >> leads to confusion, so I present three short gists and explain the case in >> this >> email with a bit more words. I am writing it, since I am confident at >> least it >> will be an interesting read. >> >> All of the three gists present a different implementation of a simple case >> with 4 very simple classes involved. >> >> There are two classes which render instances of types `int` and `float` >> (IntRenderer and FloatRenderer). Imagine similar classes that may render >> any other given type - user classes, a list, etc., but for simplification >> of the case I use only two classes. Each of those classes is instantiated >> with >> a variable and is called to render this variable into a string. 
>> There is an abstract base class (Renderer), which provides the common >> implementation for all the descendants and leaves the `__call__` to be >> implemented by the concrete classes. >> >> The last, fourth, class is a dispatcher, which has the sole purpose to >> pick >> the right renderer, call it and return the result. We prefer many >> specialized >> classes instead of one big class that may render all types. >> >> The first gist uses composition to achieve that, which would be the >> usual way to do that. The dispatcher iterates, picks the right renderer, >> calls >> it and returns the value: >> >> https://gist.github.com/majorz/5342257 >> >> Cooperative inheritance is however also a great pick for such a use case. >> In >> this type of implementation if a subclass cannot do the job (render), it >> will >> delegate the work with super() to the next class in the mro chain. > > If you inherit from 20 or 100 classes, each attribute access becomes a > substantial linear search. I would forget inheritance and use a dict mapping > class of the object being rendered to the data needed to render it. In your > simple case, the dict would be {int:'INTEGER {}', float:'FLOAT {}'}. Which can easily be combined with the "one render method per class" writing style you (Zahari) want with something just like: https://gist.github.com/jsbueno/5342460 From zarchaoz at gmail.com Tue Apr 9 04:39:19 2013 From: zarchaoz at gmail.com (Zahari Petkov) Date: Tue, 9 Apr 2013 05:39:19 +0300 Subject: [Python-ideas] Name mangling and code repetition (with cooperative inheritance use case) In-Reply-To: References: Message-ID: Right, I forgot to mention that ordering is important - this is why a priority sequence is needed, not a mapping. The case is that a subclass may need to be rendered in a different way than a superclass.
class VerySpecialInt(int): pass and class VerySpecialIntRenderer(Renderer): pass this needs to go before the IntRenderer in the mro chain, or at the front in the composition example. Otherwise the dict dispatch would be perfectly ok, and thanks for sharing this gist. On Tue, Apr 9, 2013 at 5:32 AM, Joao S. O. Bueno wrote: > On 8 April 2013 23:15, Terry Jan Reedy wrote: >> On 4/8/2013 9:54 PM, Zahari Petkov wrote: >>> >>> Hello again, >>> >>> Unfortunatelly, the idea cannot be expressed in the simplest way, since >>> this >>> leads to confusion, so I present three short gists and explain the case in >>> this >>> email with a bit more words. I am writing it, since I am confident at >>> least it >>> will be an interesting read. >>> >>> All of the three gists present a different implementation of a simple case >>> with 4 very simple classes involved. >>> >>> There are two classes which render instances of types `int` and `float` >>> (IntRenderer and FloatRenderer). Imagine similar classes that may render >>> any other given type - user classes, a list, etc., but for simplification >>> of the case I use only two classes. Each of those classes is instantiated >>> with >>> a variable and is called to render this variable into a string. >>> >>> There is an abstract base class (Renderer), which provides the common >>> implementation for all the descendants and leaves the `__call__` to the be >>> implemented by the concrete classes. >>> >>> The last, fourth, class is a dispatcher, which has the sole purpose to >>> pick >>> the right renderer, call it and return the result. We prefer many >>> specialized >>> classes instead of one big class that may render all types. >>> >>> The first gist uses composition to achieve that, which would be the >>> usual way to do that. 
The dispatcher iterates, picks the right renderer, >>> calls >>> it and returns the value: >>> >>> https://gist.github.com/majorz/5342257 >>> >>> Cooperative inheritance is however also a great pick for such a use case. >>> In >>> this type of implementation if a subclass cannot do the job (render), it >>> will >>> delegate the work with super() to the next class in the mro chain. >> >> >> If you inherit from 20 or 100 classes, each attribute access becomes a >> substantial linear search. I would forget inheritance and use a dict mapping >> class of the object being rendered to the data needed to render it. In your >> simple case, the dict would be {int:'INTEGER {}', float:'FLOAT {}'}. > > Which can easily be combined with the "one render method per class" > writing style you (Zahari) want with something just like: > https://gist.github.com/jsbueno/5342460 > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From jsbueno at python.org.br Tue Apr 9 05:44:13 2013 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Tue, 9 Apr 2013 00:44:13 -0300 Subject: [Python-ideas] Name mangling and code repetition (with cooperative inheritance use case) In-Reply-To: References: Message-ID: I see - in that case, you can have a class decorator, or even a function as a metaclass, that creates a new _dispatch table for every subclass - if you are using Python 3.3 you can even use a collections.ChainMap for that to avoid duplicating the dictionary data once for each class.
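A minimal sketch of that suggestion, using a class decorator and ChainMap.new_child() (the decorator name and the _dispatch attribute are illustrative, not code from the gists):

```python
from collections import ChainMap

def own_dispatch(cls):
    # Give each decorated class its own _dispatch table chained onto the
    # parent's, so inherited entries are shared rather than copied.
    parent = getattr(cls, '_dispatch', None)
    cls._dispatch = ChainMap() if parent is None else parent.new_child()
    return cls

@own_dispatch
class Renderer:
    pass

@own_dispatch
class IntRenderer(Renderer):
    pass

Renderer._dispatch['base'] = 'shared entry'
IntRenderer._dispatch['int'] = 'local entry'

print(IntRenderer._dispatch['base'])  # found through the chained parent map
print('int' in Renderer._dispatch)    # False: child writes stay local
```

Writes to a ChainMap always go to its first map, so entries added on a subclass never leak back into the parent, while parent entries remain visible to every child.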
I actually got such an example with decorators working before I read Terry's message and thought a single, root-class-based dict could cover all use cases: https://gist.github.com/jsbueno/6f14167b8c4631d0baa7 Using a metaclass instead of a decorator would free you from decorating every class - and it could just be a function receiving (name, bases, dct) as parameters that would call "type" instead of a "real" metaclass, since the changes can be made after the class is created by a plain call to "type". js -><- On 8 April 2013 23:39, Zahari Petkov wrote: > Right, I forgot to mention that ordering is important - this is why a > priority sequence is needed, > not a mapping. > > The case is that a subclass may need to be rendered in a different way > than a superclass. > > class VerySpecialInt(int): > pass > > and > > class VerySpecialIntRenderer(Renderer): > pass > > this needs to go before the IntRenderer in the mro chain, or at the > front in the composition example. > > Otherwise the dict dispatch would be perfectly ok, and thanks for > sharing this gist. > > > On Tue, Apr 9, 2013 at 5:32 AM, Joao S. O. Bueno wrote: >> On 8 April 2013 23:15, Terry Jan Reedy wrote: >>> On 4/8/2013 9:54 PM, Zahari Petkov wrote: >>>> >>>> Hello again, >>>> >>>> Unfortunately, the idea cannot be expressed in the simplest way, since >>>> this >>>> leads to confusion, so I present three short gists and explain the case in >>>> this >>>> email with a bit more words. I am writing it, since I am confident at >>>> least it >>>> will be an interesting read. >>>> >>>> All of the three gists present a different implementation of a simple case >>>> with 4 very simple classes involved. >>>> >>>> There are two classes which render instances of types `int` and `float` >>>> (IntRenderer and FloatRenderer). Imagine similar classes that may render >>>> any other given type - user classes, a list, etc., but for simplification >>>> of the case I use only two classes.
>> >> Which can easily be combined with the "one render method per class" >> writing style you (Zahari) want with something just like: >> https://gist.github.com/jsbueno/5342460 >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas From zarchaoz at gmail.com Tue Apr 9 06:24:10 2013 From: zarchaoz at gmail.com (Zahari Petkov) Date: Tue, 9 Apr 2013 07:24:10 +0300 Subject: [Python-ideas] Name mangling and code repetition (with cooperative inheritance use case) In-Reply-To: References: Message-ID: You give me here some very nice ideas, which I am going to explore more tomorrow, including trying out ChainMap. Thanks for working on this and suggesting those. Back on the original topic, as I understand from Guido's answers the `__private` attributes have a very specific purpose (e.g. the one explained in the tutorial) and they are not intended as a generic tool like other language features are. Although my code is working correctly, since I am going outside of the intended boundaries, it is normal to hit some rough corners. That invalidates my original idea about eliminating the duplicated code case, since it is the result of a wrong presumption - that '__private' attributes are a generic tool. Probably no need to continue further with the `__private` discussion. It has little practical use anyway. Thanks again guys, Zahari On Tue, Apr 9, 2013 at 6:44 AM, Joao S. O. Bueno wrote: > I see - > in that case, you can have a class decorator, or even a function as a > metaclass, that creates a new _dispatch table for every subclass - If > you are using Python 3.3 you can even use a collections.ChainMap for > that to avoid duplicating the dictionary data once for each class.
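A minimal sketch of that decorator-plus-ChainMap arrangement — the class and attribute names here are made up for illustration, not taken from any of the gists:

```python
from collections import ChainMap

def with_own_templates(cls):
    # Hypothetical class decorator: chain a fresh dict in front of the
    # inherited table, so per-class overrides stay local while the shared
    # entries are referenced rather than copied.
    cls._templates = cls._templates.new_child()
    return cls

class Renderer:
    # Terry's dict idea: map rendered type -> format template.
    _templates = ChainMap({int: 'INTEGER {}', float: 'FLOAT {}'})

    def __init__(self, value):
        self._value = value

    def __call__(self):
        # Walking the *value's* MRO keeps subclass ordering right:
        # a VerySpecialInt entry would win over the plain int entry.
        for rendered_type in type(self._value).__mro__:
            template = self._templates.get(rendered_type)
            if template is not None:
                return template.format(self._value)
        raise TypeError('cannot render %r' % type(self._value))

@with_own_templates
class SpecialRenderer(Renderer):
    pass

SpecialRenderer._templates[int] = 'SPECIAL INTEGER {}'
```

With `new_child()`, the override in SpecialRenderer shadows the base entry without ever touching Renderer's own table.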
> > I actually got such an example with decorators working before I read > Terry's message and thought a single - root-class based dict could > cover all use cases: > > https://gist.github.com/jsbueno/6f14167b8c4631d0baa7 > > Using a metaclass instead of a decorator would free you from > decorating every class - > and it could just be a function receiving (name, bases, dct) as > parameters that would call "type" instead of a "real" metaclass, since > the changes can be made after the class is created by a plain call to > "type". > > js > -><- > > > > On 8 April 2013 23:39, Zahari Petkov wrote: >> Right, I forgot to mention that ordering is important - this is why a >> priority sequence is needed, >> not a mapping. >> >> The case is that a subclass may need to be rendered in a different way >> than a superclass. >> >> class VerySpecialInt(int): >> pass >> >> and >> >> class VerySpecialIntRenderer(Renderer): >> pass >> >> this needs to go before the IntRenderer in the mro chain, or at the >> front in the composition example. >> >> Otherwise the dict dispatch would be perfectly ok, and thanks for >> sharing this gist. >> >> >> On Tue, Apr 9, 2013 at 5:32 AM, Joao S. O. Bueno wrote: >>> On 8 April 2013 23:15, Terry Jan Reedy wrote: >>>> On 4/8/2013 9:54 PM, Zahari Petkov wrote: >>>>> >>>>> Hello again, >>>>> >>>>> Unfortunatelly, the idea cannot be expressed in the simplest way, since >>>>> this >>>>> leads to confusion, so I present three short gists and explain the case in >>>>> this >>>>> email with a bit more words. I am writing it, since I am confident at >>>>> least it >>>>> will be an interesting read. >>>>> >>>>> All of the three gists present a different implementation of a simple case >>>>> with 4 very simple classes involved. >>>>> >>>>> There are two classes which render instances of types `int` and `float` >>>>> (IntRenderer and FloatRenderer). 
Imagine similar classes that may render >>>>> any other given type - user classes, a list, etc., but for simplification >>>>> of the case I use only two classes. Each of those classes is instantiated >>>>> with >>>>> a variable and is called to render this variable into a string. >>>>> >>>>> There is an abstract base class (Renderer), which provides the common >>>>> implementation for all the descendants and leaves the `__call__` to the be >>>>> implemented by the concrete classes. >>>>> >>>>> The last, fourth, class is a dispatcher, which has the sole purpose to >>>>> pick >>>>> the right renderer, call it and return the result. We prefer many >>>>> specialized >>>>> classes instead of one big class that may render all types. >>>>> >>>>> The first gist uses composition to achieve that, which would be the >>>>> usual way to do that. The dispatcher iterates, picks the right renderer, >>>>> calls >>>>> it and returns the value: >>>>> >>>>> https://gist.github.com/majorz/5342257 >>>>> >>>>> Cooperative inheritance is however also a great pick for such a use case. >>>>> In >>>>> this type of implementation if a subclass cannot do the job (render), it >>>>> will >>>>> delegate the work with super() to the next class in the mro chain. >>>> >>>> >>>> If you inherit from 20 or 100 classes, each attribute access becomes a >>>> substantial linear search. I would forget inheritance and use a dict mapping >>>> class of the object being rendered to the data needed to render it. In your >>>> simple case, the dict would be {int:'INTEGER {}', float:'FLOAT {}'}. 
>>> Which can easily be combined with the "one render method per class" >>> writing style you (Zahari) want with something just like: >>> https://gist.github.com/jsbueno/5342460 >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> http://mail.python.org/mailman/listinfo/python-ideas From ncoghlan at gmail.com Tue Apr 9 07:43:11 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Tue, 9 Apr 2013 15:43:11 +1000 Subject: [Python-ideas] Name mangling and code repetition (with cooperative inheritance use case) In-Reply-To: References: Message-ID: On Tue, Apr 9, 2013 at 2:24 PM, Zahari Petkov wrote: > You give me here some very nice ideas, which I am going to explore > more tomorrow, including trying out ChainMap. Thanks for working on > this and suggesting those. > > Back on the original topic, as I understand from Guido's answers > the `__private` attributes have a very specific purpose (e.g. the one > explained in the tutorial) and they are not intended as a generic tool > like other language features are. Although my code is working > correctly, since I am going outside of the intended boundaries, it is > normal to hit some rough corners. That invalidates my original idea > about eliminating the duplicated code case, since it is the result of a > wrong presumption - that '__private' attributes are a generic tool. > Probably no need to continue further with the `__private` discussion. > It has little practical use anyway. If you would like to use the MRO of the subclass to walk a priority list in a specific order, just do that, instead of trying to use super() to do it recursively.
The base class implementation might look something like:

class Render:
    def __init__(self, entity):
        self._entity = entity
        for cls in self.__class__.mro():
            if isinstance(self._entity, cls._render_type):
                self._render_entity = cls._render
                break
        else:
            raise TypeError("Cannot render %r instance" % type(entity))

    def __call__(self):
        self._render_entity(self)

You could combine something like this with a more conventional system that accepted a list of renderers. This all still sounds more complicated than seems wise, but iterating over the MRO explicitly is a useful alternative once you start hitting the limits of super(). Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From shane at umbrellacode.com Tue Apr 9 09:16:45 2013 From: shane at umbrellacode.com (Shane Green) Date: Tue, 9 Apr 2013 00:16:45 -0700 Subject: [Python-ideas] Name mangling and code repetition (with cooperative inheritance use case) In-Reply-To: References: Message-ID: <9019AD95-DA0B-483B-90A8-4270DC5E5194@umbrellacode.com> """Cooperative inheritance is however also a great pick for such a use case. In this type of implementation if a subclass cannot do the job (render), it will delegate the work with super() to the next class in the mro chain. The usage of `__private` is justified here, since each subclass in the chain has to ensure that some of the used methods are its own internal methods:""" This isn't normally what name mangling is used for; it's used to hide private properties so they cannot be accessed, and are basically invisible, outside the context of that class definition. Methods can be accessed as a property of the class any time their origin needs to be restricted. Sent from my iPad On Apr 8, 2013, at 6:54 PM, Zahari Petkov wrote: > > Cooperative inheritance is however also a great pick for such a use case. In > this type of implementation if a subclass cannot do the job (render), it will > delegate the work with super() to the next class in the mro chain.
The usage > of `__private` is justified here, since each subclass in the chain has to > ensure that some of the used methods are his own internal methods: From zarchaoz at gmail.com Tue Apr 9 13:10:27 2013 From: zarchaoz at gmail.com (Zahari Petkov) Date: Tue, 9 Apr 2013 14:10:27 +0300 Subject: [Python-ideas] Name mangling and code repetition (with cooperative inheritance use case) In-Reply-To: References: Message-ID: On Tue, Apr 9, 2013 at 8:43 AM, Nick Coghlan wrote: > On Tue, Apr 9, 2013 at 2:24 PM, Zahari Petkov wrote: >> You give me here some very nice ideas, which I am going to explore >> more tomorrow, including trying out ChainMap. Thanks for working on >> this and suggesting those. >> >> Back on the original topic, as I understand from the Guido's answers >> the `__private` attributes have a very specific purpose (e.g. the one >> explained in the tutorial) and it is not intended as a generic tool >> like other language features are. Although my code is working >> correctly, since I am going outside of the indented boundaries, it is >> normal to hit some rough corners. That invalidates my original idea >> about eliminating the duplicated code case, since it is a result of >> wrong presumption - that '__private' attributes is a generic tool. >> Probably no need to continue further with the `__private` discussion. >> It is anyway with little practical use. > > If you would like to use the MRO of the subclass to walk a priority > list in a specific order, just do that, instead of trying to use > super() to do it recursively. 
>
> The base class implementation might look something like:
>
> class Render:
>     def __init__(self, entity):
>         self._entity = entity
>         for cls in self.__class__.mro():
>             if isinstance(self._entity, cls._render_type):
>                 self._render_entity = cls._render
>                 break
>         else:
>             raise TypeError("Cannot render %r instance" % type(entity))
>
>     def __call__(self):
>         self._render_entity(self)
>
> You could combine something like this with a more conventional system > that accepted a list of renderers. > > This all still sounds more complicated than seems wise, but iterating > over the MRO explicitly is a useful alternative once you start hitting > the limits of super(). > > Cheers, > Nick. > Your solution is brilliant in a way. It definitely balances well among the implementations I did and resolves the core of the problem extremely well. I modified it a bit and pasted it in a new gist: https://gist.github.com/majorz/5344793 Of course whether to use the MRO for this type of problem is a different issue, but I have to conclude that I like it :) Best, Zahari From wolfgang.maier at biologie.uni-freiburg.de Tue Apr 9 18:46:31 2013 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Tue, 9 Apr 2013 18:46:31 +0200 Subject: [Python-ideas] itertools.chunks() In-Reply-To: References: <004301ce3373$9215c940$b6415bc0$@biologie.uni-freiburg.de> Message-ID: <009601ce3541$c9ee0a10$5dca1e30$@biologie.uni-freiburg.de> >Also, here's a version of the same from my own code (modified a little) that uses islice instead of zip_longest.
I haven't timed it but it was intended to be fast for large chunk sizes and I'd be interested to know how it compares:

>from itertools import islice
>
>def chunked(iterable, size, **kwargs):
>    '''Breaks an iterable into chunks
>
>    Usage:
>    >>> list(chunked('qwertyuiop', 3))
>    [['q', 'w', 'e'], ['r', 't', 'y'], ['u', 'i', 'o'], ['p']]
>
>    >>> list(chunked('qwertyuiop', 3, fillvalue=None))
>    [['q', 'w', 'e'], ['r', 't', 'y'], ['u', 'i', 'o'], ['p', None, None]]
>
>    >>> list(chunked('qwertyuiop', 3, strict=True))
>    Traceback (most recent call last):
>        ...
>    ValueError: Invalid chunk size
>    '''
>    list_, islice_ = list, islice
>    iterator = iter(iterable)
>
>    chunk = list_(islice_(iterator, size))
>    while len(chunk) == size:
>        yield chunk
>        chunk = list_(islice_(iterator, size))
>
>    if not chunk:
>        return
>    elif kwargs.get('strict', False):
>        raise ValueError('Invalid chunk size')
>    elif 'fillvalue' in kwargs:
>        yield chunk + (size - len(chunk)) * [kwargs['fillvalue']]
>    else:
>        yield chunk

Hi there, I have now compared your code for chunked (thanks a lot for sharing it!!) with Peter's strict_grouper using
As a reminder, here's the strict_grouper code again: def strict_grouper(items, size, strict): fillvalue = object() args = [iter(items)]*size chunks = zip_longest(*args, fillvalue=fillvalue) prev = next(chunks) for chunk in chunks: yield prev prev = chunk if prev[-1] is fillvalue: if strict: raise ValueError else: while prev[-1] is fillvalue: prev = prev[:-1] yield prev and here's the code I used for timing: from timeit import timeit results = [] results2 = [] # specify a range of test conditions conds=((100,1),(100,10),(100,80),(1000,1),(1000,10),(1000,100),(1000,800)) # run chunked under the different conditions for cond in conds: r=timeit(stmt='d=[i for i in chunked(range(cond[0]),cond[1])]', \ setup='from __main__ import chunked, cond', number=10000) results.append((cond, r)) # same for strict_grouper for cond in conds: r=timeit(stmt= \ 'd=[i for i in strict_grouper(range(cond[0]),cond[1], strict=False)]', \ setup='from __main__ import strict_grouper, cond', number=10000) results2.append((cond, r)) the results I got were: # the chunked results: [((100, 1), 2.197788960464095), ((100, 10), 0.27306091885475325), ((100, 80), 0.1232851640888839), ((1000, 1), 21.86202648707149), ((1000, 10), 2.47093215096902), ((1000, 100), 0.9069762837680173), ((1000, 800), 0.6114090097580629)] the strict_grouper results: [((100, 1), 0.31356012737705896), ((100, 10), 0.10581013815499318), ((100, 80), 0.45853288974103634), ((1000, 1), 2.5020897878439428), ((1000, 10), 0.6703603850128275), ((1000, 100), 0.5088070259098458), ((1000, 800), 19.14092429336597)] Two things are obvious from this: 1) Peter's solution is usually faster, sometimes a lot, but 2) it performs very poorly when it has to yield a truncated last group of items, like in the range(1000), 800 case. I guess this is true only for the strict=False case (with strict=True it would raise the error instantaneously) because I put in Peter's cautious while loop to trim off the fillvalues. 
If this was fixed, then I guess strict_grouper would be preferable under pretty much any condition. Best, Wolfgang From oscar.j.benjamin at gmail.com Tue Apr 9 21:19:50 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Tue, 9 Apr 2013 20:19:50 +0100 Subject: [Python-ideas] itertools.chunks() In-Reply-To: <009601ce3541$c9ee0a10$5dca1e30$@biologie.uni-freiburg.de> References: <004301ce3373$9215c940$b6415bc0$@biologie.uni-freiburg.de> <009601ce3541$c9ee0a10$5dca1e30$@biologie.uni-freiburg.de> Message-ID: On 9 April 2013 17:46, Wolfgang Maier wrote: > > Hi there, > I have now compared your code for chunked (thanks a lot for sharing it!!) > with Peter's strict_grouper using > timeit. Thanks. I also ran your code using different conditions like 100000 elements in chunks of 1024 because that's the sort of situation I was interested in. strict_grouper is faster for the bulk of the iteration but is not very fast at the end when the chunk size is large and the last chunk has fill values.

> As a reminder, here's the strict_grouper code again:
>
> def strict_grouper(items, size, strict):
>     fillvalue = object()
>     args = [iter(items)]*size
>     chunks = zip_longest(*args, fillvalue=fillvalue)
>     prev = next(chunks)
>
>     for chunk in chunks:
>         yield prev
>         prev = chunk
>
>     if prev[-1] is fillvalue:
>         if strict:
>             raise ValueError
>         else:

This code is the cause of the slow end performance:

>             while prev[-1] is fillvalue:
>                 prev = prev[:-1]

I think where I was assuming large chunk sizes, Peter was assuming small chunk sizes as this is quadratic in the chunk size. If you change these lines to

n = len(prev)-1
while prev[n] is fillvalue:
    n -= 1
del prev[n+1:]

yield prev

then it will probably be as fast or faster than the one I posted in pretty much all cases.
Oscar From wolfgang.maier at biologie.uni-freiburg.de Tue Apr 9 22:11:29 2013 From: wolfgang.maier at biologie.uni-freiburg.de (Wolfgang Maier) Date: Tue, 09 Apr 2013 22:11:29 +0200 Subject: [Python-ideas] itertools.chunks() In-Reply-To: References: <004301ce3373$9215c940$b6415bc0$@biologie.uni-freiburg.de> <009601ce3541$c9ee0a10$5dca1e30$@biologie.uni-freiburg.de> Message-ID: On Tue, 9 Apr 2013 20:19:50 +0100 Oscar Benjamin wrote: > Thanks. I also ran your code using different conditions like 100000 > elements in chunks of 1024 because that's the sort of situation I was > interested in. strict_grouper is faster for the bulk of the iteration > but is not very fast at the end when the chunk size is large and the > last chunk has fill values.

> This code is the cause of the slow end performance:
>
>     while prev[-1] is fillvalue:
>         prev = prev[:-1]
>
> I think where I was assuming large chunk sizes, Peter was assuming > small chunk sizes as this is quadratic in the chunk size.

Yes, and this is also my major use case. It's funny how apparently different problems lead people to similar questions. My most important use of strict_grouper now is for GB-size files with logical units composed of only a handful of lines, and it's doing a great job there.

> If you change these lines to
>
> n = len(prev)-1
> while prev[n] is fillvalue:
>     n -= 1
> del prev[n+1:]

almost, but strict_grouper returns tuples, so you can't use del here; instead:

prev = prev[:n+1]

> yield prev
>
> then it will probably be as fast or faster than the one I posted in > pretty much all cases.

Right, I just spotted this one weakness in Peter's code when I ran the comparisons to your code, and figured out the almost identical solution above, which works for tuples.
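Combining the two observations above — Oscar's linear-time trim and the fact that zip_longest yields tuples — a sketch of the adjusted generator could look like this (an illustration of the combined fix, not the exact code either of us is running; the empty-input guard is an extra safety addition):

```python
from itertools import zip_longest

def strict_grouper(items, size, strict=False):
    fillvalue = object()  # unique sentinel, cannot collide with real data
    args = [iter(items)] * size
    chunks = zip_longest(*args, fillvalue=fillvalue)
    prev = next(chunks, None)
    if prev is None:      # empty input: nothing to yield
        return
    for chunk in chunks:
        yield prev
        prev = chunk
    if prev[-1] is fillvalue:
        if strict:
            raise ValueError('last group is incomplete')
        # Walk back once, then slice once: O(size) instead of the
        # O(size**2) repeated re-slicing, which is what hurt the
        # range(1000), 800 case above.
        n = len(prev) - 1
        while prev[n] is fillvalue:
            n -= 1
        prev = prev[:n + 1]
    yield prev
```

For example, `list(strict_grouper(range(10), 3))` gives three full triples and a final `(9,)`.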
Wolfgang From b.petrushev at gmail.com Tue Apr 9 22:21:03 2013 From: b.petrushev at gmail.com (Blagoj Petrushev) Date: Tue, 9 Apr 2013 22:21:03 +0200 Subject: [Python-ideas] Proposal: itertools.batch Message-ID: Hello, I have an idea for a new function in the itertools module. I've been using this pattern quite a lot, so maybe someone else would think it is useful as well. The purpose is to split an iterable into batches of fixed size, and each yielded batch should be an iterator as well.

def batch(iterable, batch_size):
    exhausted = False
    batch_range = range(batch_size)
    while not exhausted:
        def current():
            nonlocal exhausted
            for _ in batch_range:
                try:
                    yield next(iterable)
                except StopIteration:
                    exhausted = True
        yield current()

There are problems with this implementation:
- the use of try/except is overkill (the exception is raised only once, so maybe it's not that scary)
- it goes on forever if the batches are not actually consumed
- it yields an additional empty iterator if the original iterable's length is an exact multiple of batch_size.

Here is a simplified version which yields batches as tuples (this is the variation I use in practice):

def batch_tuples(iterable, batch_size):
    while True:
        batch_ = tuple(itertools.islice(iterable, 0, batch_size))
        if len(batch_) == 0:
            break
        yield batch_

This is my first proposal, and first email on this list, so take it easy on me :) petrushev --------- https://github.com/petrushev From oscar.j.benjamin at gmail.com Tue Apr 9 23:49:29 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Tue, 9 Apr 2013 22:49:29 +0100 Subject: [Python-ideas] Proposal: itertools.batch In-Reply-To: References: Message-ID: On 9 April 2013 21:21, Blagoj Petrushev wrote: > Hello, > > I have an idea for a new function in the itertools module. I've been > using this pattern quite a lot, so maybe someone else would think it > is useful as well.
> > The purpose is to split an iterable into batches of fixed size, and > each yielded batch should be an iterator as well. Are you aware of other threads on this list discussing groupers and batchers and so on?

> def batch(iterable, batch_size):
>     exhausted = False
>     batch_range = range(batch_size)
>     while not exhausted:
>         def current():
>             nonlocal exhausted
>             for _ in batch_range:
>                 try:
>                     yield next(iterable)
>                 except StopIteration:
>                     exhausted = True
>         yield current()
>
> There are problems with this implementation:
> - the use of try/except is overkill (the exception is raised only once, so maybe it's not that scary)
> - it goes on forever if the batches are not actually consumed

What would you want it to do in this case?

> - it yields an additional empty iterator if the original iterable's > length is an exact multiple of batch_size.

This version solves the last issue and might be more efficient in general:

from operator import itemgetter
from itertools import islice, chain, tee, imap, izip

def batch(iterable, batch_size):
    done = []
    def stop():
        done.append(None)
        yield
    it1, it2 = tee(chain(iterable, stop()))
    next(it2)  # Allow StopIteration to propagate
    iterator = imap(itemgetter(0), izip(it1, it2))
    while not done:
        yield islice(iterator, batch_size)

Oscar From solipsis at pitrou.net Wed Apr 10 10:41:18 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Wed, 10 Apr 2013 10:41:18 +0200 Subject: [Python-ideas] itertools.chunks() References: Message-ID: <20130410104118.511d4986@pitrou.net> Le Sat, 6 Avr 2013 14:50:16 +0200, Giampaolo Rodolà a écrit :

> def chunks(total, step):
>     assert total >= step
>     while total > step:
>         yield step;
>         total -= step;
>     if total:
>         yield total
>
> >>> chunks(12, 4)
> [4, 4, 4]
> >>> chunks(13, 4)
> [4, 4, 4, 1]
>
> I'm not sure how appropriate "chunks" is as a name for such a > function.
Anyway, I wrote that because in a unit test I had to create a file of a precise size, like this:

> FILESIZE = (10 * 1024 * 1024) + 423 # 10MB and 423 bytes
> with open(TESTFN, 'wb') as f:
>     for csize in chunks(FILESIZE, 262144):
>         f.write(b'x' * csize)

This doesn't sound very useful to me, actually. range() already does what you want, except for the "last chunk" thing. Regards Antoine. From ram.rachum at gmail.com Fri Apr 12 00:24:00 2013 From: ram.rachum at gmail.com (Ram Rachum) Date: Thu, 11 Apr 2013 15:24:00 -0700 (PDT) Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions Message-ID: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> I often want to sort objects by an attribute. It's cumbersome to do this:

sorted(entries, key=lambda entry: entry.datetime_created)

Why not allow this instead:

sorted(entries, key='datetime_created')

The `sorted` function can check whether the `key` argument is a string, and if so do an attribute lookup. Since I see no other possible use of a string input to `key`, I don't see how this feature would harm anyone. What do you think? Thanks, Ram. -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Fri Apr 12 00:35:20 2013 From: carl at oddbird.net (Carl Meyer) Date: Thu, 11 Apr 2013 16:35:20 -0600 Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions In-Reply-To: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> Message-ID: <51673AA8.3070407@oddbird.net> On 04/11/2013 04:24 PM, Ram Rachum wrote: > I often want to sort objects by an attribute. It's cumbersome to do this: > > sorted(entries, key=lambda entry: entry.datetime_created) > > Why not allow this instead: > > sorted(entries, key='datetime_created')
Explicit utility functions are better than implicit special-case behaviors. Why should a string be special-cased to attribute lookup rather than, say, __getitem__ lookup? Carl From ram.rachum at gmail.com Fri Apr 12 00:52:57 2013 From: ram.rachum at gmail.com (Ram Rachum) Date: Thu, 11 Apr 2013 15:52:57 -0700 (PDT) Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions In-Reply-To: <51673AA8.3070407@oddbird.net> References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> <51673AA8.3070407@oddbird.net> Message-ID: On Friday, April 12, 2013 1:35:20 AM UTC+3, Carl Meyer wrote: > On 04/11/2013 04:24 PM, Ram Rachum wrote: > > I often want to sort objects by an attribute. It's cumbersome to do > this: > > > > sorted(entries, key=lambda entry: entry.datetime_created) > > > > Why not allow this instead: > > > > sorted(entries, key='datetime_created') > > from operator import attrgetter > sorted(entries, key=attrgetter('datetime_created')) > > You can alias attrgetter to an even shorter name if you like. > That's still cumbersome in my opinion. > > Explicit utility functions are better than implicit special-case > behaviors. Why should a string be special-cased to attribute lookup rather than, say, __getitem__ lookup? > Right, these are options too. I'd guess that attribute lookup is more common, but maybe I'm wrong. > > Carl > _______________________________________________ > Python-ideas mailing list > Python... at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From donald at stufft.io Fri Apr 12 00:54:50 2013 From: donald at stufft.io (Donald Stufft) Date: Thu, 11 Apr 2013 18:54:50 -0400 Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions In-Reply-To: References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> <51673AA8.3070407@oddbird.net> Message-ID: On Apr 11, 2013, at 6:52 PM, Ram Rachum wrote: > On Friday, April 12, 2013 1:35:20 AM UTC+3, Carl Meyer wrote: > On 04/11/2013 04:24 PM, Ram Rachum wrote: > > I often want to sort objects by an attribute. It's cumbersome to do this: > > > > sorted(entries, key=lambda entry: entry.datetime_created) > > > > Why not allow this instead: > > > > sorted(entries, key='datetime_created') > > from operator import attrgetter > sorted(entries, key=attrgetter('datetime_created')) > > You can alias attrgetter to an even shorter name if you like. > > That's still cumbersome in my opinion. > > > Explicit utility functions are better than implicit special-case > behaviors. Why should a string be special-cased to attribute lookup > rather than, say, __getitem__ lookup? > > Right, these are options too. I'd guess that attribute lookup is more common, but maybe I'm wrong. > > > Carl > _______________________________________________ > Python-ideas mailing list > Python... at python.org > http://mail.python.org/mailman/listinfo/python-ideas > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas Special cases aren't special enough to break the rules. ----------------- Donald Stufft PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 841 bytes Desc: Message signed with OpenPGP using GPGMail URL: From oscar.j.benjamin at gmail.com Fri Apr 12 01:05:38 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Fri, 12 Apr 2013 00:05:38 +0100 Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions In-Reply-To: References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> <51673AA8.3070407@oddbird.net> Message-ID: On 11 April 2013 23:52, Ram Rachum wrote: > On Friday, April 12, 2013 1:35:20 AM UTC+3, Carl Meyer wrote: >> >> On 04/11/2013 04:24 PM, Ram Rachum wrote: >> > I often want to sort objects by an attribute. It's cumbersome to do >> > this: >> > >> > sorted(entries, key=lambda entry: entry.datetime_created) >> > >> > Why not allow this instead: >> > >> > sorted(entries, key='datetime_created') >> >> from operator import attrgetter >> sorted(entries, key=attrgetter('datetime_created')) >> >> You can alias attrgetter to an even shorter name if you like. > > That's still cumbersome in my opinion. I don't think it's that cumbersome. Leaving aside the import line you're only having to specify two things for your key function: that it's an attribute (attrgetter) and the name of the attribute ('datetime_created'). It's not possible for this to be any more succinct without using special case implicit rules which are generally a bad thing. I like the fact that the API for the sorted function is so simple I can remember all of its arguments and exactly what they do without ever needing to look it up. 
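Concretely, a short sketch of the attrgetter spelling — the Entry class here is made up for illustration, standing in for whatever objects are being sorted:

```python
from datetime import datetime
from operator import attrgetter

class Entry:
    # Hypothetical stand-in for the objects being sorted.
    def __init__(self, name, created):
        self.name = name
        self.datetime_created = created

entries = [Entry('c', datetime(2013, 4, 10)),
           Entry('b', datetime(2013, 4, 12)),
           Entry('a', datetime(2013, 4, 10))]

# The use case from the thread: the key is explicitly "attribute lookup".
by_date = sorted(entries, key=attrgetter('datetime_created'))

# attrgetter also accepts several names and returns tuples,
# so compound sort keys come for free:
by_date_then_name = sorted(entries, key=attrgetter('datetime_created', 'name'))
```

Note the single-attribute sort is stable (ties keep input order), while the two-attribute key breaks ties by name.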
Oscar From haoyi.sg at gmail.com Fri Apr 12 01:33:10 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Thu, 11 Apr 2013 19:33:10 -0400 Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions In-Reply-To: References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> <51673AA8.3070407@oddbird.net> Message-ID: A more generic and useful thing would be kind of what scala/groovy have: shorthands for defining function literals: Groovy: myList.sort{it.startTime} Scala: myList.sort(_.startTime) Where "_.startTime" and "it.startTime" are shorthand for "x => x.startTime" or python's "lambda x: x.startTime". You could probably get something similar in python: sorted(entries, key = x.datetime_created) if you did some magic with x to make looking up an attribute return a lambda that returns that attribute of its argument. -Haoyi On Thu, Apr 11, 2013 at 7:05 PM, Oscar Benjamin wrote: > On 11 April 2013 23:52, Ram Rachum wrote: > > On Friday, April 12, 2013 1:35:20 AM UTC+3, Carl Meyer wrote: > >> > >> On 04/11/2013 04:24 PM, Ram Rachum wrote: > >> > I often want to sort objects by an attribute. It's cumbersome to do > >> > this: > >> > > >> > sorted(entries, key=lambda entry: entry.datetime_created) > >> > > >> > Why not allow this instead: > >> > > >> > sorted(entries, key='datetime_created') > >> > >> from operator import attrgetter > >> sorted(entries, key=attrgetter('datetime_created')) > >> > >> You can alias attrgetter to an even shorter name if you like. > > > > That's still cumbersome in my opinion. > > I don't think it's that cumbersome. Leaving aside the import line > you're only having to specify two things for your key function: that > it's an attribute (attrgetter) and the name of the attribute > ('datetime_created'). It's not possible for this to be any more > succinct without using special case implicit rules which are generally > a bad thing. 
I like the fact that the API for the sorted function is > so simple I can remember all of its arguments and exactly what they do > without ever needing to look it up. > > > Oscar > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ram.rachum at gmail.com Fri Apr 12 01:38:28 2013 From: ram.rachum at gmail.com (Ram Rachum) Date: Fri, 12 Apr 2013 02:38:28 +0300 Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions In-Reply-To: References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> <51673AA8.3070407@oddbird.net> Message-ID: Interesting! On Fri, Apr 12, 2013 at 2:33 AM, Haoyi Li wrote: > A more generic and useful thing would be kind of what scala/groovy have: > shorthands for defining function literals: > > Groovy: > myList.sort{it.startTime} > > Scala: > myList.sort(_.startTime) > > Where "_.startTime" and "it.startTime" are shorthand for "x => > x.startTime" or python's "lambda x: x.startTime". You could probably get > something similar in python: > > sorted(entries, key = x.datetime_created) > > if you did some magic with x to make looking up an attribute return a > lambda that returns that attribute of its argument. > > -Haoyi > > > > On Thu, Apr 11, 2013 at 7:05 PM, Oscar Benjamin < > oscar.j.benjamin at gmail.com> wrote: > >> On 11 April 2013 23:52, Ram Rachum wrote: >> > On Friday, April 12, 2013 1:35:20 AM UTC+3, Carl Meyer wrote: >> >> >> >> On 04/11/2013 04:24 PM, Ram Rachum wrote: >> >> > I often want to sort objects by an attribute. 
It's cumbersome to do >> >> > this: >> >> > >> >> > sorted(entries, key=lambda entry: entry.datetime_created) >> >> > >> >> > Why not allow this instead: >> >> > >> >> > sorted(entries, key='datetime_created') >> >> >> >> from operator import attrgetter >> >> sorted(entries, key=attrgetter('datetime_created')) >> >> >> >> You can alias attrgetter to an even shorter name if you like. >> > >> > That's still cumbersome in my opinion. >> >> I don't think it's that cumbersome. Leaving aside the import line >> you're only having to specify two things for your key function: that >> it's an attribute (attrgetter) and the name of the attribute >> ('datetime_created'). It's not possible for this to be any more >> succinct without using special case implicit rules which are generally >> a bad thing. I like the fact that the API for the sorted function is >> so simple I can remember all of its arguments and exactly what they do >> without ever needing to look it up. >> >> >> Oscar >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From oscar.j.benjamin at gmail.com Fri Apr 12 01:39:30 2013 From: oscar.j.benjamin at gmail.com (Oscar Benjamin) Date: Fri, 12 Apr 2013 00:39:30 +0100 Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions In-Reply-To: References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> <51673AA8.3070407@oddbird.net> Message-ID: On 12 April 2013 00:33, Haoyi Li wrote: > A more generic and useful thing would be kind of what scala/groovy have: > shorthands for defining function literals: > > Groovy: > myList.sort{it.startTime} > > Scala: > myList.sort(_.startTime) > > Where "_.startTime" and "it.startTime" are shorthand for "x => x.startTime" > or python's "lambda x: x.startTime". 
You could probably get something > similar in python: > > sorted(entries, key = x.datetime_created) > > if you did some magic with x to make looking up an attribute return a lambda > that returns that attribute of its argument. You can do that if you want to: import operator class X(object): def __getattribute__(self, attrname): return operator.attrgetter(attrname) def __getitem__(self, index): return operator.itemgetter(index) x = X() Oscar From steve at pearwood.info Fri Apr 12 01:48:57 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 12 Apr 2013 09:48:57 +1000 Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions In-Reply-To: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> Message-ID: <51674BE9.2090803@pearwood.info> On 12/04/13 08:24, Ram Rachum wrote: > I often want to sort objects by an attribute. It's cumbersome to do this: > > sorted(entries, key=lambda entry: entry.datetime_created) > > Why not allow this instead: > > sorted(entries, key='datetime_created') > > The `sorted` function can check whether the `key` argument is a string, and > if so do an attribute lookup. Why an attribute lookup? Why not a key lookup? Using a trivial lambda makes it obvious what you want: key=lambda obj: obj.name key=lambda obj: obj[name] and is more convenient (although probably slower) than the alternatives: key=operator.attrgetter(name) key=operator.itemgetter(name) without ambiguity or guessing what the caller intended. It also avoids masking TypeError errors if you call with a non-literal argument that happens to be a string. If we allow sorted etc. to guess what the caller wants with strings, should it also guess what they want with integers? key=3 equivalent to key=lambda obj: obj[3] Hmmm... tempting... that would make sorting tuples by a specific field really easy, which is an extremely common use case, and unlike strings, there's no ambiguity. So... 
-1 on allowing key='string' shortcuts, +0 on allowing key=3 shortcuts. -- Steven From jbvsmo at gmail.com Fri Apr 12 01:58:50 2013 From: jbvsmo at gmail.com (=?ISO-8859-1?Q?Jo=E3o_Bernardo?=) Date: Thu, 11 Apr 2013 20:58:50 -0300 Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions In-Reply-To: References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> <51673AA8.3070407@oddbird.net> Message-ID: You can have something like that with this module I created: https://github.com/jbvsmo/funcbuilder from funcbuilder import f sorted(entries, key=f.datetime_created) Some features in this module are *very* experimental, but are also very cool... João Bernardo 2013/4/11 Ram Rachum > Interesting! > > > On Fri, Apr 12, 2013 at 2:33 AM, Haoyi Li wrote: > >> A more generic and useful thing would be kind of what scala/groovy have: >> shorthands for defining function literals: >> >> Groovy: >> myList.sort{it.startTime} >> >> Scala: >> myList.sort(_.startTime) >> >> Where "_.startTime" and "it.startTime" are shorthand for "x => >> x.startTime" or python's "lambda x: x.startTime". You could probably get >> something similar in python: >> >> sorted(entries, key = x.datetime_created) >> >> if you did some magic with x to make looking up an attribute return a >> lambda that returns that attribute of its argument. >> >> -Haoyi >> >> >> >> On Thu, Apr 11, 2013 at 7:05 PM, Oscar Benjamin < >> oscar.j.benjamin at gmail.com> wrote: >> >>> On 11 April 2013 23:52, Ram Rachum wrote: >>> > On Friday, April 12, 2013 1:35:20 AM UTC+3, Carl Meyer wrote: >>> >> >>> >> On 04/11/2013 04:24 PM, Ram Rachum wrote: >>> >> > I often want to sort objects by an attribute.
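Steven's key=3 idea can be prototyped as a plain wrapper today; a sketch (the `sorted_k` name is hypothetical, not a real API):

```python
from operator import itemgetter

def sorted_k(iterable, key=None, **kwargs):
    # Hypothetical wrapper showing the key=3 shorthand discussed above:
    # an integer key means "sort by that index", via itemgetter.
    if isinstance(key, int) and not isinstance(key, bool):
        key = itemgetter(key)
    return sorted(iterable, key=key, **kwargs)

rows = [('b', 2), ('a', 3), ('c', 1)]
print(sorted_k(rows, key=1))  # [('c', 1), ('b', 2), ('a', 3)]
```

Unlike the key='string' variant, an integer key is unambiguous: it can only mean item lookup.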
It's cumbersome to do >>> >> > this: >>> >> > >>> >> > sorted(entries, key=lambda entry: entry.datetime_created) >>> >> > >>> >> > Why not allow this instead: >>> >> > >>> >> > sorted(entries, key='datetime_created') >>> >> >>> >> from operator import attrgetter >>> >> sorted(entries, key=attrgetter('datetime_created')) >>> >> >>> >> You can alias attrgetter to an even shorter name if you like. >>> > >>> > That's still cumbersome in my opinion. >>> >>> I don't think it's that cumbersome. Leaving aside the import line >>> you're only having to specify two things for your key function: that >>> it's an attribute (attrgetter) and the name of the attribute >>> ('datetime_created'). It's not possible for this to be any more >>> succinct without using special case implicit rules which are generally >>> a bad thing. I like the fact that the API for the sorted function is >>> so simple I can remember all of its arguments and exactly what they do >>> without ever needing to look it up. >>> >>> >>> Oscar >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> http://mail.python.org/mailman/listinfo/python-ideas >>> >> >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ram.rachum at gmail.com Fri Apr 12 02:07:18 2013 From: ram.rachum at gmail.com (Ram Rachum) Date: Fri, 12 Apr 2013 03:07:18 +0300 Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions In-Reply-To: References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> <51673AA8.3070407@oddbird.net> Message-ID: Awesome module! 
On Fri, Apr 12, 2013 at 2:58 AM, Jo?o Bernardo wrote: > You can have something like that with this module I created: > https://github.com/jbvsmo/funcbuilder > > from funcbuilder import f > sorted(entries, key=f.datetime_created) > > > Some features in this module are *very* experimental, but are also very > cool... > > > Jo?o Bernardo > > > 2013/4/11 Ram Rachum > >> Interesting! >> >> >> On Fri, Apr 12, 2013 at 2:33 AM, Haoyi Li wrote: >> >>> A more generic and useful thing would be kind of what scala/groovy have: >>> shorthands for defining function literals: >>> >>> Groovy: >>> myList.sort{it.startTime} >>> >>> Scala: >>> myList.sort(_.startTime) >>> >>> Where "_.startTime" and "it.startTime" are shorthand for "x => >>> x.startTime" or python's "lambda x: x.startTime". You could probably get >>> something similar in python: >>> >>> sorted(entries, key = x.datetime_created) >>> >>> if you did some magic with x to make looking up an attribute return a >>> lambda that returns that attribute of its argument. >>> >>> -Haoyi >>> >>> >>> >>> On Thu, Apr 11, 2013 at 7:05 PM, Oscar Benjamin < >>> oscar.j.benjamin at gmail.com> wrote: >>> >>>> On 11 April 2013 23:52, Ram Rachum wrote: >>>> > On Friday, April 12, 2013 1:35:20 AM UTC+3, Carl Meyer wrote: >>>> >> >>>> >> On 04/11/2013 04:24 PM, Ram Rachum wrote: >>>> >> > I often want to sort objects by an attribute. It's cumbersome to do >>>> >> > this: >>>> >> > >>>> >> > sorted(entries, key=lambda entry: entry.datetime_created) >>>> >> > >>>> >> > Why not allow this instead: >>>> >> > >>>> >> > sorted(entries, key='datetime_created') >>>> >> >>>> >> from operator import attrgetter >>>> >> sorted(entries, key=attrgetter('datetime_created')) >>>> >> >>>> >> You can alias attrgetter to an even shorter name if you like. >>>> > >>>> > That's still cumbersome in my opinion. >>>> >>>> I don't think it's that cumbersome. 
Leaving aside the import line >>>> you're only having to specify two things for your key function: that >>>> it's an attribute (attrgetter) and the name of the attribute >>>> ('datetime_created'). It's not possible for this to be any more >>>> succinct without using special case implicit rules which are generally >>>> a bad thing. I like the fact that the API for the sorted function is >>>> so simple I can remember all of its arguments and exactly what they do >>>> without ever needing to look it up. >>>> >>>> >>>> Oscar >>>> _______________________________________________ >>>> Python-ideas mailing list >>>> Python-ideas at python.org >>>> http://mail.python.org/mailman/listinfo/python-ideas >>>> >>> >>> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jbvsmo at gmail.com Fri Apr 12 02:14:10 2013 From: jbvsmo at gmail.com (=?ISO-8859-1?Q?Jo=E3o_Bernardo?=) Date: Thu, 11 Apr 2013 21:14:10 -0300 Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions In-Reply-To: References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> <51673AA8.3070407@oddbird.net> Message-ID: It was something I did for fun, so I never had the time to add proper documentation. You can see the best examples to use by reading the doctests from __init__ . BTW, It abuses a lot of Python 3 constructions, so you can't use Python 2.x Jo?o Bernardo 2013/4/11 Ram Rachum > Awesome module! > > > On Fri, Apr 12, 2013 at 2:58 AM, Jo?o Bernardo wrote: > >> You can have something like that with this module I created: >> https://github.com/jbvsmo/funcbuilder >> >> from funcbuilder import f >> sorted(entries, key=f.datetime_created) >> >> >> Some features in this module are *very* experimental, but are also very >> cool... 
>> >> >> Jo?o Bernardo >> >> >> 2013/4/11 Ram Rachum >> >>> Interesting! >>> >>> >>> On Fri, Apr 12, 2013 at 2:33 AM, Haoyi Li wrote: >>> >>>> A more generic and useful thing would be kind of what scala/groovy >>>> have: shorthands for defining function literals: >>>> >>>> Groovy: >>>> myList.sort{it.startTime} >>>> >>>> Scala: >>>> myList.sort(_.startTime) >>>> >>>> Where "_.startTime" and "it.startTime" are shorthand for "x => >>>> x.startTime" or python's "lambda x: x.startTime". You could probably get >>>> something similar in python: >>>> >>>> sorted(entries, key = x.datetime_created) >>>> >>>> if you did some magic with x to make looking up an attribute return a >>>> lambda that returns that attribute of its argument. >>>> >>>> -Haoyi >>>> >>>> >>>> >>>> On Thu, Apr 11, 2013 at 7:05 PM, Oscar Benjamin < >>>> oscar.j.benjamin at gmail.com> wrote: >>>> >>>>> On 11 April 2013 23:52, Ram Rachum wrote: >>>>> > On Friday, April 12, 2013 1:35:20 AM UTC+3, Carl Meyer wrote: >>>>> >> >>>>> >> On 04/11/2013 04:24 PM, Ram Rachum wrote: >>>>> >> > I often want to sort objects by an attribute. It's cumbersome to >>>>> do >>>>> >> > this: >>>>> >> > >>>>> >> > sorted(entries, key=lambda entry: entry.datetime_created) >>>>> >> > >>>>> >> > Why not allow this instead: >>>>> >> > >>>>> >> > sorted(entries, key='datetime_created') >>>>> >> >>>>> >> from operator import attrgetter >>>>> >> sorted(entries, key=attrgetter('datetime_created')) >>>>> >> >>>>> >> You can alias attrgetter to an even shorter name if you like. >>>>> > >>>>> > That's still cumbersome in my opinion. >>>>> >>>>> I don't think it's that cumbersome. Leaving aside the import line >>>>> you're only having to specify two things for your key function: that >>>>> it's an attribute (attrgetter) and the name of the attribute >>>>> ('datetime_created'). It's not possible for this to be any more >>>>> succinct without using special case implicit rules which are generally >>>>> a bad thing. 
I like the fact that the API for the sorted function is >>>>> so simple I can remember all of its arguments and exactly what they do >>>>> without ever needing to look it up. >>>>> >>>>> >>>>> Oscar >>>>> _______________________________________________ >>>>> Python-ideas mailing list >>>>> Python-ideas at python.org >>>>> http://mail.python.org/mailman/listinfo/python-ideas >>>>> >>>> >>>> >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> http://mail.python.org/mailman/listinfo/python-ideas >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From carlopires at gmail.com Fri Apr 12 02:31:06 2013 From: carlopires at gmail.com (Carlo Pires) Date: Thu, 11 Apr 2013 21:31:06 -0300 Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions In-Reply-To: References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> <51673AA8.3070407@oddbird.net> Message-ID: Very crafty! I liked the idea of funcbuilder f as replacement for lambdas and the result is true pythonic! 2013/4/11 Jo?o Bernardo > It was something I did for fun, so I never had the time to add proper > documentation. > You can see the best examples to use by reading the doctests from __init__. > > BTW, It abuses a lot of Python 3 constructions, so you can't use Python 2.x > > > > Jo?o Bernardo > > > 2013/4/11 Ram Rachum > >> Awesome module! >> >> >> On Fri, Apr 12, 2013 at 2:58 AM, Jo?o Bernardo wrote: >> >>> You can have something like that with this module I created: >>> https://github.com/jbvsmo/funcbuilder >>> >>> from funcbuilder import f >>> sorted(entries, key=f.datetime_created) >>> >>> >>> Some features in this module are *very* experimental, but are also very >>> cool... >>> >>> >>> Jo?o Bernardo >>> >>> >>> 2013/4/11 Ram Rachum >>> >>>> Interesting! 
>>>> >>>> >>>> On Fri, Apr 12, 2013 at 2:33 AM, Haoyi Li wrote: >>>> >>>>> A more generic and useful thing would be kind of what scala/groovy >>>>> have: shorthands for defining function literals: >>>>> >>>>> Groovy: >>>>> myList.sort{it.startTime} >>>>> >>>>> Scala: >>>>> myList.sort(_.startTime) >>>>> >>>>> Where "_.startTime" and "it.startTime" are shorthand for "x => >>>>> x.startTime" or python's "lambda x: x.startTime". You could probably get >>>>> something similar in python: >>>>> >>>>> sorted(entries, key = x.datetime_created) >>>>> >>>>> if you did some magic with x to make looking up an attribute return a >>>>> lambda that returns that attribute of its argument. >>>>> >>>>> -Haoyi >>>>> >>>>> >>>>> >>>>> On Thu, Apr 11, 2013 at 7:05 PM, Oscar Benjamin < >>>>> oscar.j.benjamin at gmail.com> wrote: >>>>> >>>>>> On 11 April 2013 23:52, Ram Rachum wrote: >>>>>> > On Friday, April 12, 2013 1:35:20 AM UTC+3, Carl Meyer wrote: >>>>>> >> >>>>>> >> On 04/11/2013 04:24 PM, Ram Rachum wrote: >>>>>> >> > I often want to sort objects by an attribute. It's cumbersome to >>>>>> do >>>>>> >> > this: >>>>>> >> > >>>>>> >> > sorted(entries, key=lambda entry: entry.datetime_created) >>>>>> >> > >>>>>> >> > Why not allow this instead: >>>>>> >> > >>>>>> >> > sorted(entries, key='datetime_created') >>>>>> >> >>>>>> >> from operator import attrgetter >>>>>> >> sorted(entries, key=attrgetter('datetime_created')) >>>>>> >> >>>>>> >> You can alias attrgetter to an even shorter name if you like. >>>>>> > >>>>>> > That's still cumbersome in my opinion. >>>>>> >>>>>> I don't think it's that cumbersome. Leaving aside the import line >>>>>> you're only having to specify two things for your key function: that >>>>>> it's an attribute (attrgetter) and the name of the attribute >>>>>> ('datetime_created'). It's not possible for this to be any more >>>>>> succinct without using special case implicit rules which are generally >>>>>> a bad thing. 
I like the fact that the API for the sorted function is >>>>>> so simple I can remember all of its arguments and exactly what they do >>>>>> without ever needing to look it up. >>>>>> >>>>>> >>>>>> Oscar >>>>>> _______________________________________________ >>>>>> Python-ideas mailing list >>>>>> Python-ideas at python.org >>>>>> http://mail.python.org/mailman/listinfo/python-ideas >>>>>> >>>>> >>>>> >>>> >>>> _______________________________________________ >>>> Python-ideas mailing list >>>> Python-ideas at python.org >>>> http://mail.python.org/mailman/listinfo/python-ideas >>>> >>>> >>> >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -- Carlo Pires -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Fri Apr 12 08:12:42 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Fri, 12 Apr 2013 15:12:42 +0900 Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions In-Reply-To: References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> <51673AA8.3070407@oddbird.net> Message-ID: <87fvyw5lb9.fsf@uwakimon.sk.tsukuba.ac.jp> Donald Stufft writes: > Special cases aren't special enough to break the rules. Not to mention: In the face of ambiguity, refuse to guess. N.B. This is part of why Steven d'A changes from -1 to +0 on "key=3". This looks like POSIX sort(1), and cut(1), a bit, doesn't it. From solipsis at pitrou.net Fri Apr 12 10:41:03 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 12 Apr 2013 10:41:03 +0200 Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> <51674BE9.2090803@pearwood.info> Message-ID: <20130412104103.301cea37@pitrou.net> Le Fri, 12 Apr 2013 09:48:57 +1000, Steven D'Aprano a ?crit : > > If we allow sorted etc. 
to guess what the caller wants with strings, > should it also guess what they want with integers? > > key=3 equivalent to key=lambda obj: obj[3] > > Hmmm... tempting... that would make sorting tuples by a specific > field really easy, which is an extremely common use case, and unlike > strings, there's no ambiguity. > > > So... -1 on allowing key='string' shortcuts, +0 on allowing key=3 > shortcuts. -1 on both :-) Regards Antoine. From ubershmekel at gmail.com Fri Apr 12 12:26:40 2013 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Fri, 12 Apr 2013 13:26:40 +0300 Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions In-Reply-To: <20130412104103.301cea37@pitrou.net> References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> <51674BE9.2090803@pearwood.info> <20130412104103.301cea37@pitrou.net> Message-ID: On Fri, Apr 12, 2013 at 11:41 AM, Antoine Pitrou wrote: > Le Fri, 12 Apr 2013 09:48:57 +1000, > Steven D'Aprano a > écrit : > > So... -1 on allowing key='string' shortcuts, +0 on allowing key=3 > > shortcuts. > > -1 on both :-) > > > Make that a -2 on both. Yuval -------------- next part -------------- An HTML attachment was scrubbed... URL: From markus at unterwaditzer.net Fri Apr 12 20:21:10 2013 From: markus at unterwaditzer.net (Markus Unterwaditzer) Date: Fri, 12 Apr 2013 20:21:10 +0200 Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions In-Reply-To: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> Message-ID: <62a1a95e-4fec-4ee1-ba04-6aa0bf51fd64@email.android.com> Ram Rachum wrote: >I often want to sort objects by an attribute. It's cumbersome to do >this: > > sorted(entries, key=lambda entry: entry.datetime_created) > >Why not allow this instead: > > sorted(entries, key='datetime_created') > >The `sorted` function can check whether the `key` argument is a string, >and >if so do an attribute lookup.
> >Since I see no other possible use of a string input to `key`, I don't >see >how this feature would harm anyone. > >What do you think? > > >Thanks, >Ram. > > >------------------------------------------------------------------------ > >_______________________________________________ >Python-ideas mailing list >Python-ideas at python.org >http://mail.python.org/mailman/listinfo/python-ideas I think it's a very bad idea to try to overload the key argument, imo a separate kwarg of sorted would be fine though. E.g: sorted(iterable, attribute='someattr') -- Markus (from phone) From markus at unterwaditzer.net Fri Apr 12 20:28:20 2013 From: markus at unterwaditzer.net (Markus Unterwaditzer) Date: Fri, 12 Apr 2013 20:28:20 +0200 Subject: [Python-ideas] Allow key='attribute_name' to various sorting functions In-Reply-To: References: <032a9950-cce2-4fdd-8fb7-f4c312897f27@googlegroups.com> <9af7a0ec-5916-45d5-91a0-78b0cc4a06f4@email.android.com> Message-ID: Ram Rachum wrote: >That would work for me, +1. (Though I imagine this idea will be >showered >with -1 from ebd...) > > >On Fri, Apr 12, 2013 at 9:19 PM, Markus Unterwaditzer < >markus at unterwaditzer.net> wrote: > >> Ram Rachum wrote: >> >> >I often want to sort objects by an attribute. It's cumbersome to do >> >this: >> > >> > sorted(entries, key=lambda entry: entry.datetime_created) >> > >> >Why not allow this instead: >> > >> > sorted(entries, key='datetime_created') >> > >> >The `sorted` function can check whether the `key` argument is a >string, >> >and >> >if so do an attribute lookup. >> > >> >Since I see no other possible use of a string input to `key`, I >don't >> >see >> >how this feature would harm anyone. >> > >> >What do you think? >> > >> > >> >Thanks, >> >Ram. 
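Markus's separate-kwarg suggestion can likewise be sketched as a wrapper around the builtin (the `sorted_by` name is invented here; nothing like it exists in the stdlib):

```python
from operator import attrgetter
from types import SimpleNamespace

def sorted_by(iterable, attribute=None, **kwargs):
    # 'attribute' builds the key function; other kwargs pass through
    # to the builtin sorted() unchanged.
    if attribute is not None:
        kwargs['key'] = attrgetter(attribute)
    return sorted(iterable, **kwargs)

entries = [SimpleNamespace(name='b'), SimpleNamespace(name='a')]
print([e.name for e in sorted_by(entries, attribute='name')])  # ['a', 'b']
```

Keeping `attribute` separate from `key` avoids overloading one argument with two meanings.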
>> > >> > >> >>------------------------------------------------------------------------ >> > >> >_______________________________________________ >> >Python-ideas mailing list >> >Python-ideas at python.org >> >http://mail.python.org/mailman/listinfo/python-ideas >> >> I think it's a very bad idea to try to overload the key argument, imo >a >> separate kwarg of sorted would be fine though. E.g: >> >> sorted(iterable, attribute='someattr') >> >> -- Markus (from phone) >> Although i think it's better to use attrgetter, since it is more easily reusable and so on. -- Markus (from phone) From peter at norvig.com Sat Apr 13 20:24:02 2013 From: peter at norvig.com (Peter Norvig) Date: Sat, 13 Apr 2013 11:24:02 -0700 Subject: [Python-ideas] "else" expression ":" Message-ID: Beginners will often write code like this: if val > 0: return +1 elif val < 0: return -1 elif val == 0: return 0 Now if you did this in Java, the compiler would produce an error saying that there is an execution path that does not return a value. Python does not give an error message, but it would be considered more idiomatic (and slightly more efficient) to have just "else:" in the third clause. Here's an idea to address this. What do you think of the syntax "else" expression ":" for example: if val > 0: return +1 elif val < 0: return -1 else val == 0: return 0 with the interpretation: if val > 0: return +1 elif val < 0: return -1 else: assert val == 0 return 0 I have to say, I'm uncertain. I'm not sure this is even a good idea at all, and I'm not sure if it should translate into "assert expression" or whether it should be "if not expression: raise ValueError". What do you think? -Peter Norvig -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mertz at gnosis.cx Sat Apr 13 21:01:03 2013 From: mertz at gnosis.cx (David Mertz) Date: Sat, 13 Apr 2013 12:01:03 -0700 Subject: [Python-ideas] "else" expression ":" In-Reply-To: References: Message-ID: On Apr 13, 2013, at 11:24 AM, Peter Norvig wrote: > Here's an idea to address this. What do you think of the syntax > "else" expression ":" > for example: > if val > 0: > return +1 > elif val < 0: > return -1 > else val == 0: > return 0 I often write code like: if cond1: doThing1() elif cond2: doThing2() # ... more steps here ... return something I guess implicitly I think of this as a less verbose form of: if cond1: doThing1() elif cond2: doThing2() else: pass # No need to do anything if not cond1 and not cond2 That is, the situation where every block of the condition ends in a return statement is a special case, and by no means universal to the use of if/elif/else. In particular, not every time I use if/elif do I want a "catch the remaining cases" block, since it is often only in some enumerated special circumstances I want any processing to occur in the compound block. I think the "else with boolean" is definitely readable, modulo exactly what exception is raised if it is violated. However, it feels like the explicit 'assert' statement in those cases where we expect exhaustive conditions is already available. Moreover, we can always add a final else to document our belief that conditions are exhaustive: if val > 0: return +1 elif val < 0: return -1 elif val == 0: return 0 else: raise ValueError("'val' is not negative, positive, or zero! Check the properties of arithmetic") -- mertz@ THIS MESSAGE WAS BROUGHT TO YOU BY: v i gnosis Postmodern Enterprises s r .cx MAKERS OF CHAOS.... 
i u LOOK FOR IT IN A NEIGHBORHOOD NEAR YOU g s From tjreedy at udel.edu Sat Apr 13 21:21:33 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Sat, 13 Apr 2013 15:21:33 -0400 Subject: [Python-ideas] "else" expression ":" In-Reply-To: References: Message-ID: On 4/13/2013 2:24 PM, Peter Norvig wrote: > Beginners will often write code like this: > > if val > 0: > return +1 > elif val < 0: > return -1 > elif val == 0: > return 0 So might a Python expert who knows that all three tests could return False for instances of some class. For instance, a fuzzy zero, or an interval that includes 0. Or someone who like to be explicit about the appropriate guard for each return. > Now if you did this in Java, the compiler would produce an error saying > that there is an execution path that does not return a value. Not relevant since in Python all paths that end either raise or return. > Python does not give an error message, because it is not an error. > but it would be considered more idiomatic (and slightly more > efficient) to have just "else:" in the third clause. Only when the last condition is the negation of the conjunction of the first two. Or when it is broader than that. I might actually write something like else: # val = 0 to document the simplified negated conjunction, which in this case is the appropriate guard. Being explicit, at least with a comment, makes it easier for someone to re-order the branches, should there be reason to. Python routinely allows correct but unidiomatic and inefficient code. > Here's an idea to address this. I do not see that there is a problem to fix ;-). > What do you think of the syntax > "else" expression ":" A confusing solution to a non-problem ;-). 
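The effect Norvig's syntax would encode is already expressible without new syntax, as several replies note; a runnable sketch of the explicit raising `else`:

```python
def sign(val):
    if val > 0:
        return +1
    elif val < 0:
        return -1
    elif val == 0:
        return 0
    else:
        # Reachable, e.g. for float('nan'), where all three tests are False.
        raise ValueError(f"unordered value: {val!r}")

print(sign(-7))  # -1
```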
-- Terry Jan Reedy From breamoreboy at yahoo.co.uk Sat Apr 13 21:34:32 2013 From: breamoreboy at yahoo.co.uk (Mark Lawrence) Date: Sat, 13 Apr 2013 20:34:32 +0100 Subject: [Python-ideas] "else" expression ":" In-Reply-To: References: Message-ID: On 13/04/2013 19:24, Peter Norvig wrote: > > > > Beginners will often write code like this: > > if val > 0: > return +1 > elif val < 0: > return -1 > elif val == 0: > return 0 > > Now if you did this in Java, the compiler would produce an error saying > that there is an execution path that does not return a value. Python > does not give an error message, but it would be considered more > idiomatic (and slightly more efficient) to have just "else:" in the > third clause. > > Here's an idea to address this. What do you think of the syntax > > "else" expression ":" > > for example: > > if val > 0: > return +1 > elif val < 0: > return -1 > else val == 0: > return 0 > > with the interpretation: > > if val > 0: > return +1 > elif val < 0: > return -1 > else: > assert val == 0 > return 0 > > I have to say, I'm uncertain. I'm not sure this is even a good idea at > all, and I'm not sure if it should translate into "assert expression" > or whether it should be "if not expression: raise ValueError". What do > you think? > > -Peter Norvig > Big -1 from me, if it ain't broke don't fix it. -- If you're using GoogleCrap? please read this http://wiki.python.org/moin/GoogleGroupsPython. 
Mark Lawrence From shane at umbrellacode.com Sat Apr 13 21:56:11 2013 From: shane at umbrellacode.com (Shane Green) Date: Sat, 13 Apr 2013 12:56:11 -0700 Subject: [Python-ideas] "else" expression ":" In-Reply-To: References: Message-ID: <38E9CC19-CF54-4114-A2BE-67A0894C2659@umbrellacode.com> That's a bad example because the comparisons would probably have raised a ValueError or TypeError for any, but I think being explicit is by far preferable, and just as concise, if not more so because you control the exception: elif val < 0: return -1 elif val != 0: raise ValueError() return val Of course you likely will have gotten a ValueError or TypeError already if you've done gt/lt comparisons on a value that turns out to also not be 0? If you're going to validate the data type eventually, why not do it at the top? Shane Green www.umbrellacode.com 408-692-4666 | shane at umbrellacode.com On Apr 13, 2013, at 12:01 PM, David Mertz wrote: > On Apr 13, 2013, at 11:24 AM, Peter Norvig wrote: >> Here's an idea to address this. What do you think of the syntax >> "else" expression ":" >> for example: >> if val > 0: >> return +1 >> elif val < 0: >> return -1 >> else val == 0: >> return 0 > > I often write code like: > > if cond1: > doThing1() > elif cond2: > doThing2() > # ... more steps here ... > return something > > I guess implicitly I think of this as a less verbose form of: > > if cond1: > doThing1() > elif cond2: > doThing2() > else: > pass # No need to do anything if not cond1 and not cond2 > > That is, the situation where every block of the condition ends in a return statement is a special case, and by no means universal to the use of if/elif/else. In particular, not every time I use if/elif do I want a "catch the remaining cases" block, since it is often only in some enumerated special circumstances I want any processing to occur in the compound block. > > I think the "else with boolean" is definitely readable, modulo exactly what exception is raised if it is violated.
However, it feels like the explicit 'assert' statement in those cases where we expect exhaustive conditions is already available. Moreover, we can always add a final else to document our belief that conditions are exhaustive: > > if val > 0: > return +1 > elif val < 0: > return -1 > elif val == 0: > return 0 > else: > raise ValueError("'val' is not negative, positive, or zero! Check the properties of arithmetic") > > > -- > mertz@ THIS MESSAGE WAS BROUGHT TO YOU BY: v i > gnosis Postmodern Enterprises s r > .cx MAKERS OF CHAOS.... i u > LOOK FOR IT IN A NEIGHBORHOOD NEAR YOU g s > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From greg.ewing at canterbury.ac.nz Sun Apr 14 01:07:38 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 14 Apr 2013 11:07:38 +1200 Subject: [Python-ideas] "else" expression ":" In-Reply-To: References: Message-ID: <5169E53A.7030900@canterbury.ac.nz> Peter Norvig wrote: > Beginners will often write code like this: > > if val > 0: > return +1 > elif val < 0: > return -1 > elif val == 0: > return 0 > Here's an idea to address this. What do you think of the syntax > > "else" expression ":" This will do nothing to help said beginner. He will continue to write "elif val == 0:", blissfully unaware that there could be another case that he hasn't thought of. If he had the presence of mind to realise that, he would have written something safer in the first place. -- Greg From ben+python at benfinney.id.au Sun Apr 14 03:41:13 2013 From: ben+python at benfinney.id.au (Ben Finney) Date: Sun, 14 Apr 2013 11:41:13 +1000 Subject: [Python-ideas] Poll about -h,--help options References: Message-ID: <7wsj2t6g92.fsf@benfinney.id.au> anatoly techtonik writes: > Is it interesting to know if people expect -h to work as a --help > equivalent by default? 
There are too many existing tools that make use of “-h” for other things (e.g. “hostname to connect to”) for me to expect it to mean “help” by default. If I'm expecting an option to be a short name for “help”, it would be “-?”. -- \ “One bad programmer can easily create two new jobs a year. | `\ Hiring more bad programmers will just increase our perceived | _o__) need for them.” —David Lorge Parnas, 1999-03 | Ben Finney From steve at pearwood.info Sun Apr 14 04:36:59 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Sun, 14 Apr 2013 12:36:59 +1000 Subject: [Python-ideas] "else" expression ":" In-Reply-To: References: Message-ID: <516A164B.9060803@pearwood.info> On 14/04/13 04:24, Peter Norvig wrote: > Beginners will often write code like this: > > if val > 0: > return +1 > elif val < 0: > return -1 > elif val == 0: > return 0 > > Now if you did this in Java, the compiler would produce an error saying > that there is an execution path that does not return a value. There's one difference between the languages right there: there is no such case for Python. If you pass something that doesn't match any of the three cases, say a NAN, the function will return None. > Python does > not give an error message, but it would be considered more idiomatic (and > slightly more efficient) to have just "else:" in the third clause. Also incorrect, in a language which supports NANs, as Python does. (And Java, I believe, which may be why Java correctly tells you that there is a path with no return result.) > Here's an idea to address this. What do you think of the syntax > > "else" expression ":" I don't think it will help beginners, and for more experienced programmers, I don't think it is of much benefit over an explicit else: assert expression, "message if the assert fails" (or an explicit ValueError test, if more appropriate). [...] > I have to say, I'm uncertain.
I'm not sure this is even a good idea at > all, and I'm not sure if it should translate into "assert expression" or > whether it should be "if not expression: raise ValueError". What do you > think? I think that there's no one right answer. For some code, an assertion will be correct, and for others, an explicit test and ValueError (or some other exception!) will be correct. Neither is so obviously more common that Python should introduce syntax to favour one over the other. -- Steven From shane at umbrellacode.com Sun Apr 14 04:54:58 2013 From: shane at umbrellacode.com (Shane Green) Date: Sat, 13 Apr 2013 19:54:58 -0700 Subject: [Python-ideas] "else" expression ":" In-Reply-To: <516A164B.9060803@pearwood.info> References: <516A164B.9060803@pearwood.info> Message-ID: Ah, yes, I should clarify that when I suggested: elif val < 0: return -1 elif val != 0: raise ValueError() I did NOT mean to propose it as the syntactical translation of anything; I was suggesting that beginning programmers should use that instead of the first approach. I would say that, between: explicit is always better than implicit; in the face of ambiguity, refuse the temptation to guess; there should be one, and preferably only one, way to do it; and special cases aren't special enough to break rules. Well, I like the way this one works now… Shane Green www.umbrellacode.com 408-692-4666 | shane at umbrellacode.com On Apr 13, 2013, at 7:36 PM, Steven D'Aprano wrote: > On 14/04/13 04:24, Peter Norvig wrote: >> Beginners will often write code like this: >> >> if val > 0: >> return +1 >> elif val < 0: >> return -1 >> elif val == 0: >> return 0 >> >> Now if you did this in Java, the compiler would produce an error saying >> that there is an execution path that does not return a value. > > There's one difference between the languages right there: there is no such > case for Python. If you pass something that doesn't match any of the three > cases, say a NAN, the function will return None.
> > >> Python does >> not give an error message, but it would be considered more idiomatic (and >> slightly more efficient) to have just "else:" in the third clause. > > Also incorrect, in a language which supports NANs, as Python does. (And Java, > I believe, which may be why Java correctly tells you that there is a path > with no return result.) > > > >> Here's an idea to address this. What do you think of the syntax >> >> "else" expression ":" > > > I don't think it will help beginners, and for more experienced programmers, > I don't think it is of much benefit over an explicit > > else: > assert expression, "message if the assert fails" > > (or an explicit ValueError test, if more appropriate). > > > [...] >> I have to say, I'm uncertain. I'm not sure this is even a good idea at >> all, and I'm not sure if it should translate into "assert expression" or >> whether it should be "if not expression: raise ValueError". What do you >> think? > > I think that there's no one right answer. For some code, an assertion will > be correct, and for others, an explicit test and ValueError (or some other > exception!) will be correct. Neither is so obviously more common that Python > should introduce syntax to favour one over the other. > > > -- > Steven > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ncoghlan at gmail.com Sun Apr 14 12:31:09 2013 From: ncoghlan at gmail.com (Nick Coghlan) Date: Sun, 14 Apr 2013 20:31:09 +1000 Subject: [Python-ideas] "else" expression ":" In-Reply-To: References: Message-ID: On Sun, Apr 14, 2013 at 4:24 AM, Peter Norvig wrote: > > > > Beginners will often write code like this: > > if val > 0: > return +1 > elif val < 0: > return -1 > elif val == 0: > return 0 > > Now if you did this in Java, the compiler would produce an error saying that > there is an execution path that does not return a value. Python does not > give an error message, but it would be considered more idiomatic (and > slightly more efficient) to have just "else:" in the third clause. > > Here's an idea to address this. What do you think of the syntax > > "else" expression ":" > > for example: > > if val > 0: > return +1 > elif val < 0: > return -1 > else val == 0: > return 0 > > with the interpretation: > > if val > 0: > return +1 > elif val < 0: > return -1 > else: > assert val == 0 > return 0 > > I have to say, I'm uncertain. I'm not sure this is even a good idea at all, > and I'm not sure if it should translate into "assert expression" or whether > it should be "if not expression: raise ValueError". What do you think? I think the difference between: if val > 0: return +1 elif val < 0: return -1 else val == 0: return 0 and: if val > 0: return +1 elif val < 0: return -1 elif val == 0: return 0 is far too subtle to be helpful. 
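A quick demonstration of the NaN point raised earlier in the thread (the code below is an illustration, not part of the original messages): every ordered comparison against a NaN evaluates False, so the three-branch chain matches nothing and execution falls off the end of the function, hitting Python's implicit "return None".

```python
import math

def sign(val):
    # The exhaustive-looking chain discussed in the thread.
    if val > 0:
        return +1
    elif val < 0:
        return -1
    elif val == 0:
        return 0
    # No branch matches a NaN, so execution falls off the end and
    # Python supplies an implicit "return None".

nan = float("nan")
assert math.isnan(nan)
print(sign(2.5), sign(-3), sign(0), sign(nan))  # 1 -1 0 None
```

This is why the final "elif val == 0:" is not equivalent to a bare "else:" for float inputs, which is the crux of the disagreement above.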
If an if/elif chain is expected to be exhaustive and you want to ensure it remains that way during ongoing maintenance, you can already do: if val > 0: return +1 elif val < 0: return -1 elif val == 0: return 0 else: raise RuntimeError("Unhandled input: {!r:100}".format(val)) (with the "else:" being optional if this is the last statement in a function) In general, though, Python already decided on its answer to this question by always allowing a final "return None" to be implicit, even when there are explicit return statements elsewhere in the function. Cheers, Nick. -- Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia From mal at egenix.com Sun Apr 14 13:16:52 2013 From: mal at egenix.com (M.-A. Lemburg) Date: Sun, 14 Apr 2013 13:16:52 +0200 Subject: [Python-ideas] [pydotorg-www] Poll about -h,--help options In-Reply-To: <20130414130637.316476a5@fsol> References: <20130414130637.316476a5@fsol> Message-ID: <516A9024.2050002@egenix.com> [Taking pydotorg-www off CC - this doesn't have anything to do with the website] On 14.04.2013 13:06, Antoine Pitrou wrote: > On Mon, 18 Feb 2013 10:40:07 +0300 > anatoly techtonik > wrote: >> Hi, >> >> Is it interesting to know if people expect -h to work as a --help >> equivalent by default? > > Yes, I do expect it. > (and I find it quite annoying when it doesn't) Same here. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 14 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-04-09: Released mxODBC.Connect 2.0.3 ... http://egenix.com/go42 2013-04-02: Released mxODBC Zope DA 2.1.1 ... http://egenix.com/go41 ::::: Try our mxODBC.Connect Python Database Interface for free ! 
:::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From shane at umbrellacode.com Sun Apr 14 13:48:08 2013 From: shane at umbrellacode.com (Shane Green) Date: Sun, 14 Apr 2013 04:48:08 -0700 Subject: [Python-ideas] "else" expression ":" In-Reply-To: References: Message-ID: <498A396F-3736-4525-A518-D33F5EB9EA06@umbrellacode.com> I think that, if something special were added for this case, it may be best not to overload the if/elif/else clauses, so incredibly well defined through the ages, but perhaps to add something new, like: elifassert val==0: return 0 …you know, I've got to say, I don't really recall finding myself in this situation very often in the first place. It seems like the most productive solution might be to plant an easter egg that suggests some of these alternative approaches! Shane Green www.umbrellacode.com 408-692-4666 | shane at umbrellacode.com On Apr 14, 2013, at 3:31 AM, Nick Coghlan wrote: > On Sun, Apr 14, 2013 at 4:24 AM, Peter Norvig wrote: >> >> >> >> Beginners will often write code like this: >> >> if val > 0: >> return +1 >> elif val < 0: >> return -1 >> elif val == 0: >> return 0 >> >> Now if you did this in Java, the compiler would produce an error saying that >> there is an execution path that does not return a value. Python does not >> give an error message, but it would be considered more idiomatic (and >> slightly more efficient) to have just "else:" in the third clause. >> >> Here's an idea to address this. What do you think of the syntax >> >> "else" expression ":" >> >> for example: >> >> if val > 0: >> return +1 >> elif val < 0: >> return -1 >> else val == 0: >> return 0 >> >> with the interpretation: >> >> if val > 0: >> return +1 >> elif val < 0: >> return -1 >> else: >> assert val == 0 >> return 0 >> >> I have to say, I'm uncertain.
I'm not sure this is even a good idea at all, >> and I'm not sure if it should translate into "assert expression" or whether >> it should be "if not expression: raise ValueError". What do you think? > > I think the difference between: > > if val > 0: > return +1 > elif val < 0: > return -1 > else val == 0: > return 0 > > and: > > if val > 0: > return +1 > elif val < 0: > return -1 > elif val == 0: > return 0 > > is far too subtle to be helpful. If an if/elif chain is expected to be > exhaustive and you want to ensure it remains that way during ongoing > maintenance, you can already do: > > if val > 0: > return +1 > elif val < 0: > return -1 > elif val == 0: > return 0 > else: > raise RuntimeError("Unhandled input: {!r:100}".format(val)) > > (with the "else:" being optional if this is the last statement in a function) > > In general, though, Python already decided on its answer to this > question by always allowing a final "return None" to be implicit, even > when there are explicit return statements elsewhere in the function. > > Cheers, > Nick. > > -- > Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From storchaka at gmail.com Sun Apr 14 13:58:49 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Sun, 14 Apr 2013 14:58:49 +0300 Subject: [Python-ideas] "else" expression ":" In-Reply-To: References: Message-ID: On 13.04.13 22:21, Terry Jan Reedy wrote: > On 4/13/2013 2:24 PM, Peter Norvig wrote: >> Beginners will often write code like this: >> >> if val > 0: >> return +1 >> elif val < 0: >> return -1 >> elif val == 0: >> return 0 > > So might a Python expert who knows that all three tests could return > False for instances of some class. For instance, a fuzzy zero, or an > interval that includes 0. 
Or float('nan'). From yaroslav at fedevych.name Thu Apr 18 08:29:12 2013 From: yaroslav at fedevych.name (Yaroslav Fedevych) Date: Thu, 18 Apr 2013 08:29:12 +0200 Subject: [Python-ideas] "else" expression ":" In-Reply-To: References: Message-ID: An obligatory read for anyone who comes up with ideas like this: http://thedailywtf.com/Articles/ButAnything-Can-Happen!.aspx -------------- next part -------------- An HTML attachment was scrubbed... URL: From flying-sheep at web.de Thu Apr 18 16:21:06 2013 From: flying-sheep at web.de (Philipp A.) Date: Thu, 18 Apr 2013 16:21:06 +0200 Subject: [Python-ideas] "else" expression ":" In-Reply-To: References: Message-ID: no, this is different; it's not for booleans, but for assertions, and could be used for e.g. exhaustive switches, e.g. it would make the first test in the following unnecessary: if spam not in {"a", "b", "c"}: raise ValueError("spam {} should be a, b, or c!".format(spam)) if spam == "a": foo() elif spam == "b": bar() else: baz() i'm not saying i support the idea though. i'd rather see scala-like extensible pattern matching. that would solve this, the eternal switch debate, and more (using extractors). is there a proposal for pattern matching? if not, i'll come up with one ;) 2013/4/18 Yaroslav Fedevych > An obligatory read for anyone who comes up with ideas like this: > http://thedailywtf.com/Articles/ButAnything-Can-Happen!.aspx > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rosuav at gmail.com Thu Apr 18 16:33:28 2013 From: rosuav at gmail.com (Chris Angelico) Date: Fri, 19 Apr 2013 00:33:28 +1000 Subject: [Python-ideas] "else" expression ":" In-Reply-To: References: Message-ID: On Fri, Apr 19, 2013 at 12:21 AM, Philipp A.
wrote: > no, this is different; it's not for booleans, but for assertions, and could > be used for e.g. exhaustive switches, e.g. it would make the first test in > the following unnecessary: > > if spam not in {"a", "b", "c"}: > raise ValueError("spam {} should be a, b, or c!".format(spam)) > if spam == "a": > foo() > elif spam == "b": > bar() > else: > baz() Or alternatively, you could write it as: if spam == "genuine watch": foo() elif spam == "buy a college degree": bar() elif spam == "rich guy wants to move money offshore": baz() else: raise ValueError("Unrecognized spam '%s'!" % spam) That removes the need to pre-check and match your if block. ChrisA From storchaka at gmail.com Thu Apr 18 22:04:29 2013 From: storchaka at gmail.com (Serhiy Storchaka) Date: Thu, 18 Apr 2013 23:04:29 +0300 Subject: [Python-ideas] "else" expression ":" In-Reply-To: References: Message-ID: On 18.04.13 17:33, Chris Angelico wrote: > On Fri, Apr 19, 2013 at 12:21 AM, Philipp A. wrote: >> no, this is different; it's not for booleans, but for assertions, and could >> be used for e.g. exhaustive switches, e.g. it would make the first test in >> the following unnecessary: >> >> if spam not in {"a", "b", "c"}: >> raise ValueError("spam {} should be a, b, or c!".format(spam)) >> if spam == "a": >> foo() >> elif spam == "b": >> bar() >> else: >> baz() > > Or alternatively, you could write it as: > > if spam == "genuine watch": > foo() > elif spam == "buy a college degree": > bar() > elif spam == "rich guy wants to move money offshore": > baz() > else: > raise ValueError("Unrecognized spam '%s'!" % spam) > > That removes the need to pre-check and match your if block. Or alternative: alternatives = { "genuine watch": foo, "buy a college degree": bar, "rich guy wants to move money offshore": baz, } try: alternative = alternatives[spam] except KeyError: raise ValueError("Unrecognized spam '%s'!"
% spam) alternative() From haoyi.sg at gmail.com Wed Apr 24 05:49:53 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Tue, 23 Apr 2013 23:49:53 -0400 Subject: [Python-ideas] Macros for Python Message-ID: I thought this may be of interest to some people on this list, even if not strictly an "idea". I'm working on MacroPy, a little pure-python library that allows user-defined AST rewrites as part of the import process (using PEP 302). In short, it makes mucking around with Python's semantics so easy as to be almost trivial: you write a function that takes an AST and returns an AST, register it as a macro, and you're off to the races. To give a sense of it, I just finished implementing Scala/Groovy style anonymous lambdas: map(f%(_ + 1), [1, 2, 3])#[2, 3, 4] reduce(f%(_ + _), [1, 2, 3])#6 ...which took about half an hour and 30 lines of code, start to finish. We're currently working on implementing destructuring-pattern-matching on objects (i.e. like in Haskell/Scala) and a clone of .NET's LINQ to SQL. It's still very much a work in progress, but we have a list of pretty cool macros already done, which shows off what you can do with it. If anyone else was thinking about messing around with the semantics of the Python language but was too scared to jump into the CPython internals, this offers a somewhat easier path. Hope this was interesting to somebody! -Haoyi -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Wed Apr 24 10:59:40 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Wed, 24 Apr 2013 11:59:40 +0300 Subject: [Python-ideas] Automatic context managers Message-ID: Long time no see, all. :P PySide Qt bindings have an interesting property - when you create widgets, you need to assign them to variables. When such variable is lost, object is immediately destroyed. I often use this one-shot code in setup.py: ... long_description = open('README.txt').read(), ....
Which probably leaves the README.txt file open until the setup.py exits. So, the idea is to close the file as soon as the variable is lost. I don't know why it was not implemented in the first place. Any ideas? Depending on the answer to the above, the solution can be different. I assume that this was done for a reason (probably immediate garbage collection is too expensive), but confirmation is welcome. Meanwhile the solution can be implemented with auto context manager, whose __exit__ method is automatically called when all links to created object are lost. Difference from usual "with something" is that it is transparent to the user (leaves fewer details to worry about) and makes code more beautiful -- in the example above the assignment is made inside setup(...) parameter assignment. An example with ordinary "with" statement would look like: with open('README.txt') as readme: setup( .... long_description=readme.read(), .... ) The nesting level increases with every new file you need to read. Another variation is intermediate variables, which is also less nice. with open('README.txt') as readme: content = readme.read() setup( .... long_description=content, .... ) -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ronaldoussoren at mac.com Wed Apr 24 11:23:09 2013 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Wed, 24 Apr 2013 11:23:09 +0200 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: Message-ID: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com> On 24 Apr, 2013, at 10:59, anatoly techtonik wrote: > Long time no see, all. :P > > PySide Qt binding have an interesting property - when you create widgets, you need to assign them to variables. When such variable is lost, object is immediately destroyed. > > I often use this one-shot code in setup.py: > ... > long_description = open('README.txt').read(), > .... > > Which probably leaves the README.txt file open until the setup.py exits.
So, the idea is to close the file as soon as the variable is lost. The file is automatically closed as soon as the file object is garbage collected. In your example CPython would currently collect at the end of the read call (unless there is an exception) because of the reference counting garbage collector, but other implementations have other garbage collectors and can collect the file object (much) later. > > I don't know why it was not implemented in the first place. Any ideas? It was implemented a long time ago. The with statement was added because relying on automatic resource cleanup by a destructor might clean up the resource too late (for example because the file object is referenced by a local variable in a frame that's referenced by an exception). > > Depending on the answer to the above, the solution can be different. I assume that this was done for a reason (probably immediate garbage collection is too expensive), but confirmation is welcome. Meanwhile the solution can be implemented with auto context manager, which __exit__ method is automatically called when all links to created object are lost. > > Difference from usual "with something" is that it is transparent to the user (leaves less details to worry about) and makes code more beautiful -- in the example above the assignment is made inside setup(...) parameter assignment. An example with ordinary "with" statement would look like: > > with open('README.txt') as readme: > setup( > .... > long_description=readme.read(), > .... > ) In python 3.3 and later you can use contextlib.ExitStack: with contextlib.ExitStack() as stack: setup( ... long_description = stack.enter_context(open('README.txt')).read(), ... ) But for simple scripts like a setup.py I wouldn't worry too much about closing files later than expected. Ronald > > The nesting level increases with every new file you need to read. Another variation is intermediate variables, which is also less nice.
> > with open('README.txt') as readme: > content = readme.read() > setup( > .... > long_description=content, > .... > ) > > -- > anatoly t. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From jsbueno at python.org.br Wed Apr 24 13:50:34 2013 From: jsbueno at python.org.br (Joao S. O. Bueno) Date: Wed, 24 Apr 2013 08:50:34 -0300 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: Message-ID: On 24 April 2013 05:59, anatoly techtonik wrote: > PySide Qt binding have an interesting property - when you create widgets, > you need to assign them to variables. When such variable is lost, object is > immediately destroyed. I truly hope it is not quite as you describe - otherwise PySide would be completely unusable. What if one adds created objects to a list, instead of assigning them to a variable? Otherwise, reference counting is usually enough in cPython to trigger object destruction - and if it was not working like this before, it was broken. (It is so in tkinter, for example: >>> import tkinter >>> t = tkinter.Tk() >>> del t >>> And the window it created is kept open - it indeed should be destroyed just as you put it. It can't be changed in tkinter now, or it would certainly break more than half the programs that use it. js -><- From ubershmekel at gmail.com Wed Apr 24 14:31:42 2013 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Wed, 24 Apr 2013 15:31:42 +0300 Subject: [Python-ideas] Macros for Python In-Reply-To: References: Message-ID: On Wed, Apr 24, 2013 at 6:49 AM, Haoyi Li wrote: > you write a function that takes an AST and returns an AST, register it as > a macro, and you're off to the races. > Insane and insanely brilliant. I'm not going to touch this until a lot of smoke blows over. > a clone of .NET's LINQ to SQL. > Sounds awesome. This is completely uncharted territory for me.
I'd love to hear how this pans out in a year or so. Seems like it's very powerful and can help a lot in shaping Python based Domain Specific Languages, but perhaps too powerful to the point where it'll end up a horrific and complicated bug magnet. Yuval Greenfield -------------- next part -------------- An HTML attachment was scrubbed... URL: From tjreedy at udel.edu Wed Apr 24 16:35:50 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Wed, 24 Apr 2013 10:35:50 -0400 Subject: [Python-ideas] Macros for Python In-Reply-To: References: Message-ID: On 4/23/2013 11:49 PM, Haoyi Li wrote: > I thought this may be of interest to some people on this list, even if > not strictly an "idea". > > I'm working on MacroPy , a little > pure-python library that allows user-defined AST rewrites as part of the > import process (using PEP 302). From the readme ''' String Interpolation a, b = 1, 2 c = s%"%{a} apple and %{b} bananas" print c #1 apple and 2 bananas ''' I am a little surprised that you would base a cutting edge extension on Py 2. Do you have it working with 3.3 also? '''Unlike the normal string interpolation in Python, MacroPy's string interpolation allows the programmer to specify the variables to be interpolated inline inside the string.''' Not true as I read that. a, b = 1, 2 print("{a} apple and {b} bananas".format(**locals())) print("%(a)s apple and %(b)s bananas" % locals()) #1 apple and 2 bananas #1 apple and 2 bananas I rather like the anon funcs with anon params. That only works when each param is only used once in the expression, but that restriction is the normal case. I am interested to see what you do with pattern matching. tjr From haoyi.sg at gmail.com Wed Apr 24 17:05:22 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Wed, 24 Apr 2013 11:05:22 -0400 Subject: [Python-ideas] Macros for Python In-Reply-To: References: Message-ID: >I am a little surprised that you would base a cutting edge extension on Py 2. Do you have it working with 3.3 also? 
It's not really a cutting edge extension yet, it's more a completely-crazy "you did WHAT?" proof of concept to explore the space of possibilities. 2.7 was what we had installed, so we just ran with it. Haven't done any testing at all on 3.4, but if the project turns out well (i.e. the functionality is actually usable, and people are interested) we could look at porting it. I don't think the core of the system will change much, but the individual macros may have to be re-written since the ASTs are slightly different. > a, b = 1, 2 print("{a} apple and {b} bananas".format(**locals())) print("%(a)s apple and %(b)s bananas" % locals()) Yes, you can do it like that. You can't do more complex stuff though, like "%{a ** b} is %{a} to the power of %{b}" Perhaps I should put it in the readme, since I already have a unit test for it. You actually can get a syntax like that without macros, using stack-introspection, locals-trickery and lots of `eval`. The question is whether you consider macros more "extreme" than stack-introspection, locals-trickery and `eval`! A JIT compiler will probably be much happier with macros. On Wed, Apr 24, 2013 at 10:35 AM, Terry Jan Reedy wrote: > On 4/23/2013 11:49 PM, Haoyi Li wrote: > >> I thought this may be of interest to some people on this list, even if >> not strictly an "idea". >> >> I'm working on MacroPy >, >> a little >> >> pure-python library that allows user-defined AST rewrites as part of the >> import process (using PEP 302). >> > > From the readme > ''' > String Interpolation > > a, b = 1, 2 > c = s%"%{a} apple and %{b} bananas" > print c > #1 apple and 2 bananas > ''' > I am a little surprised that you would base a cutting edge extension on Py > 2. Do you have it working with 3.3 also? > > '''Unlike the normal string interpolation in Python, MacroPy's string > interpolation allows the programmer to specify the variables to be > interpolated inline inside the string.''' > > Not true as I read that. 
> > a, b = 1, 2 > print("{a} apple and {b} bananas".format(**locals())) > print("%(a)s apple and %(b)s bananas" % locals()) > #1 apple and 2 bananas > #1 apple and 2 bananas > > I rather like the anon funcs with anon params. That only works when each > param is only used once in the expression, but that restriction is the > normal case. > > I am interested to see what you do with pattern matching. > > tjr > > ______________________________**_________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/**mailman/listinfo/python-ideas > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Wed Apr 24 17:55:52 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Wed, 24 Apr 2013 08:55:52 -0700 Subject: [Python-ideas] Macros for Python In-Reply-To: References: Message-ID: <8A011CC8-5B7E-4BFB-A3A1-6C939712AC47@yahoo.com> On Apr 24, 2013, at 8:05, Haoyi Li wrote: > You actually can get a syntax like that without macros, using stack-introspection, locals-trickery and lots of `eval`. The question is whether you consider macros more "extreme" than stack-introspection, locals-trickery and `eval`! A JIT compiler will probably be much happier with macros. That last point makes this approach seem particularly interesting to me, which makes me wonder: Is your code CPython specific, or does it also work with PyPy (or Jython or Iron)? While PyPy is obviously a whole lot easier to mess with in the first place than CPython, having macros at the same language level as your code is just as interesting in both implementations. > > On Wed, Apr 24, 2013 at 10:35 AM, Terry Jan Reedy wrote: >> On 4/23/2013 11:49 PM, Haoyi Li wrote: >>> I thought this may be of interest to some people on this list, even if >>> not strictly an "idea". >>> >>> I'm working on MacroPy , a little >>> >>> pure-python library that allows user-defined AST rewrites as part of the >>> import process (using PEP 302). 
>> >> From the readme >> ''' >> String Interpolation >> >> a, b = 1, 2 >> c = s%"%{a} apple and %{b} bananas" >> print c >> #1 apple and 2 bananas >> ''' >> I am a little surprised that you would base a cutting edge extension on >> Py 2. Do you have it working with 3.3 also? >> >> '''Unlike the normal string interpolation in Python, MacroPy's string >> interpolation allows the programmer to specify the variables to be >> interpolated inline inside the string.''' >> >> Not true as I read that. >> >> a, b = 1, 2 >> print("{a} apple and {b} bananas".format(**locals())) >> print("%(a)s apple and %(b)s bananas" % locals()) >> #1 apple and 2 bananas >> #1 apple and 2 bananas >> >> I rather like the anon funcs with anon params. That only works when each >> param is only used once in the expression, but that restriction is the >> normal case. >> >> I am interested to see what you do with pattern matching. >> >> tjr >> >> ______________________________**_________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/**mailman/listinfo/python-ideas >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From haoyi.sg at gmail.com Wed Apr 24 19:53:23 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Wed, 24 Apr 2013 13:53:23 -0400 Subject: [Python-ideas] Macros for Python In-Reply-To: <8A011CC8-5B7E-4BFB-A3A1-6C939712AC47@yahoo.com> References: <8A011CC8-5B7E-4BFB-A3A1-6C939712AC47@yahoo.com> Message-ID: I haven't tested it on various platforms, so hard to say for sure. MacroPy basically relies on a few things: - exec/eval - PEP 302 - the ast module All of these are pretty old pieces of python (almost 10 years old!) so it's not some new-and-fancy functionality. Jython seems to have all of them, I couldn't find any information about PyPy.
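The three ingredients listed above are enough to show the core mechanism outside of any import hook. The sketch below is a toy illustration (not MacroPy's actual code): parse source into an AST with the ast module, transform it with a NodeTransformer, recompile, and exec the result.

```python
import ast

class AddToMult(ast.NodeTransformer):
    """Toy macro: rewrite every `+` into `*` at the AST level."""
    def visit_BinOp(self, node):
        self.generic_visit(node)  # transform nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node

tree = ast.parse("result = 3 + 4")
tree = AddToMult().visit(tree)
ast.fix_missing_locations(tree)  # new nodes need position info for compile()

namespace = {}
exec(compile(tree, "<macro demo>", "exec"), namespace)
print(namespace["result"])  # 12: the source said 3 + 4, the rewritten AST says 3 * 4
```

MacroPy's contribution is wiring this transform step into the import process via a PEP 302 finder/loader, so whole modules are rewritten as they are imported.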
When the project is more mature and I have some time, I'll see if I can get it to work cross platform. If anyone wants to fork the repo and try it out, that'd be great too! -Haoyi On Wed, Apr 24, 2013 at 11:55 AM, Andrew Barnert wrote: > On Apr 24, 2013, at 8:05, Haoyi Li wrote: > > You actually can get a syntax like that without macros, using > stack-introspection, locals-trickery and lots of `eval`. The question is > whether you consider macros more "extreme" than stack-introspection, > locals-trickery and `eval`! A JIT compiler will probably be much happier > with macros. > > > That last point makes this approach seem particularly interesting to me, > which makes me wonder: Is your code CPython specific, or does it also work > with PyPy (or Jython or Iron)? While PyPy is obviously a whole lot easier > to mess with in the first place than CPython, having macros at the same > language level as your code is just as interesting in both implementations. > > > On Wed, Apr 24, 2013 at 10:35 AM, Terry Jan Reedy wrote: > >> On 4/23/2013 11:49 PM, Haoyi Li wrote: >> >>> I thought this may be of interest to some people on this list, even if >>> not strictly an "idea". >>> >>> I'm working on MacroPy >, >>> a little >>> >>> pure-python library that allows user-defined AST rewrites as part of the >>> import process (using PEP 302). >>> >> >> From the readme >> ''' >> String Interpolation >> >> a, b = 1, 2 >> c = s%"%{a} apple and %{b} bananas" >> print c >> #1 apple and 2 bananas >> ''' >> I am a little surprised that you would base a cutting edge extension on >> Py 2. Do you have it working with 3.3 also? >> >> '''Unlike the normal string interpolation in Python, MacroPy's string >> interpolation allows the programmer to specify the variables to be >> interpolated inline inside the string.''' >> >> Not true as I read that. 
>> >> a, b = 1, 2 >> print("{a} apple and {b} bananas".format(**locals())) >> print("%(a)s apple and %(b)s bananas" % locals()) >> #1 apple and 2 bananas >> #1 apple and 2 bananas >> >> I rather like the anon funcs with anon params. That only works when each >> param is only used once in the expression, but that restriction is the >> normal case. >> >> I am interested to see what you do with pattern matching. >> >> tjr >> >> ______________________________**_________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/**mailman/listinfo/python-ideas >> > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonathan at slenders.be Wed Apr 24 23:48:07 2013 From: jonathan at slenders.be (Jonathan Slenders) Date: Wed, 24 Apr 2013 23:48:07 +0200 Subject: [Python-ideas] Macros for Python In-Reply-To: References: <8A011CC8-5B7E-4BFB-A3A1-6C939712AC47@yahoo.com> Message-ID: One use case I have is for Twisted's inlineCallbacks. I forked the pypy project to implement the await-keyword. Basically it transforms: def async_function(deferred_param): a = await deferred_param b = await some_call(a) return b into: @defer.inlineCallbacks def async_function(deferred_param): a = yield deferred_param b = yield some_call(a) yield defer.returnValue(b) Are such things possible? And if so, what lines of code would pdb show during introspection of the code? It's interesting, but when macros become more complicated, the debugging of these things can turn out to be really hard, I think. 2013/4/24 Haoyi Li : > I haven't tested in on various platforms, so hard to say for sure. MacroPy > basically relies on a few things: > > - exec/eval > - PEP 302 > - the ast module > > All of these are pretty old pieces of python (almost 10 years old!) 
so it's > not some new-and-fancy functionality. Jython seems to have all of them, I > couldn't find any information about PyPy. > > When the project is more mature and I have some time, I'll see if I can get > it to work cross platform. If anyone wants to fork the repo and try it out, > that'd be great too! > > -Haoyi > > > > > > On Wed, Apr 24, 2013 at 11:55 AM, Andrew Barnert wrote: >> >> On Apr 24, 2013, at 8:05, Haoyi Li wrote: >> >> You actually can get a syntax like that without macros, using >> stack-introspection, locals-trickery and lots of `eval`. The question is >> whether you consider macros more "extreme" than stack-introspection, >> locals-trickery and `eval`! A JIT compiler will probably be much happier >> with macros. >> >> >> That last point makes this approach seem particularly interesting to me, >> which makes me wonder: Is your code CPython specific, or does it also work >> with PyPy (or Jython or Iron)? While PyPy is obviously a whole lot easier to >> mess with in the first place than CPython, having macros at the same >> language level as your code is just as interesting in both implementations. >> >> >> On Wed, Apr 24, 2013 at 10:35 AM, Terry Jan Reedy >> wrote: >>> >>> On 4/23/2013 11:49 PM, Haoyi Li wrote: >>>> >>>> I thought this may be of interest to some people on this list, even if >>>> not strictly an "idea". >>>> >>>> I'm working on MacroPy , a little >>>> >>>> pure-python library that allows user-defined AST rewrites as part of the >>>> import process (using PEP 302). >>> >>> >>> From the readme >>> ''' >>> String Interpolation >>> >>> a, b = 1, 2 >>> c = s%"%{a} apple and %{b} bananas" >>> print c >>> #1 apple and 2 bananas >>> ''' >>> I am a little surprised that you would base a cutting edge extension on >>> Py 2. Do you have it working with 3.3 also? 
>>> >>> '''Unlike the normal string interpolation in Python, MacroPy's string >>> interpolation allows the programmer to specify the variables to be >>> interpolated inline inside the string.''' >>> >>> Not true as I read that. >>> >>> a, b = 1, 2 >>> print("{a} apple and {b} bananas".format(**locals())) >>> print("%(a)s apple and %(b)s bananas" % locals()) >>> #1 apple and 2 bananas >>> #1 apple and 2 bananas >>> >>> I rather like the anon funcs with anon params. That only works when each >>> param is only used once in the expression, but that restriction is the >>> normal case. >>> >>> I am interested to see what you do with pattern matching. >>> >>> tjr >>> >>> _______________________________________________ >>> Python-ideas mailing list >>> Python-ideas at python.org >>> http://mail.python.org/mailman/listinfo/python-ideas >> >> >> _______________________________________________ >> Python-ideas mailing list >> Python-ideas at python.org >> http://mail.python.org/mailman/listinfo/python-ideas > > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From haoyi.sg at gmail.com Thu Apr 25 00:15:36 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Wed, 24 Apr 2013 18:15:36 -0400 Subject: [Python-ideas] Macros for Python In-Reply-To: References: <8A011CC8-5B7E-4BFB-A3A1-6C939712AC47@yahoo.com> Message-ID: @Jonathan: That would be possible, although I can't say I know how to do it. A naive macro that wraps everything and has a "substitute awaits for yields, wrap them in inlineCallbacks(), and substitute returns for returnValue()s" may work, but I'm guessing it would run into a forest of edge cases where the code isn't so simple (what if you *want* a return? etc.). pdb *should* show the code after macro expansion. Without source maps, I'm not sure there's any way around that, so debugging may be hard. 
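A minimal sketch of that naive substitution, with a hypothetical `await_()` helper standing in for the keyword (since `await` was not valid syntax at the time), and with the `inlineCallbacks` wrapping and `returnValue` rewriting left out:

```python
import ast

class AwaitToYield(ast.NodeTransformer):
    """Rewrite calls to a hypothetical await_() helper into bare yield
    expressions, so the function becomes a generator of the kind
    @defer.inlineCallbacks could drive."""
    def visit_Call(self, node):
        self.generic_visit(node)
        if isinstance(node.func, ast.Name) and node.func.id == "await_":
            return ast.Yield(value=node.args[0])
        return node

src = """
def async_function(deferred_param):
    a = await_(deferred_param)
    b = await_(a + 1)
"""
tree = AwaitToYield().visit(ast.parse(src))
ast.fix_missing_locations(tree)
ns = {}
exec(compile(tree, "<macro>", "exec"), ns)

# Drive the generator by hand, standing in for Twisted's trampoline:
gen = ns["async_function"](10)
first = next(gen)         # first await_: yields deferred_param (10)
second = gen.send(first)  # a = 10, so the second yield is a + 1 == 11
```

Even this toy version shows where the edge cases come from: every `return`, `try`, and nested function in the original body would need its own rewrite rule.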
Of course, if the alternative to macros is forking the interpreter, maybe macros are the easier way to do it =) Debugging a buggy custom-forked interpreter probably isn't easy either! On Wed, Apr 24, 2013 at 5:48 PM, Jonathan Slenders wrote: > One use case I have is for Twisted's inlineCallbacks. I forked the > pypy project to implement the await-keyword. Basically it transforms: > > def async_function(deferred_param): > a = await deferred_param > b = await some_call(a) > return b > > into: > > @defer.inlineCallbacks > def async_function(deferred_param): > a = yield deferred_param > b = yield some_call(a) > yield defer.returnValue(b) > > > Are such things possible? And if so, what lines of code would pdb show > during introspection of the code? > > It's interesting, but when macros become more complicated, the > debugging of these things can turn out to be really hard, I think. > > > 2013/4/24 Haoyi Li : > > I haven't tested in on various platforms, so hard to say for sure. > MacroPy > > basically relies on a few things: > > > > - exec/eval > > - PEP 302 > > - the ast module > > > > All of these are pretty old pieces of python (almost 10 years old!) so > it's > > not some new-and-fancy functionality. Jython seems to have all of them, I > > couldn't find any information about PyPy. > > > > When the project is more mature and I have some time, I'll see if I can > get > > it to work cross platform. If anyone wants to fork the repo and try it > out, > > that'd be great too! > > > > -Haoyi > > > > > > > > > > > > On Wed, Apr 24, 2013 at 11:55 AM, Andrew Barnert > wrote: > >> > >> On Apr 24, 2013, at 8:05, Haoyi Li wrote: > >> > >> You actually can get a syntax like that without macros, using > >> stack-introspection, locals-trickery and lots of `eval`. The question is > >> whether you consider macros more "extreme" than stack-introspection, > >> locals-trickery and `eval`! A JIT compiler will probably be much happier > >> with macros.
> >> > >> > >> That last point makes this approach seem particularly interesting to me, > >> which makes me wonder: Is your code CPython specific, or does it also > work > >> with PyPy (or Jython or Iron)? While PyPy is obviously a whole lot > easier to > >> mess with in the first place than CPython, having macros at the same > >> language level as your code is just as interesting in both > implementations. > >> > >> > >> On Wed, Apr 24, 2013 at 10:35 AM, Terry Jan Reedy > >> wrote: > >>> > >>> On 4/23/2013 11:49 PM, Haoyi Li wrote: > >>>> > >>>> I thought this may be of interest to some people on this list, even if > >>>> not strictly an "idea". > >>>> > >>>> I'm working on MacroPy , a little > >>>> > >>>> pure-python library that allows user-defined AST rewrites as part of > the > >>>> import process (using PEP 302). > >>> > >>> > >>> From the readme > >>> ''' > >>> String Interpolation > >>> > >>> a, b = 1, 2 > >>> c = s%"%{a} apple and %{b} bananas" > >>> print c > >>> #1 apple and 2 bananas > >>> ''' > >>> I am a little surprised that you would base a cutting edge extension on > >>> Py 2. Do you have it working with 3.3 also? > >>> > >>> '''Unlike the normal string interpolation in Python, MacroPy's string > >>> interpolation allows the programmer to specify the variables to be > >>> interpolated inline inside the string.''' > >>> > >>> Not true as I read that. > >>> > >>> a, b = 1, 2 > >>> print("{a} apple and {b} bananas".format(**locals())) > >>> print("%(a)s apple and %(b)s bananas" % locals()) > >>> #1 apple and 2 bananas > >>> #1 apple and 2 bananas > >>> > >>> I rather like the anon funcs with anon params. That only works when > each > >>> param is only used once in the expression, but that restriction is the > >>> normal case. > >>> > >>> I am interested to see what you do with pattern matching. 
> >>> > >>> tjr > >>> > >>> _______________________________________________ > >>> Python-ideas mailing list > >>> Python-ideas at python.org > >>> http://mail.python.org/mailman/listinfo/python-ideas > >> > >> > >> _______________________________________________ > >> Python-ideas mailing list > >> Python-ideas at python.org > >> http://mail.python.org/mailman/listinfo/python-ideas > > > > > > > > _______________________________________________ > > Python-ideas mailing list > > Python-ideas at python.org > > http://mail.python.org/mailman/listinfo/python-ideas > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Thu Apr 25 05:49:50 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 25 Apr 2013 06:49:50 +0300 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: Message-ID: On Wed, Apr 24, 2013 at 2:50 PM, Joao S. O. Bueno wrote: > On 24 April 2013 05:59, anatoly techtonik wrote: > > PySide Qt binding have an interesting property - when you create widgets, > > you need to assign them to variables. When such variable is lost, object > is > > immediately destroyed. > > I truly hope it is not quite as you describe - otherwise PySide would > be completely unusable. > What if one adds created objects to a list, instead of assigning them > to a variable? > As long as the list references the widget, the widget won't be destroyed. "assign to variable" is not a correct term - I guess "reference keeping" is better. > Otherwise, reference counting is usually enough in CPython to trigger > object destruction - > It is not guaranteed, so you can have a delayed shot in the foot, which is many times worse than an immediate one. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From techtonik at gmail.com Thu Apr 25 06:00:59 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 25 Apr 2013 07:00:59 +0300 Subject: [Python-ideas] Automatic context managers In-Reply-To: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com> References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com> Message-ID: On Wed, Apr 24, 2013 at 12:23 PM, Ronald Oussoren wrote: > On 24 Apr, 2013, at 10:59, anatoly techtonik wrote: > > > Long time no see, all. :P > > > > PySide Qt binding have an interesting property - when you create > widgets, you need to assign them to variables. When such variable is lost, > object is immediately destroyed. > > > > I often use this one-shot code in setup.py: > > ... > > long_description = open('README.txt').read(), > > .... > > > > Which probably leaves the README.txt file open until the setup.py exits. > So, the idea is to close the file as soon as the variable is lost. > > The file is automatically closed as soon as the file object is garbage > collected. In your example CPython would currently collect at the end of > the read call (unless there is an exception) because of the reference > counting garbage collector, but other implementations have other garbage > collectors and can collect the file object (much) later. Right. The automatic context manager proposal brings this mechanism from the garbage collection implementation level to the language definition level. > > > > I don't know why it was not implemented in the first place. Any ideas? > > It was implemented a long time ago. The with statement was added because > relying on automatic resource cleanup by a destructor might clean up the > resource too late (for example because the file object is referenced by a > local variable in a frame that's referenced by an exception). You're speaking about immediate garbage collection of the file object in CPython, which narrows the question. I am still asking about the whole concept.
Automatic context managers are useful for other cases and other Python implementations. > > > > Depending on the answer to the above, the solution can be different. I > assume that this was done for a reason (probably immediate garbage > collection is too expensive), but confirmation is welcome. Meanwhile the > solution can be implemented with an auto context manager, whose __exit__ > method is automatically called when all links to the created object are lost. > > > > Difference from usual "with something" is that it is transparent to the > user (leaves fewer details to worry about) and makes code more beautiful -- > in the example above the assignment is made inside setup(...) parameter > assignment. An example with ordinary "with" statement would look like: > > > > with open('README.txt') as readme: > > setup( > > .... > > long_description=readme.read(), > > .... > > ) > > In Python 3.3 and later you can use contextlib.ExitStack: > > with contextlib.ExitStack() as stack: > setup( > ... > long_description = > stack.enter_context(open('README.txt')).read(), > ... > ) > It is no better than an additional indented with statement - both require a change to the whole parent block. > But for simple scripts like a setup.py I wouldn't worry too much about > closing files later than expected. > It is just a real-world user story to back up the rationale. -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Thu Apr 25 06:47:00 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Thu, 25 Apr 2013 07:47:00 +0300 Subject: [Python-ideas] Automatic context managers In-Reply-To: <20130425042335.GA60739@cskk.homeip.net> References: <20130425042335.GA60739@cskk.homeip.net> Message-ID: On Thu, Apr 25, 2013 at 7:23 AM, Cameron Simpson wrote: > On 25Apr2013 06:49, anatoly techtonik wrote: | On Wed, Apr 24, 2013 at 2:50 PM, Joao S. O.
Bueno >wrote: > | > On 24 April 2013 05:59, anatoly techtonik wrote: > | > > PySide Qt binding have an interesting property - when you create > widgets, > | > > you need to assign them to variables. When such variable is lost, > object > | > is > | > > immediately destroyed. > | > > | > I truly hope it is not quite as you describe - otherwisew pySie would > | > be completly unusable. > | > What if one adds created objects to a list, instead of assigning them > | > to a variable? > | > | As long as the list references the widget, the widget won't be destroyed. > | "assign to variable" is not a correct term - I guess "reference keeping" > is > | better. > > Then aren't you just talking about the __del__ method? > No. The __del__ method is only called during garbage collection phase which may be delayed. In PySide the QObject is deleted immediately. -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From cs at zip.com.au Thu Apr 25 06:23:35 2013 From: cs at zip.com.au (Cameron Simpson) Date: Thu, 25 Apr 2013 14:23:35 +1000 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: Message-ID: <20130425042335.GA60739@cskk.homeip.net> On 25Apr2013 06:49, anatoly techtonik wrote: | On Wed, Apr 24, 2013 at 2:50 PM, Joao S. O. Bueno wrote: | > On 24 April 2013 05:59, anatoly techtonik wrote: | > > PySide Qt binding have an interesting property - when you create widgets, | > > you need to assign them to variables. When such variable is lost, object | > is | > > immediately destroyed. | > | > I truly hope it is not quite as you describe - otherwisew pySie would | > be completly unusable. | > What if one adds created objects to a list, instead of assigning them | > to a variable? | | As long as the list references the widget, the widget won't be destroyed. | "assign to variable" is not a correct term - I guess "reference keeping" is | better. Then aren't you just talking about the __del__ method? 
-- Cameron Simpson Ninety percent of everything is crud. - Theodore Sturgeon From ronaldoussoren at mac.com Thu Apr 25 07:46:26 2013 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Thu, 25 Apr 2013 07:46:26 +0200 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com> Message-ID: On 25 Apr, 2013, at 6:00, anatoly techtonik wrote: > On Wed, Apr 24, 2013 at 12:23 PM, Ronald Oussoren wrote: > > > > > On 24 Apr, 2013, at 10:59, anatoly techtonik wrote: > > > Long time no see, all. :P > > > > PySide Qt binding have an interesting property - when you create widgets, you need to assign them to variables. When such variable is lost, object is immediately destroyed. > > > > I often use this one-shot code in setup.py: > > ... > > long_description = open('README.txt').read(), > > .... > > > > Which probably leaves the README.txt file open until the setup.py exits. So, the idea is to close the file as soon as the variable is lost. > > The file is automaticly closed as soon as the file object is garbage collected. In your example CPython would currently collect at the end of the read call (unles there is an exception) because of the reference counting garbage collector, but other implementations have other garbage collectors and can collect the file object (much) later. > > Right. Automatic context manager proposal brings this mechanism from garbage collection implementation level to language definition level. What proposal? What you appear to propose is either that implementations must use a reference counting collector (more or less ensuring that the file will be closed after the call to read in your example), or that the exit part of the context protocol is run whenever an object is going out of scope. 
Neither is going to happen; the language specification doesn't prescribe the garbage collection algorithm for a reason: implementations of Python like Jython and IronPython inherit the garbage collector from their host environment (the JVM and CLR). Automatically calling __exit__ when an object goes out of scope won't work either; it would break passing arguments to functions. Ronald From steve at pearwood.info Thu Apr 25 08:17:10 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Thu, 25 Apr 2013 16:17:10 +1000 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <20130425042335.GA60739@cskk.homeip.net> Message-ID: <5178CA66.3060205@pearwood.info> On 25/04/13 14:47, anatoly techtonik wrote: > On Thu, Apr 25, 2013 at 7:23 AM, Cameron Simpson wrote: >> Then aren't you just talking about the __del__ method? >> > > No. The __del__ method is only called during garbage collection phase which > may be delayed. In PySide the QObject is deleted immediately. Citation please. Where is this documented? I find your claim difficult to believe, unless PySide implements its own garbage collector which runs side by side with the Python one and knows about PySide objects. Otherwise when objects are deleted depends on Python, not the framework. Objects in Python do not know when they are deleted unless Python calls their __del__ method. My guess is that if you set up a circular reference between two PySide objects, they will suffer the exact same delay in garbage collection as any other two objects. This thread on the PySide mailing list suggests that you are mistaken: PySide does not have superpowers over and above Python's garbage collector, and is subject to the exact same non-deterministic destructors as any other Python object. Whether you call that destructor __del__ or __exit__ makes no difference.
http://www.mail-archive.com/pyside at lists.openbossa.org/msg01029.html Oh, and for the record, the reason that with statements work so well is because they are guaranteed to be deterministic. You cannot leave the with block without the __exit__ method being called. It doesn't matter whether you have one reference to the context manager object or ten references, the __exit__ method is still called, and the object still exists. That is *very* different from a destructor method. py> with open('/tmp/rubbish', 'w') as f: ... f.write('hello world') ... 11 py> f.write('goodbye') Traceback (most recent call last): File "<stdin>", line 1, in <module> ValueError: I/O operation on closed file. py> f <_io.TextIOWrapper name='/tmp/rubbish' mode='w' encoding='UTF-8'> On the other hand, objects being freed is not deterministic. They'll be freed when there are no longer any references to them, which may be never. Reference counting GCs are deterministic, but cannot deal with circular references. Other GCs can deal with circular references, but are non-deterministic. Even the Java GC doesn't guarantee that the finalize() method will always be called. -- Steven From cs at zip.com.au Thu Apr 25 10:16:40 2013 From: cs at zip.com.au (Cameron Simpson) Date: Thu, 25 Apr 2013 18:16:40 +1000 Subject: [Python-ideas] Automatic context managers In-Reply-To: <5178CA66.3060205@pearwood.info> References: <5178CA66.3060205@pearwood.info> Message-ID: <20130425081640.GA72882@cskk.homeip.net> On 25/04/13 14:47, anatoly techtonik wrote: | >On Thu, Apr 25, 2013 at 7:23 AM, Cameron Simpson wrote: | | >>Then aren't you just talking about the __del__ method? | > | >No. The __del__ method is only called during garbage collection phase which | >may be delayed. In PySide the QObject is deleted immediately. The CPython doco says when the reference count goes to zero. (Snippet below.) So in your example, also immediately.
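That promptness is easy to see on CPython, and only on CPython — it is reference-counting behaviour, not a language guarantee:

```python
events = []

class Tracked:
    # __del__ fires the moment the last reference disappears --
    # under CPython's reference-counting collector, that is.
    def __del__(self):
        events.append("deleted")

t = Tracked()
assert events == []   # still referenced, nothing collected yet
del t                 # refcount drops to zero; CPython calls __del__ here
assert events == ["deleted"]
```

On Jython, IronPython, or PyPy the second assertion may fail, because the collector is free to run the finalizer later.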
Garbage collection can find detached objects whose count hasn't gone to zero, but I think your example would fit the refs-gone-to-zero case, and anyway I think that was your criterion for having this work in the first place (automatically). So I'm not sure how __del__ is particularly less predictable than your "implicit" scenario. Then at 25Apr2013 16:17, Steven D'Aprano wrote: | Citation please. Where is this documented? Well, the 3.2.3 doco on __del__ says: Called when the instance is about to be destroyed. [...] It is not guaranteed that __del__() methods are called for objects that still exist when the interpreter exits. [...] del x doesn't directly call x.__del__() -- the former decrements the reference count for x by one, and the latter is only called when x's reference count reaches zero. [ <-- Obviously only valid for ref counting Python implementations. - Cameron ] Some common situations that may prevent the reference count of an object from going to zero include: [...] The 2.7 doco is very similar. So, yes, when references go to zero. But as you say later, that may never happen. The GC _may_ find isolated circles and delete them, thus at GC time in anatoly's nomenclature. Reference counting makes __del__ fairly predictable if you have tight control over the references to an object. Not always the case of course. And other Pythons don't necessarily do reference counting unless I misremember. | Oh, and for the record, the reason that with statements work so | well is because they are guaranteed to be deterministic. You cannot | leave the with block without the __exit__ method being called. It | doesn't matter whether you have one reference to the context manager | object or ten references, the __exit__ method is still called, and | the object still exists. This is why I'm for with statements also. | That is *very* different from a destructor method. [...
snip other stuff I agree with...; in fact I agree with everything you say, but anatoly is not totally off the mark with "GC time". ] Yep. -- Cameron Simpson I am now convinced that theoretical physics is actual philosophy. - Max Born From steve at pearwood.info Thu Apr 25 20:20:36 2013 From: steve at pearwood.info (Steven D'Aprano) Date: Fri, 26 Apr 2013 04:20:36 +1000 Subject: [Python-ideas] Automatic context managers In-Reply-To: <20130425081640.GA72882@cskk.homeip.net> References: <5178CA66.3060205@pearwood.info> <20130425081640.GA72882@cskk.homeip.net> Message-ID: <517973F4.1040302@pearwood.info> On 25/04/13 18:16, Cameron Simpson wrote: > Then at 25Apr2013 16:17, Steven D'Aprano wrote: > | Citation please. Where is this documented? > > Well, the 3.2.3 doco on __del__ says: Please be more careful about quoting me out of context. I'm perfectly aware of what the docs for __del__ say. I was addressing my question to Anatoly, asking him for documentation for his claims that PySide objects are more deterministic than __del__, that is, that they don't suffer from the same issues regarding circular references and __del__ as other, non-PySide objects. I won't categorically say that's impossible, but I find it an extraordinary claim that requires more evidence than just one person's say-so. [...] > Reference counting makes __del__ fairly predictable if you have > tight control over the references to an object. Not always the case > of course. And other Pythons don't necessarily do reference counting > unless I misremember. You remember correctly. IronPython and Jython use the .Net and Java virtual machines, including their garbage collectors, neither of which are reference counting. (In fact, there's often a fair bit of snobbery in Java circles about CPython's ref counting not being a "real" GC.) PyPy can use various GCs, selected at build-time (I think), including a ref counting one. I don't know about other implementations.
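The circular-reference caveat can be shown directly on CPython: reference counting alone never reclaims the pair below, and only the cycle collector does:

```python
import gc
import weakref

class Node:
    pass

gc.disable()                 # keep the cycle collector out of the way
a, b = Node(), Node()
a.partner, b.partner = b, a  # a reference cycle
probe = weakref.ref(a)
del a, b                     # refcounts never reach zero...
assert probe() is not None   # ...so the objects are still alive
gc.collect()                 # until the cycle detector runs
assert probe() is None
gc.enable()
```

This is exactly the "delayed" case: cleanup happens at GC time, not at the point the last name disappears.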
-- Steven From techtonik at gmail.com Fri Apr 26 15:02:28 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 26 Apr 2013 16:02:28 +0300 Subject: [Python-ideas] Automatic context managers In-Reply-To: <5178CA66.3060205@pearwood.info> References: <20130425042335.GA60739@cskk.homeip.net> <5178CA66.3060205@pearwood.info> Message-ID: On Thu, Apr 25, 2013 at 9:17 AM, Steven D'Aprano wrote: > On 25/04/13 14:47, anatoly techtonik wrote: >> On Thu, Apr 25, 2013 at 7:23 AM, Cameron Simpson wrote: >> > > Then aren't you just talking about the __del__ method? >>> >>> >> No. The __del__ method is only called during garbage collection phase >> which >> may be delayed. In PySide the QObject is deleted immediately. >> > > Citation please. Where is this documented? > Here: http://qt-project.org/wiki/PySide_Pitfalls """ If a QObject falls out of scope in Python, it will get deleted. You have to take care of keeping a reference to the object: * Store it as an attribute of an object you keep around, e.g. self.window = QMainWindow() * Pass a parent QObject to the object's constructor, so it gets owned by the parent """ This thread on the PySide mailing list suggests that you are mistaken, > PySide does not have superpowers over and above Python's garbage collector, > and is subject to the exact same non-deterministic destructors as any other > Python object. Whether you call that destructor __del__ or __exit__ makes > no difference. > > http://www.mail-archive.com/pyside at lists.openbossa.org/msg01029.html I am not an expert in internals. I guess the QObject is on the C side - not on the Python side - so it is destroyed immediately. And perhaps when you wrap (subclass) it on the Python side, it will start to suffer from delayed garbage collection. Is that plausible? > Oh, and for the record, the reason that with statements work so well is > because they are guaranteed to be deterministic. You cannot leave the with > block without the __exit__ method being called.
It doesn't matter whether > you have one reference to the context manager object or ten references, the > __exit__ method is still called, and the object still exists. That is > *very* different from a destructor method. > I am not touching destructor methods. The idea is to make the with statement transparent - embed it inside the objects that require it. I am not sure what the implementation should be. Probably an object should have the ability to enable context scope tracking in its constructor, to tell Python to call its __exit__ method at the moment when its reference count reaches zero, and before it is garbage collected. > On the other hand, objects being freed is not deterministic. They'll be > freed when there are no longer any references to them, which may be never. > > Reference counting GCs are deterministic, but cannot deal with circular > references. Other GCs can deal with circular references, but are > non-deterministic. Even the Java GC doesn't guarantee that the finalize() > method will always be called. This circular reference problem is interesting. In object space it probably looks like a star detached from the visible (attached) universe. Is the main problem in detecting it? -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Fri Apr 26 15:22:48 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 26 Apr 2013 16:22:48 +0300 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com> Message-ID: On Thu, Apr 25, 2013 at 8:46 AM, Ronald Oussoren wrote: > > On 25 Apr, 2013, at 6:00, anatoly techtonik wrote: > > > On Wed, Apr 24, 2013 at 12:23 PM, Ronald Oussoren < > ronaldoussoren at mac.com> wrote: > > > > > > > > > > On 24 Apr, 2013, at 10:59, anatoly techtonik > wrote: > > > > > Long time no see, all. :P > > > > > > PySide Qt binding have an interesting property - when you create > widgets, you need to assign them to variables.
When such variable is lost, > object is immediately destroyed. > > > > > > I often use this one-shot code in setup.py: > > > ... > > > long_description = open('README.txt').read(), > > > .... > > > > > > Which probably leaves the README.txt file open until the setup.py > exits. So, the idea is to close the file as soon as the variable is lost. > > > > The file is automatically closed as soon as the file object is garbage > collected. In your example CPython would currently collect at the end of > the read call (unless there is an exception) because of the reference > counting garbage collector, but other implementations have other garbage > collectors and can collect the file object (much) later. > > > > Right. The automatic context manager proposal brings this mechanism from > the garbage collection implementation level to the language definition level. > > What proposal? What you appear to propose is either that implementations > must use a reference counting collector (more or less ensuring that the > file will be closed after the call to read in your example), or that the > exit part of the context protocol is run whenever an object is going out of > scope. > The proposal is fully illustrated by the user story above - immediately close the file after its read operation is complete. The proposed implementation is an automatic context manager -- an optional, Python-level mechanism to run the exit part of the context protocol when an object loses all references. GC is out of scope here. > Automatically calling __exit__ when an object goes out of scope won't work > either, it would break passing arguments to functions. > Why? Can you provide an example? -------------- next part -------------- An HTML attachment was scrubbed...
URL: From ned at nedbatchelder.com Fri Apr 26 15:33:38 2013 From: ned at nedbatchelder.com (Ned Batchelder) Date: Fri, 26 Apr 2013 09:33:38 -0400 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com> Message-ID: <517A8232.6060604@nedbatchelder.com> On 4/26/2013 9:22 AM, anatoly techtonik wrote: > The proposal is fully illustrated by the user story above - > immediately close file after it is read operation is complete. The > proposed implementation is automatic context manager -- optional, > Python level mechanism to run exit part of the context protocol when > object loses all references. GC is out of scope here. The behavior is already that files are closed as part of reclaiming the file object. I don't see how your proposal could change that behavior, since your proposal (I think) is to call __exit__ when the object is reclaimed, but all that does for a file is close the file. Since your proposal doesn't seem to change existing behavior, I can only conclude that either you misunderstand the existing behavior, or I misunderstand your proposal. > Automaticly calling __exit__ when an object goes out of scope > won't work either, it would break passing arguments to functions. > > > Why? Can you provide an example? > Part of the confusion here is the phrase "when an object goes out of scope". Values in Python have no scope. Names have scope. This proposal doesn't involve names, it involves values, and so can have nothing to do with scope, unless I've misunderstood something. --Ned. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From clay.sweetser at gmail.com Fri Apr 26 15:39:25 2013 From: clay.sweetser at gmail.com (Clay Sweetser) Date: Fri, 26 Apr 2013 09:39:25 -0400 Subject: [Python-ideas] Cross Platform Python Sound Module/Library In-Reply-To: References: Message-ID: I've noticed that the Python standard library lacks a cross-platform sound/audio module, instead having separate modules for Linux and Windows. Is there any reason a standard cross-platform library has not been created yet? Clay Sweetser -------------- next part -------------- An HTML attachment was scrubbed... URL: From ronaldoussoren at mac.com Fri Apr 26 15:45:58 2013 From: ronaldoussoren at mac.com (Ronald Oussoren) Date: Fri, 26 Apr 2013 15:45:58 +0200 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com> Message-ID: <5712AC18-169A-4EA3-B201-E76C75A8F296@mac.com> On 26 Apr, 2013, at 15:22, anatoly techtonik wrote: > On Thu, Apr 25, 2013 at 8:46 AM, Ronald Oussoren wrote: > > On 25 Apr, 2013, at 6:00, anatoly techtonik wrote: > > > On Wed, Apr 24, 2013 at 12:23 PM, Ronald Oussoren wrote: > > > > > > > > > > On 24 Apr, 2013, at 10:59, anatoly techtonik wrote: > > > > > Long time no see, all. :P > > > > > > PySide Qt binding have an interesting property - when you create widgets, you need to assign them to variables. When such variable is lost, object is immediately destroyed. > > > > > > I often use this one-shot code in setup.py: > > > ... > > > long_description = open('README.txt').read(), > > > .... > > > > > > Which probably leaves the README.txt file open until the setup.py exits. So, the idea is to close the file as soon as the variable is lost. > > > > The file is automatically closed as soon as the file object is garbage collected.
In your example CPython would currently collect at the end of the read call (unless there is an exception) because of the reference counting garbage collector, but other implementations have other garbage collectors and can collect the file object (much) later. > > > > Right. The automatic context manager proposal brings this mechanism from > the garbage collection implementation level to the language definition level. > > What proposal? What you appear to propose is either that implementations must use a reference counting collector (more or less ensuring that the file will be closed after the call to read in your example), or that the exit part of the context protocol is run whenever an object is going out of scope. > > The proposal is fully illustrated by the user story above Right, there is no proposal, only vague handwaving. I haven't seen anything yet that wouldn't require the use of refcounting (the file is closed as soon as the last reference to the file object goes away), or some serious magic (when you want the file object to be closed even when read raises an exception). When you want to propose something you need to do some work yourself. That doesn't mean you have to provide a patch, but you do need to specify your proposal in enough detail to understand it without trying to second guess you. The batteries of my crystal ball ran out, Ronald From solipsis at pitrou.net Fri Apr 26 15:55:59 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Fri, 26 Apr 2013 15:55:59 +0200 Subject: [Python-ideas] Cross Platform Python Sound Module/Library References: Message-ID: <20130426155559.7307e759@pitrou.net> Hello, Le Fri, 26 Apr 2013 09:39:25 -0400, Clay Sweetser a écrit : > I've noticed that the Python standard library lacks a cross-platform > sound/audio module, instead having separate modules for Linux and > Windows. Is there any reason a standard cross-platform library has > not been created yet?
Because writing a rich cross-platform abstraction for audio isn't easy, and it isn't really in our core competences. It's a better idea to use bindings for existing libraries such as Portaudio, Phonon or SDL. Regards Antoine. From phd at phdru.name Fri Apr 26 15:56:23 2013 From: phd at phdru.name (Oleg Broytman) Date: Fri, 26 Apr 2013 17:56:23 +0400 Subject: [Python-ideas] Cross Platform Python Sound Module/Library In-Reply-To: References: Message-ID: <20130426135623.GA11244@iskra.aviel.ru> On Fri, Apr 26, 2013 at 09:39:25AM -0400, Clay Sweetser wrote: > I've noticed that the python standard library lacks a cross-platform > sound/audio module, instead having seperate modules for linux and windows. > Is there any reason a standard cross platform library has not been created > yet? Perhaps because Python's cross-platform code is often a reimplementation of standard protocols (libraries such as http.py or smtp.py) and Python's libraries are usually thin wrappers over operating system libraries. Some OS libraries are cross-platform (e.g., libz) but with audio you've touched a darker area. Are there cross-platform audio libraries that Python could wrap? Also it is because nobody has proposed a patch. Wanna be the champion? Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN. From techtonik at gmail.com Fri Apr 26 16:36:06 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 26 Apr 2013 17:36:06 +0300 Subject: [Python-ideas] Cross Platform Python Sound Module/Library In-Reply-To: References: Message-ID: Because there are many user stories and it is hard to get API that suits them all. Audio is not a sound - it is a stream, and the stream is continuous. You need to be able to detect latency and control the buffer size to compensate that. There is also a problem of mixing multiple streams - your system probably has a limitation for it, and multiple ways to do this. 
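The latency and mixing constraints named above can be sketched in a few lines of plain Python. This is an editor's illustration only -- the function names and figures are invented here, not part of any proposed stdlib API:

```python
# Buffer size fixes the minimum latency: the sound card cannot react
# faster than the span of audio one buffer holds.
def buffer_latency_ms(frames_per_buffer, rate_hz):
    return 1000.0 * frames_per_buffer / rate_hz

# Mixing two 16-bit streams is sample-wise addition, clipped to the
# signed 16-bit range to avoid wrap-around distortion.
def mix(a, b):
    return [max(-32768, min(32767, x + y)) for x, y in zip(a, b)]

# CD-format audio runs at 44100 frames/s, so a 4096-frame buffer
# already costs ~93 ms of latency, while a 256-frame buffer costs ~6 ms.
print(round(buffer_latency_ms(4096, 44100)))  # -> 93
print(mix([30000, -100], [10000, 50]))        # -> [32767, -50]
```

A real API would have to let the caller trade the buffer size against dropout risk, which is one of the design questions the thread is pointing at.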
Anyway, it would be awesome to have at least some basic user stories of playing sounds covered by stdlib. I experimented with some things on Windows, so you may find the following public domain code useful - https://bitbucket.org/techtonik/audiosocket - it plays CD format audio streams. -- anatoly t. On Fri, Apr 26, 2013 at 4:39 PM, Clay Sweetser wrote: > I've noticed that the python standard library lacks a cross-platform > sound/audio module, instead having seperate modules for linux and windows. > Is there any reason a standard cross platform library has not been created > yet? > > Clay Sweetser > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Fri Apr 26 15:55:25 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 26 Apr 2013 06:55:25 -0700 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <20130425042335.GA60739@cskk.homeip.net> <5178CA66.3060205@pearwood.info> Message-ID: <517A874D.7020204@stoneleaf.us> On 04/26/2013 06:02 AM, anatoly techtonik wrote: > On Thu, Apr 25, 2013 at 9:17 AM, Steven D'Aprano > wrote: > > On 25/04/13 14:47, anatoly techtonik wrote: > > On Thu, Apr 25, 2013 at 7:23 AM, Cameron Simpson > wrote: > > > Then aren't you just talking about the __del__ method? > > > No. The __del__ method is only called during garbage collection phase which > may be delayed. In PySide the QObject is deleted immediately. > > > Citation please. Where is this documented? > > > Here:? http://qt-project.org/wiki/PySide_Pitfalls > > """ > If a QObject falls out of scope in Python, it will get deleted. You have to take care of keeping a reference to the object: > > * Store it as an attribute of an object you keep around, e.g. 
self.window = QMainWindow() > * Pass a parent QObject to the object???s constructor, so it gets owned by the parent > """ > > This thread on the PySide mailing list suggests that you are mistaken, PySide does not have superpowers over and > above Python's garbage collector, and is subject to the exact same non-deterministic destructors as any other Python > object. Whether you call that destructor __del__ or __exit__ makes no difference. > > http://www.mail-archive.com/__pyside at lists.openbossa.org/__msg01029.html > You'll notice it doesn't say "gets /immediately/ deleted" -- because it doesn't. It gets deleted when it gets garbage collected. -- ~Ethan~ From techtonik at gmail.com Fri Apr 26 17:21:25 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 26 Apr 2013 18:21:25 +0300 Subject: [Python-ideas] Automatic context managers In-Reply-To: <517A874D.7020204@stoneleaf.us> References: <20130425042335.GA60739@cskk.homeip.net> <5178CA66.3060205@pearwood.info> <517A874D.7020204@stoneleaf.us> Message-ID: On Fri, Apr 26, 2013 at 4:55 PM, Ethan Furman wrote: > On 04/26/2013 06:02 AM, anatoly techtonik wrote: > >> On Thu, Apr 25, 2013 at 9:17 AM, Steven D'Aprano > steve at pearwood.info>> wrote: >> >> On 25/04/13 14:47, anatoly techtonik wrote: >> >> On Thu, Apr 25, 2013 at 7:23 AM, Cameron Simpson > cs at zip.com.au>> wrote: >> >> >> Then aren't you just talking about the __del__ method? >> >> >> No. The __del__ method is only called during garbage collection >> phase which >> may be delayed. In PySide the QObject is deleted immediately. >> >> >> Citation please. Where is this documented? >> >> >> Here:? http://qt-project.org/wiki/**PySide_Pitfalls >> >> >> """ >> If a QObject falls out of scope in Python, it will get deleted. You have >> to take care of keeping a reference to the object: >> >> * Store it as an attribute of an object you keep around, e.g. 
self.window >> = QMainWindow() >> * Pass a parent QObject to the object's constructor, so it gets owned >> by the parent >> >> """ >> >> This thread on the PySide mailing list suggests that you are >> mistaken, PySide does not have superpowers over and >> above Python's garbage collector, and is subject to the exact same >> non-deterministic destructors as any other Python >> object. Whether you call that destructor __del__ or __exit__ makes no >> difference. >> >> http://www.mail-archive.com/pyside at lists.openbossa.org/msg01029.html >> > > You'll notice it doesn't say "gets /immediately/ deleted" -- because it > doesn't. It gets deleted when it gets garbage collected. > Are you sure about that? The example on the PySide wiki is pretty reproducible. With current garbage collector laziness it should be at least in some cases non-reliable. -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Fri Apr 26 17:25:17 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 26 Apr 2013 18:25:17 +0300 Subject: [Python-ideas] Automatic context managers In-Reply-To: <5712AC18-169A-4EA3-B201-E76C75A8F296@mac.com> References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com> <5712AC18-169A-4EA3-B201-E76C75A8F296@mac.com> Message-ID: On Fri, Apr 26, 2013 at 4:45 PM, Ronald Oussoren wrote: > > On 26 Apr, 2013, at 15:22, anatoly techtonik wrote: > > > On Thu, Apr 25, 2013 at 8:46 AM, Ronald Oussoren > wrote: > > > > On 25 Apr, 2013, at 6:00, anatoly techtonik wrote: > > > > > On Wed, Apr 24, 2013 at 12:23 PM, Ronald Oussoren < ronaldoussoren at mac.com> wrote: > > > > > > > > > > > > > > > On 24 Apr, 2013, at 10:59, anatoly techtonik wrote: > > > > > > > Long time no see, all. :P > > > > > > > > PySide Qt binding have an interesting property - when you create > widgets, you need to assign them to variables. When such variable is lost, > object is immediately destroyed.
> > > > I often use this one-shot code in setup.py: > > > > ... > > > > long_description = open('README.txt').read(), > > > > .... > > > > > > > > Which probably leaves the README.txt file open until the setup.py > exits. So, the idea is to close the file as soon as the variable is lost. > > > > > > The file is automatically closed as soon as the file object is garbage > collected. In your example CPython would currently collect at the end of > the read call (unless there is an exception) because of the reference > counting garbage collector, but other implementations have other garbage > collectors and can collect the file object (much) later. > > > > > > Right. The automatic context manager proposal brings this mechanism from > the garbage collection implementation level to the language definition level. > > > > What proposal? What you appear to propose is either that > implementations must use a reference counting collector (more or less > ensuring that the file will be closed after the call to read in your > example), or that the exit part of the context protocol is run whenever an > object is going out of scope. > > > > The proposal is fully illustrated by the user story above > > Right, there is no proposal, only vague handwaving. I haven't seen > anything yet that wouldn't require the use of refcounting (the file is > closed as soon as the last reference to the file object goes away), or some > serious magic (when you want the file object to be closed even when read > raises an exception). > > When you want to propose something you need to do some work yourself. That > doesn't mean you have to provide a patch, but you do need to specify your > proposal in enough detail to understand it without trying to second guess > you. > > The batteries of my crystal ball ran out, > Ok. The proposal is to patch Python to be able to write: boolean = open(resource).use() Instead of: boolean = None with open(resource) as tempvar: boolean = tempvar.use() That's it.
I am not pretending I know how to implement it. I just expressed my opinion that this might be possible, because PySide seems to do this somehow. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ned at nedbatchelder.com Fri Apr 26 17:39:24 2013 From: ned at nedbatchelder.com (Ned Batchelder) Date: Fri, 26 Apr 2013 11:39:24 -0400 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <20130425042335.GA60739@cskk.homeip.net> <5178CA66.3060205@pearwood.info> <517A874D.7020204@stoneleaf.us> Message-ID: <517A9FAC.4090004@nedbatchelder.com> On 4/26/2013 11:21 AM, anatoly techtonik wrote: > > """ > If a QObject falls out of scope in Python, it will get > deleted. You have to take care of keeping a reference to the > object: > > * Store it as an attribute of an object you keep around, e.g. > self.window = QMainWindow() > * Pass a parent QObject to the object's constructor, > so it gets owned by the parent > > """ > > This thread on the PySide mailing list suggests that you > are mistaken, PySide does not have superpowers over and > above Python's garbage collector, and is subject to the > exact same non-deterministic destructors as any other Python > object. Whether you call that destructor __del__ or > __exit__ makes no difference. > > http://www.mail-archive.com/pyside at lists.openbossa.org/msg01029.html > > > > > You'll notice it doesn't say "gets /immediately/ deleted" -- > because it doesn't. It gets deleted when it gets garbage collected. > > > Are you sure about that? The example on the PySide wiki is pretty > reproducible. With current garbage collector laziness it should be at > least in some cases non-reliable. Again, I suspect we are falling prey to fuzzy language. CPython will reclaim objects as soon as their reference count reaches zero. This is not the garbage collector.
The garbage collector is a separate facility which kicks in every once in a while to find objects that have non-zero reference counts, even though they are unreachable, because of circular references. Some people say "garbage collector" or "garbage collection" to mean the usual reclamation of objects when their refcount reaches zero, but this is imprecise and confusing when mixed with people who use the term differently. I know nothing about the internals of QObjects, but like most others, I *strongly* suspect that they are doing nothing special above and beyond what Python does to determine the lifetime of an object. Their cleanup happens when the object is reclaimed (note I am careful not to say, "when the object is garbage collected"). The example on the PySide wiki is reproducible because it is not subject to garbage collector laziness. The "animation" name is a local in animate_stuff. At the end of that function, the name "animation" falls out of scope, decrementing the reference count on the QPropertyAnimation object it referenced. That object now has a reference count of zero, so it is reclaimed. The cleanup code is then invoked, destroying the native objects as well. --Ned. -------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Fri Apr 26 17:54:49 2013 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 26 Apr 2013 16:54:49 +0100 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <20130425042335.GA60739@cskk.homeip.net> <5178CA66.3060205@pearwood.info> Message-ID: <517AA349.5080504@mrabarnett.plus.com> On 26/04/2013 14:02, anatoly techtonik wrote: > On Thu, Apr 25, 2013 at 9:17 AM, Steven D'Aprano > wrote: [snip] > I am not touching destructor methods. The idea is to make with statement > transparent - embed inside objects that require it. I am not sure what > the implementation should be. 
Probably object should have an ability to > enable context scope tracking in its constructor, to tell Python to call > its __exit__ method at the moment when its reference count reaches zero, > and before it is garbage collected. > > On the other hand, objects being freed is not deterministic. They'll > be freed when there are no longer any references to it, which may be > Never. > > Reference counting GCs are deterministic, but cannot deal with > circular references. Other GCs can deal with circular references, > but are non-deterministic. Even the Java GC doesn't guarantee that > the finalize() method will always be called. > > > This circular reference problem is interesting. In object space it > probably looks like a stellar detached from the visible (attached) > universe. Is the main problem in detecting it? > The problem is in knowing in which order the objects should be collected. For example, if A refers to B and B refers to A, should you collect A then B, or B then A? If you collect A first, then, for a time, B will be referring to a non-existent object. That's not good if the objects have destructors which need to be run. From g.brandl at gmx.net Fri Apr 26 19:17:26 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 26 Apr 2013 19:17:26 +0200 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com> <5712AC18-169A-4EA3-B201-E76C75A8F296@mac.com> Message-ID: Am 26.04.2013 17:25, schrieb anatoly techtonik: > Right, there is no proposal, only vague handwaving. I haven't seen anything > yet that wouldn't require the use of refcounting (the file is closed as soon > as the last reference to the file object goes away), or some serious magic > (when you want the file object to be closed even when read raises an exeption). > > When you want to propose something you need to do some work yourself. 
That > doesn't mean you have to provide a patch, but you do need to specify your > proposal in enough detail to understand it without trying to second guess > you. > > The batteries of my crystal ball ran out, > > > Ok. The proposal is to patch Python to be able to write: > > boolean = open(resource).use() > > Instead of: > > boolean = None > with open(resource) as tempvar: > boolean = tempvar.use() > > That's it. I am not pretending I know how to implement it. I just expressed my > opinion that this might be possible, because PySide seems to do this somehow. Well, then consider this proposal rejected. It's nothing new, objects have cleaned up resources in their __del__ (or equivalent C-level destructor/finalizer) for ages. PySide doesn't do anything different, and as others have mentioned, due to CPython's use of reference counting you can get deterministic behavior if you are careful not to create cycles. However, at least for resources like files this is *exactly* what we have been moving away from even before the introduction of the "with" statement; in fact, in today's Python the destructor of file objects emits a ResourceWarning (which is silent by default, since many users still rely on this behavior; but you can see it with "python -Wa"). There are several good reasons for this: * Explicit is better than implicit: while letting Python handle resource cleanup is manageable for small numbers of resources and small pieces of code, it quickly gets annoying and creates exactly the sort of problem that Python usually does away with: tracking object lifetimes yourself. With a "with" statement, you know that your resources *are* cleaned up, and when. * Most other Python implementations have never had reference counting, and never will (e.g. PyPy). Collecting their unreachable objects differently enables optimizations.
Georg From python at mrabarnett.plus.com Fri Apr 26 19:20:09 2013 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 26 Apr 2013 18:20:09 +0100 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <20130425042335.GA60739@cskk.homeip.net> <5178CA66.3060205@pearwood.info> <517AA349.5080504@mrabarnett.plus.com> Message-ID: <517AB749.3040501@mrabarnett.plus.com> On 26/04/2013 17:52, Chris Angelico wrote: > On Sat, Apr 27, 2013 at 1:54 AM, MRAB wrote: >> On 26/04/2013 14:02, anatoly techtonik wrote: >>> This circular reference problem is interesting. In object space it >>> probably looks like a stellar detached from the visible (attached) >>> universe. Is the main problem in detecting it? >>> >> The problem is in knowing in which order the objects should be >> collected. >> >> For example, if A refers to B and B refers to A, should you collect A >> then B, or B then A? If you collect A first, then, for a time, B will >> be referring to a non-existent object. That's not good if the objects >> have destructors which need to be run. > > Spin-off thread from python-ideas to discuss a more general question > of garbage collection of cyclic structures. > > Once it's been proven that there's an unreferenced cycle, why not > simply dispose of one of the objects, and replace all references to it > (probably only one - preferably pick an object with the fewest > references) with a special temporary object? In fact, that could > probably be done in CPython by wiping out the object in memory and > replacing it with a special marker of some sort, which would then > automatically "take over" all references to the old object. Any > attempt to manipulate this object could simply pop back with a > DestructedObject exception or something. 
> I wonder whether it would be best to call the __del__ method of the newest object (if it's possible to determine which is the newest) in such a case, then replace _that_ object with the DestructedObject (the "special marker" would be just a special "destructed" object). > Is this a plausible (never mind viable yet, just conceptually > plausible) alternative to sticking them into gc.garbage and ignoring > them? It'd allow a doubly-linked list/tree to function cleanly - > imagine, for instance, something like the DOM facilities available to > web browser scripts: > > import gc > > class DOMObject: > def __init__(self,parent): > self.parent=parent > self.firstchild=self.sibling=None > if not parent: return > if not parent.firstchild: > parent.firstchild=self > else: > child=parent.firstchild > while child.sibling: > child=child.sibling > child.sibling=self > def __del__(self): > print("Disposing of id #%d"%id(self)) > > document=DOMObject(None) > body=DOMObject(document) > p=DOMObject(body) > p=DOMObject(body) > p=DOMObject(body) > del document,body,p > gc.collect() > > The __del__ method would need to clean up the external resources used > by this object, but wouldn't have to walk the tree. Yet, just because > there is a reference loop and there are __del__ methods, the garbage > collector gives up and leaves it to the program author to deal with. > > I can understand if this is considered too complicated and too unusual > a circumstance to be worth bothering to support, but I'm curious as to > whether it's at least conceptually reasonable to do something like > this. > It does sound like an interesting idea.
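The failure mode under discussion reduces to a dozen lines (the Node class below is invented for illustration, not PySide or DOM code). On the interpreters current when this thread was written, a cycle whose members define __del__ was parked in gc.garbage exactly as described; PEP 442 later (Python 3.4) taught the collector to finalize such cycles in an arbitrary but safe order:

```python
import gc

log = []

class Node:
    """Two of these form the A<->B cycle being debated."""
    def __init__(self, name):
        self.name = name
        self.peer = None
    def __del__(self):
        log.append(self.name)

a, b = Node("a"), Node("b")
a.peer, b.peer = b, a   # cycle: neither refcount can ever reach zero
del a, b                # both objects are now unreachable, but still alive

gc.collect()            # only the cyclic collector can find the pair
print(sorted(log))      # -> ['a', 'b'] on Python 3.4+ (PEP 442);
                        # on 2013-era interpreters the pair would instead
                        # have been left in gc.garbage, uncollected
```

The ordering question MRAB raises is visible here: nothing in the cycle says whether "a" or "b" should be torn down first, which is precisely why pre-3.4 collectors refused to choose.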
From techtonik at gmail.com Fri Apr 26 19:26:44 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 26 Apr 2013 20:26:44 +0300 Subject: [Python-ideas] Automatic context managers In-Reply-To: <517A9FAC.4090004@nedbatchelder.com> References: <20130425042335.GA60739@cskk.homeip.net> <5178CA66.3060205@pearwood.info> <517A874D.7020204@stoneleaf.us> <517A9FAC.4090004@nedbatchelder.com> Message-ID: On Fri, Apr 26, 2013 at 6:39 PM, Ned Batchelder wrote: > > On 4/26/2013 11:21 AM, anatoly techtonik wrote: > > """ >>> If a QObject falls out of scope in Python, it will get deleted. You have >>> to take care of keeping a reference to the object: >>> >>> * Store it as an attribute of an object you keep around, e.g. >>> self.window = QMainWindow() >>> * Pass a parent QObject to the object???s constructor, so it gets owned >>> by the parent >>> >>> """ >>> >>> This thread on the PySide mailing list suggests that you are >>> mistaken, PySide does not have superpowers over and >>> above Python's garbage collector, and is subject to the exact same >>> non-deterministic destructors as any other Python >>> object. Whether you call that destructor __del__ or __exit__ makes >>> no difference. >>> >>> >>> http://www.mail-archive.com/__pyside at lists.openbossa.org/__msg01029.html >>> < >>> http://www.mail-archive.com/pyside at lists.openbossa.org/msg01029.html> >>> >> >> You'll notice it doesn't say "gets /immediately/ deleted" -- because it >> doesn't. It gets deleted when it gets garbage collected. >> > > Are you sure about that? The example on the PySide wiki is pretty > reproducible. With current garbage collector lazyness it should be at least > in some cases non-reliable. > > > Again, I suspect we are falling prey to fuzzy language. CPython will > reclaim objects as soon as their reference count reaches zero. This is not > the garbage collector. 
The garbage collector is a separate facility which > kicks in every once in a while to find objects that have non-zero reference > counts, even though they are unreachable, because of circular references. > Some people say "garbage collector" or "garbage collection" to mean the > usual reclamation of objects when their refcount reaches zero, but this is > imprecise and confusing when mixed with people who use the term differently. > > I know nothing about the internals of QObjects, but like most others, I > *strongly* suspect that they are doing nothing special above and beyond > what Python does to determine the lifetime of an object. Their cleanup > happens when the object is reclaimed (note I am careful not to say, "when > the object is garbage collected"). > > The example on the PySide wiki is reproducible because it is not subject > to garbage collector laziness. The "animation" name is a local in > animate_stuff. At the end of that function, the name "animation" falls out > of scope, decrementing the reference count on the QPropertyAnimation object > it referenced. That object now has a reference count of zero, so it is > reclaimed. The cleanup code is then invoked, destroying the native objects > as well. > Thanks. That makes it more clear. So if I create a circular reference with two QObjects and lose a link to both of them - they won't be reclaimed until GC starts, because their mutual reference count is not zero, right? And the same will be true for the object returned by open() involved in circular reference. I guess that's the reason why the example in file.close() could not be updated to note that "with" is not needed anymore. http://docs.python.org/2/library/stdtypes.html#file.close -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From techtonik at gmail.com Fri Apr 26 19:31:06 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Fri, 26 Apr 2013 20:31:06 +0300 Subject: [Python-ideas] Automatic context managers In-Reply-To: <517AA349.5080504@mrabarnett.plus.com> References: <20130425042335.GA60739@cskk.homeip.net> <5178CA66.3060205@pearwood.info> <517AA349.5080504@mrabarnett.plus.com> Message-ID: On Fri, Apr 26, 2013 at 6:54 PM, MRAB wrote: > On 26/04/2013 14:02, anatoly techtonik wrote: > >> On Thu, Apr 25, 2013 at 9:17 AM, Steven D'Aprano > > wrote: >> > [snip] > > I am not touching destructor methods. The idea is to make with statement >> transparent - embed inside objects that require it. I am not sure what >> the implementation should be. Probably object should have an ability to >> enable context scope tracking in its constructor, to tell Python to call >> its __exit__ method at the moment when its reference count reaches zero, >> and before it is garbage collected. >> >> On the other hand, objects being freed is not deterministic. They'll >> be freed when there are no longer any references to it, which may be >> Never. >> >> Reference counting GCs are deterministic, but cannot deal with >> circular references. Other GCs can deal with circular references, >> but are non-deterministic. Even the Java GC doesn't guarantee that >> the finalize() method will always be called. >> >> >> This circular reference problem is interesting. In object space it >> probably looks like a stellar detached from the visible (attached) >> universe. Is the main problem in detecting it? >> >> The problem is in knowing in which order the objects should be > collected. > > For example, if A refers to B and B refers to A, should you collect A > then B, or B then A? If you collect A first, then, for a time, B will > be referring to a non-existent object. That's not good if the objects > have destructors which need to be run. And how does GC solve that? Can it complain about those stellars? 
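The distinction drawn in the quoted explanation -- plain reclamation at refcount zero versus the cyclic garbage collector -- is easy to observe directly. The Tracked and animate_stuff names below are invented for illustration, mirroring the wiki's local "animation" variable:

```python
events = []

class Tracked:
    def __del__(self):
        events.append("reclaimed")

def animate_stuff():
    animation = Tracked()  # local name, like the wiki's QPropertyAnimation
    # ... work with the object ...
    # returning from the function drops the last reference; on CPython
    # the refcount hits zero and __del__ runs immediately -- the cyclic
    # garbage collector is never involved

animate_stuff()
print(events)  # -> ['reclaimed'] on CPython, without any gc.collect()
```

On implementations without reference counting (PyPy, Jython) the same code prints an empty list here and "reclaimed" only after some later collection, which is the non-determinism the thread keeps circling back to.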
-------------- next part -------------- An HTML attachment was scrubbed... URL: From python at mrabarnett.plus.com Fri Apr 26 20:20:35 2013 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 26 Apr 2013 19:20:35 +0100 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <20130425042335.GA60739@cskk.homeip.net> <5178CA66.3060205@pearwood.info> <517AA349.5080504@mrabarnett.plus.com> Message-ID: <517AC573.4090606@mrabarnett.plus.com> On 26/04/2013 18:31, anatoly techtonik wrote: > On Fri, Apr 26, 2013 at 6:54 PM, MRAB > wrote: > > On 26/04/2013 14:02, anatoly techtonik wrote: > > On Thu, Apr 25, 2013 at 9:17 AM, Steven D'Aprano > > >> wrote: > > [snip] > > I am not touching destructor methods. The idea is to make with > statement > transparent - embed inside objects that require it. I am not > sure what > the implementation should be. Probably object should have an > ability to > enable context scope tracking in its constructor, to tell Python > to call > its __exit__ method at the moment when its reference count > reaches zero, > and before it is garbage collected. > > On the other hand, objects being freed is not > deterministic. They'll > be freed when there are no longer any references to it, > which may be > Never. > > Reference counting GCs are deterministic, but cannot deal with > circular references. Other GCs can deal with circular > references, > but are non-deterministic. Even the Java GC doesn't > guarantee that > the finalize() method will always be called. > > > This circular reference problem is interesting. In object space it > probably looks like a stellar detached from the visible (attached) > universe. Is the main problem in detecting it? > > The problem is in knowing in which order the objects should be > collected. > > For example, if A refers to B and B refers to A, should you collect A > then B, or B then A? If you collect A first, then, for a time, B will > be referring to a non-existent object. 
That's not good if the objects > have destructors which need to be run. > > > And how does GC solve that? Can it complain about those stellars? > It doesn't solve that. From g.brandl at gmx.net Fri Apr 26 20:44:21 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 26 Apr 2013 20:44:21 +0200 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <20130425042335.GA60739@cskk.homeip.net> <5178CA66.3060205@pearwood.info> <517A874D.7020204@stoneleaf.us> <517A9FAC.4090004@nedbatchelder.com> Message-ID: Am 26.04.2013 19:26, schrieb anatoly techtonik: > Again, I suspect we are falling prey to fuzzy language. CPython will > reclaim objects as soon as their reference count reaches zero. This is not > the garbage collector. The garbage collector is a separate facility which > kicks in every once in a while to find objects that have non-zero reference > counts, even though they are unreachable, because of circular references. > Some people say "garbage collector" or "garbage collection" to mean the > usual reclamation of objects when their refcount reaches zero, but this is > imprecise and confusing when mixed with people who use the term differently. > > I know nothing about the internals of QObjects, but like most others, I > *strongly* suspect that they are doing nothing special above and beyond what > Python does to determine the lifetime of an object. Their cleanup happens > when the object is reclaimed (note I am careful not to say, "when the object > is garbage collected"). > > The example on the PySide wiki is reproducible because it is not subject to > garbage collector laziness. The "animation" name is a local in > animate_stuff. At the end of that function, the name "animation" falls out > of scope, decrementing the reference count on the QPropertyAnimation object > it referenced. That object now has a reference count of zero, so it is > reclaimed. The cleanup code is then invoked, destroying the native objects > as well. > > > Thanks. 
That makes it more clear. So if I create a circular reference with two
> QObjects and lose a link to both of them - they won't be reclaimed until GC
> starts, because their mutual reference count is not zero, right? And the same
> will be true for the object returned by open() involved in circular reference.
>
> I guess that's the reason why the example in file.close() could not be updated
> to note that "with" is not needed anymore.
> http://docs.python.org/2/library/stdtypes.html#file.close

You got it backwards. "with" has been introduced to supersede both the
correct-but-clumsy try-finally idiom and the
sometimes-but-not-everywhere-correct implicit "automatic" cleanup.

Georg

From random832 at fastmail.us  Fri Apr 26 20:54:20 2013
From: random832 at fastmail.us (random832 at fastmail.us)
Date: Fri, 26 Apr 2013 14:54:20 -0400
Subject: [Python-ideas] Automatic context managers
In-Reply-To:
References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com>
	<5712AC18-169A-4EA3-B201-E76C75A8F296@mac.com>
Message-ID: <1367002460.31273.140661223235105.509DAA0B@webmail.messagingengine.com>

On Fri, Apr 26, 2013, at 11:25, anatoly techtonik wrote:
> Ok. The proposal is patch Python to be able to write:
>
> boolean = open(resource).use()
>
> Instead of:
>
> boolean = None
> with open(resource) as tempvar:
>     boolean = tempvar.use()

What about a with expression?
boolean = x.use() with x as open(resource)

From andrew.svetlov at gmail.com  Fri Apr 26 21:30:18 2013
From: andrew.svetlov at gmail.com (Andrew Svetlov)
Date: Fri, 26 Apr 2013 22:30:18 +0300
Subject: [Python-ideas] Automatic context managers
In-Reply-To: <1367002460.31273.140661223235105.509DAA0B@webmail.messagingengine.com>
References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com>
	<5712AC18-169A-4EA3-B201-E76C75A8F296@mac.com>
	<1367002460.31273.140661223235105.509DAA0B@webmail.messagingengine.com>
Message-ID:

On Fri, Apr 26, 2013 at 9:54 PM, wrote:
> What about a with expression?
>
> boolean = x.use() with x as open(resource)

interesting idea. I see nothing bad with proposed construction.
Any objections? Have I missed something?

--
Thanks,
Andrew Svetlov

From mertz at gnosis.cx  Fri Apr 26 21:37:29 2013
From: mertz at gnosis.cx (David Mertz)
Date: Fri, 26 Apr 2013 12:37:29 -0700
Subject: [Python-ideas] Automatic context managers
In-Reply-To: <1367002460.31273.140661223235105.509DAA0B@webmail.messagingengine.com>
References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com>
	<5712AC18-169A-4EA3-B201-E76C75A8F296@mac.com>
	<1367002460.31273.140661223235105.509DAA0B@webmail.messagingengine.com>
Message-ID: <557D2939-7DC6-4DED-B75E-AC265F287C11@gnosis.cx>

On Apr 26, 2013, at 11:54 AM, random832 at fastmail.us wrote:
>> Ok. The proposal is patch Python to be able to write:
>>
>> boolean = open(resource).use()
>>
>> Instead of:
>>
>> boolean = None
>> with open(resource) as tempvar:
>>     boolean = tempvar.use()
>
> What about a with expression?
> boolean = x.use() with x as open(resource)

The initial 'boolean=None' is superfluous in any case. So the ideas are
that instead of the current:

    with open(resource) as x:
        boolean = x.use()

We might write:

    boolean = open(resource).use()

Or:

    boolean = x.use() with x as open(resource)

The with expression saves two characters; the probably entirely unworkable
"automatic context manager" would save 13 characters. The existing
one-liner seems perfectly clear to me and perfectly compact.

A "with expression" doesn't seem absurd to me, but I'm not sure that the
percentage of the time you really do want a single expression inside the
'with' block is common enough to warrant a special form. In contrast, the
'if expression' ternary operator that was introduced really does seem to
express a very common pattern.

I'd note that there's no reason you couldn't use the so-called "automatic
context manager" already, it's just a matter of writing your own function
rather than the built-in 'open()'. So, e.g.
with a few lines of definition, you might use: boolean = SafeOpen(resource).use() -- mertz@ | The specter of free information is haunting the `Net! All the gnosis | powers of IP- and crypto-tyranny have entered into an unholy .cx | alliance...ideas have nothing to lose but their chains. Unite | against "intellectual property" and anti-privacy regimes! From jimjjewett at gmail.com Fri Apr 26 21:45:58 2013 From: jimjjewett at gmail.com (Jim Jewett) Date: Fri, 26 Apr 2013 15:45:58 -0400 Subject: [Python-ideas] Automatic context managers In-Reply-To: <517AB749.3040501@mrabarnett.plus.com> References: <20130425042335.GA60739@cskk.homeip.net> <5178CA66.3060205@pearwood.info> <517AA349.5080504@mrabarnett.plus.com> <517AB749.3040501@mrabarnett.plus.com> Message-ID: (Quoting from MRAB's quote, since I don't see the original -- I suspect I also mangled some attributions internally) On 26/04/2013 17:52, Chris Angelico wrote: > On Sat, Apr 27, 2013 at 1:54 AM, MRAB wrote: >> On 26/04/2013 14:02, anatoly techtonik wrote: >>> This circular reference problem is interesting. In object space it >>> probably looks like a stellar detached from the visible (attached) >>> universe. Is the main problem in detecting it? Yes. That is where Reference Counting fails, and is the reason that CPython added (cylic) garbage collection. Note that Garbage Collectors need to have a list of "roots" which can keep things alive, and a way of recognizing links. If some objects (even those implemented in C) use pointers that the garbage collector doesn't know about (e.g., by adding a constant to a base address instead of storing the address directly, or storing tag bits in the low-order portion of the address), then there will be objects that cannot ever be safely collected. Officially, that can be a bug in the object implementation, but if it leads to a segfault, python still looks bad. >> The problem is in knowing in which order the objects should be >> collected. 
This is a problem only once a garbage cycle has already been detected. But it is indeed a major problem. The above mean that garbage collectors must look at every live object in the entire system for every full collection; there is plenty of research on how to speed things up (or even just make the system more responsive) by doing "extra" work for partial collections. (I put "extra" in scare-quotes, because these heuristics increase the worst-case and the theoretical average case, but often decrease the normal-case workload.) Of course, if you're paying this full price anyhow, why bother paying the additional price of reference-counting? (Because it is one of those heuristics that actually save work in practice, if your data isn't very cyclic. But if you use a very cyclic style, or library...) >> For example, if A refers to B and B refers to A, should you collect A >> then B, or B then A? If you collect A first, then, for a time, B will >> be referring to a non-existent object. That's not good if the objects >> have destructors which need to be run. > Once it's been proven that there's an unreferenced cycle, why not > simply dispose of one of the objects, and replace all references to it > (probably only one - preferably pick an object with the fewest > references) with a special temporary object? Backwards compatibility. If my pointed-to object no longer has the methods I expect (perhaps even just "close"), I will get exceptions. They won't be the ones for which I was prepared. Now, instead of leaking a few resources (only until the program exits), I will be exiting prematurely, perhaps without a chance to do other cleanup. (Mrab wrote:) I wonder whether it would be best to call the __del__ method of the newest object (if it's possible to determine which is the newest) in such a case, then replace _that_ object with the DestructedObject (the "special marker" would be just a special "destructed" object). 
You can get most of the way there with object address, and farther with
timestamping at creation (which also costs more memory). But is the
difference between 99.5 and 99.8 worth complicating things and possibly
breaking the last 0.2 more severely?

I *would* like a __close__ magic method that worked like __del__, except
that it would be OK to call as soon as you found the object in a garbage
cycle. (This also means that the __close__ method's contract should state
explicitly that it might be called multiple times, and cycles might be
broken in an arbitrary order.) In the past, this has been rejected as
insufficiently motivated, but that may have changed.

-jJ

From bruce at leapyear.org  Fri Apr 26 21:45:45 2013
From: bruce at leapyear.org (Bruce Leban)
Date: Fri, 26 Apr 2013 12:45:45 -0700
Subject: [Python-ideas] Automatic context managers
In-Reply-To:
References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com>
	<5712AC18-169A-4EA3-B201-E76C75A8F296@mac.com>
	<1367002460.31273.140661223235105.509DAA0B@webmail.messagingengine.com>
Message-ID:

On Fri, Apr 26, 2013 at 12:30 PM, Andrew Svetlov wrote:

> On Fri, Apr 26, 2013 at 9:54 PM, wrote:
> > What about a with expression?
> >
> > boolean = x.use() with x as open(resource)
>
> interesting idea. I see nothing bad with proposed construction.
> Any objections? I've miss something?
>
The most obvious thing is that 'as' is backwards from the with statement.
Also this requires you to come up with a name which you have to repeat,
while

    with(open(resource)).use()

doesn't. On the other hand, it allows you to do:

    result = [x.foo(), x.bar()] with open(resource) as x

which opens the way to freewheeling inline assignment:

    @contextmanager
    def assign(x):
        yield x

    result = (-b + sqrt(b*b - 4 * a * c)) / (2 * a) with assign(3) as a
        with assign(4) as b with assign(5) as c

I don't think that's a good thing.
--- Bruce Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruce at leapyear.org Fri Apr 26 21:46:57 2013 From: bruce at leapyear.org (Bruce Leban) Date: Fri, 26 Apr 2013 12:46:57 -0700 Subject: [Python-ideas] Automatic context managers In-Reply-To: <557D2939-7DC6-4DED-B75E-AC265F287C11@gnosis.cx> References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com> <5712AC18-169A-4EA3-B201-E76C75A8F296@mac.com> <1367002460.31273.140661223235105.509DAA0B@webmail.messagingengine.com> <557D2939-7DC6-4DED-B75E-AC265F287C11@gnosis.cx> Message-ID: On Fri, Apr 26, 2013 at 12:37 PM, David Mertz wrote: > I'd note that there's no reason you couldn't use the so-called "automatic > context manager" already, it's just a matter of writing your own function > rather than the built-in 'open()'. So, e.g. with a few lines of > definition, you might use: > > boolean = SafeOpen(resource).use() > I'd like to see those few lines if they are indeed possible. I don't see how the function would know when to call __exit__. It has to be after use() finishes. --- Bruce Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mertz at gnosis.cx Fri Apr 26 22:12:26 2013 From: mertz at gnosis.cx (David Mertz) Date: Fri, 26 Apr 2013 13:12:26 -0700 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com> <5712AC18-169A-4EA3-B201-E76C75A8F296@mac.com> <1367002460.31273.140661223235105.509DAA0B@webmail.messagingengine.com> <557D2939-7DC6-4DED-B75E-AC265F287C11@gnosis.cx> Message-ID: <17163806-A689-46D3-8609-3035CC178F17@gnosis.cx> On Apr 26, 2013, at 12:46 PM, Bruce Leban wrote: > On Fri, Apr 26, 2013 at 12:37 PM, David Mertz wrote: > I'd note that there's no reason you couldn't use the so-called "automatic context manager" already, it's just a matter of writing your own function rather than the built-in 'open()'. So, e.g. with a few lines of definition, you might use: > > boolean = SafeOpen(resource).use() > > I'd like to see those few lines if they are indeed possible. I don't see how the function would know when to call __exit__. It has to be after use() finishes. You might get really perverse with stack inspection and whatnot. But I was thinking more of just a proxy method that introduced the safety. class SafeOpen(object): def __init__(self, resource): self.resource = resource def __getattr__(self, name): def f(*args, **kws): with open(self.resource) as x: y = getattr(x, name)(*args, **kws) return y return f -- If I seem shortsighted to you, it is only because I have stood on the backs of midgets. 
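Mertz's wrapper can be exercised end-to-end. The sketch below repeats his class verbatim and only adds a throwaway temp file as the demo resource (the file contents are invented; nothing beyond the standard library is assumed):

```python
# Runnable version of the SafeOpen proxy: every attribute access returns
# a function that opens the resource, calls the real method inside a
# `with` block, and hands back the result after the file is closed.
import os
import tempfile

class SafeOpen(object):
    def __init__(self, resource):
        self.resource = resource
    def __getattr__(self, name):
        def f(*args, **kws):
            # The call happens inside the context manager, so the file
            # is still open while the proxied method runs.
            with open(self.resource) as x:
                y = getattr(x, name)(*args, **kws)
            return y
        return f

# Demo with a temporary file standing in for 'resource'.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'w') as f:
    f.write('this\nand\nthat\n')

assert SafeOpen(path).read(5) == 'this\n'
assert SafeOpen(path).readlines() == ['this\n', 'and\n', 'that\n']
os.remove(path)
```

One property worth noting: each proxied call reopens the resource from scratch, so chained or stateful access patterns (like the SQLAlchemy case discussed later in the thread) would not work this way.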
From python at mrabarnett.plus.com Fri Apr 26 22:15:16 2013 From: python at mrabarnett.plus.com (MRAB) Date: Fri, 26 Apr 2013 21:15:16 +0100 Subject: [Python-ideas] CPython's cyclic garbage collector (was Automatic context managers) In-Reply-To: <517AD5FE.60706@davea.name> References: <517ABC73.4000605@davea.name> <517AD5FE.60706@davea.name> Message-ID: <517AE054.5090003@mrabarnett.plus.com> On 26/04/2013 20:31, Dave Angel wrote: > On 04/26/2013 01:57 PM, Chris Angelico wrote: >> On Sat, Apr 27, 2013 at 3:42 AM, Dave Angel wrote: >>> I don't see what your "special" temporary object actually accomplishes. >>> Seems to me you need to declare that your __del__() methods promise not to >>> reference each other, and the gc would then check all objects in the cycle, >>> and do its present behavior if any of the destructors is not specially >>> declared. >> >> It wouldn't be declared; it'd simply throw an exception if anything >> different happened. >> >>> I'm not sure how often you'd have a non-trivial destructor that wouldn't >>> reference any objects. And doing a static analysis of what will happen >>> during the destructor would be pretty messy. So the best I and come up with >>> is to keep the declaration, but require a try/catch to cleanly terminate >>> each destructor if it ever references anything in the tree. >> >> And yeah. If you catch the exception inside __del__, you can cope with >> the destructed object yourself (or LBLY, if you wish). Alternatively, >> you just proceed as normal, and when your __del__ throws an exception, >> the gc then copes (not sure *how* it should cope - log it to stderr >> and carry on?). Same as normal exception handling. >> >> The advantage of this style is that the code to deal with the cycle is >> kept right in the cyclic object's destructor - right where the problem >> is. Doing it through gc.garbage requires that some other operation >> periodically check for garbage - after the GC has done its own >> periodic check. 
Seems simpler/cleaner to do it as part of the gc run >> itself. >> > > You must think me dense by now. But I don't understand what the two > different garbage collection operations are that you're positing. > > As far as I know, there's ref counting, which is quick, and frees > something as soon as the count goes to zero. Then there's gc, which has > to scan through all the objects from a known starting set, and identify > those things which aren't accessible, and free any that don't have a > __del__() method. > > And it's only in the gc step that cycles and such are identifiable. > Currently, if the GC finds a cycle and the objects in that cycle have __del__ methods, the objects are not collected. The suggestion is that in such a case the GC could call the __del__ method and then replace the object with a special "deleted" object, removing any references to other objects and thus breaking the cycle. If a __del__ method in one of the other objects subsequently tries to use the deleted object, an exception would be raised (a kind of AttributeError perhaps). From bruce at leapyear.org Sat Apr 27 00:33:33 2013 From: bruce at leapyear.org (Bruce Leban) Date: Fri, 26 Apr 2013 15:33:33 -0700 Subject: [Python-ideas] Automatic context managers In-Reply-To: <17163806-A689-46D3-8609-3035CC178F17@gnosis.cx> References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com> <5712AC18-169A-4EA3-B201-E76C75A8F296@mac.com> <1367002460.31273.140661223235105.509DAA0B@webmail.messagingengine.com> <557D2939-7DC6-4DED-B75E-AC265F287C11@gnosis.cx> <17163806-A689-46D3-8609-3035CC178F17@gnosis.cx> Message-ID: On Fri, Apr 26, 2013 at 1:12 PM, David Mertz wrote: > You might get really perverse with stack inspection and whatnot. But I > was thinking more of just a proxy method that introduced the safety. 
> > class SafeOpen(object): > def __init__(self, resource): > self.resource = resource > def __getattr__(self, name): > def f(*args, **kws): > with open(self.resource) as x: > y = getattr(x, name)(*args, **kws) > return y > return f > Wouldn't that close the resource before the use function is actually called? As I read it, it opens, calls getattr(x, 'use'), closes x, then calls x.use(). --- Bruce Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruce at leapyear.org Sat Apr 27 01:14:41 2013 From: bruce at leapyear.org (Bruce Leban) Date: Fri, 26 Apr 2013 16:14:41 -0700 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com> <5712AC18-169A-4EA3-B201-E76C75A8F296@mac.com> <1367002460.31273.140661223235105.509DAA0B@webmail.messagingengine.com> <557D2939-7DC6-4DED-B75E-AC265F287C11@gnosis.cx> <17163806-A689-46D3-8609-3035CC178F17@gnosis.cx> Message-ID: On Fri, Apr 26, 2013 at 3:55 PM, David Mertz wrote: > > Wouldn't that close the resource before the use function is actually > called? As I read it, it opens, calls getattr(x, 'use'), closes x, then > calls x.use(). > > Nope. The call is inside the context manager. Whatever is returned is > stored in 'y' before the resource is closed, and that value of y is > returned by the proxy function f. > Yup, you're right. Clever. And I read the code a bit too quickly. :-) I think it's a bit more complicated to write a general wrapper though. For example, to handle chained calls that SQLAlchemy uses you have to know which calls should close the object and which ones shouldn't. data = SafeSession(...).query(...).filter(...).order_by(...).values(...) The session should be closed after the call to values() [assuming for the purpose of this discussion that we actually want to close the session]. 
If we do it before, it will fail. If we don't do it then, we no longer have a handle to the session to close it. So you have to know which calls should close it and which ones shouldn't. Caveat: Actually, SQLAlchemy is unsuitable for automatic closing because in some cases it returns an array of objects which are attached to the session and you need to keep the session alive as long as you need to access any of the objects, which an automatic mechanism like this wouldn't be able to handle But it serves to illustrate the point. --- Bruce Latest blog post: Alice's Puzzle Page http://www.vroospeak.com Learn how hackers think: http://j.mp/gruyere-security -------------- next part -------------- An HTML attachment was scrubbed... URL: From mertz at gnosis.cx Sat Apr 27 00:55:39 2013 From: mertz at gnosis.cx (David Mertz) Date: Fri, 26 Apr 2013 15:55:39 -0700 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com> <5712AC18-169A-4EA3-B201-E76C75A8F296@mac.com> <1367002460.31273.140661223235105.509DAA0B@webmail.messagingengine.com> <557D2939-7DC6-4DED-B75E-AC265F287C11@gnosis.cx> <17163806-A689-46D3-8609-3035CC178F17@gnosis.cx> Message-ID: On Apr 26, 2013, at 3:33 PM, Bruce Leban wrote: > class SafeOpen(object): > def __init__(self, resource): > self.resource = resource > def __getattr__(self, name): > def f(*args, **kws): > with open(self.resource) as x: > y = getattr(x, name)(*args, **kws) > return y > return f > Wouldn't that close the resource before the use function is actually called? As I read it, it opens, calls getattr(x, 'use'), closes x, then calls x.use(). Nope. The call is inside the context manager. Whatever is returned is stored in 'y' before the resource is closed, and that value of y is returned by the proxy function f. 
It's easy to try: In [23]: SafeOpen('test').read(5) Out[23]: 'this\n' In [24]: SafeOpen('test').readlines() Out[24]: ['this\n', 'and\n', 'that\n', 'and\n', 'other\n'] It would be easy to generalize this to be a SafeAnything class rather than only handle 'open()'. You could just pass in the name of the context manager to the initializer rather than hardcode it as being 'open()'. Actually, I sort of like a factory for producing SafeAnythings better: def safe_factory(context_manager): class SafeAnything(object): def __init__(self, resource, cm=context_manager): self.resource = resource self.cm = cm def __getattr__(self, name): def f(*args, **kws): with self.cm(self.resource) as x: y = getattr(x, name)(*args, **kws) return y return f return SafeAnything SafeOpen = safe_factory(open) -- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions. From mertz at gnosis.cx Sat Apr 27 01:51:47 2013 From: mertz at gnosis.cx (David Mertz) Date: Fri, 26 Apr 2013 16:51:47 -0700 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com> <5712AC18-169A-4EA3-B201-E76C75A8F296@mac.com> <1367002460.31273.140661223235105.509DAA0B@webmail.messagingengine.com> <557D2939-7DC6-4DED-B75E-AC265F287C11@gnosis.cx> <17163806-A689-46D3-8609-3035CC178F17@gnosis.cx> Message-ID: <68297A86-A470-442F-B5B6-5A20CE2791E4@gnosis.cx> On Apr 26, 2013, at 4:14 PM, Bruce Leban wrote: > Yup, you're right. Clever. And I read the code a bit too quickly. :-) I think it's a bit more complicated to write a general wrapper though. For example, to handle chained calls that SQLAlchemy uses you have to know which calls should close the object and which ones shouldn't. > data = SafeSession(...).query(...).filter(...).order_by(...).values(...) Yeah sure. 
I was just demonstrating what's possible in a toy way. I actually feel like simply using the 'with' statement is perfectly fine and perfectly clear. But you *can* you could also wrap the bunch of chained methods if you wanted too. I'd have to think about how to define that properly for a few minutes, but if you get to the point where there are various conditional exits to the chain, just use a 'with' block, for gosh sake. with session(...) as x: x.query(...) x.filter(...) if x.something(): x.order_by(...) else: x.order_by(...) if not x.time_to_leave(): data = x.values() Or whatever details apply to your own program logic. I don't really know SQLAlchemy, but I guess it must be that most of those chained methods return a mutated object, right? That would be easy enough to check for in the proxy, and I guess whenever it got to something that wasn't a mutated object but rather some "plain" values (a list, scalar, dict, etc), that would be time to leave the context manager. I'll leave that as an exercise :-). -- >>> THE MERTZ PRINCIPLE <<< There are two essential virtues in which a sentence might engage: seduction and alliteration. Ancillary virtues include truthfulness, effectivity, and artfulness. From greg.ewing at canterbury.ac.nz Sat Apr 27 01:59:19 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 27 Apr 2013 11:59:19 +1200 Subject: [Python-ideas] Cross Platform Python Sound Module/Library In-Reply-To: <20130426135623.GA11244@iskra.aviel.ru> References: <20130426135623.GA11244@iskra.aviel.ru> Message-ID: <517B14D7.6080709@canterbury.ac.nz> Oleg Broytman wrote: > Are there cross-platform audio libraries that Python could wrap? 
There's OpenAL: http://connect.creativelabs.com/openal/default.aspx -- Greg From greg.ewing at canterbury.ac.nz Sat Apr 27 02:50:47 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sat, 27 Apr 2013 12:50:47 +1200 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <20130425042335.GA60739@cskk.homeip.net> <5178CA66.3060205@pearwood.info> <517AA349.5080504@mrabarnett.plus.com> <517AB749.3040501@mrabarnett.plus.com> Message-ID: <517B20E7.4070003@canterbury.ac.nz> Jim Jewett wrote: > Note that Garbage Collectors need to have a list of "roots" which can > keep things alive, and a way of recognizing links. If some objects > ... use pointers that the garbage collector > doesn't know about ... then there will be objects that > cannot ever be safely collected. Officially, that can be a bug in the > object implementation, but if it leads to a segfault, python still > looks bad. Python's GC is more robust in this respect than traditional mark-and-sweep collectors, because it uses the opposite logic: instead of assuming everything is garbage unless it can prove that it's not, it assumes that nothing is garbage unless it can prove that it is. So if it misses a reference, that might lead to a memory leak, but it won't cause a crash. It also doesn't rely on "roots". It does relies on reference counting to achieve this, however; if something keeps a pointer to an object without increasing its refcount, that could lead to a crash. > Of course, if you're paying this full price anyhow, why bother paying > the additional price of reference-counting? In the case of CPython, it's because the cyclic GC is actually built on top of the refcounting mechanism and wouldn't work without it. So it's not really an additional cost at all. 
-- Greg From random832 at fastmail.us Sat Apr 27 03:43:29 2013 From: random832 at fastmail.us (Random832) Date: Fri, 26 Apr 2013 21:43:29 -0400 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <5B977B1B-CD97-489C-9357-F5851E7E542C@mac.com> <5712AC18-169A-4EA3-B201-E76C75A8F296@mac.com> <1367002460.31273.140661223235105.509DAA0B@webmail.messagingengine.com> Message-ID: <517B2D41.9030103@fastmail.us> On 04/26/2013 03:45 PM, Bruce Leban wrote: > > On Fri, Apr 26, 2013 at 12:30 PM, Andrew Svetlov > > wrote: > > On Fri, Apr 26, 2013 at 9:54 PM, > wrote: > > What about a with expression? > > > > boolean = x.use() with x as open(resource) > > interesting idea. I see nothing bad with proposed construction. > Any objections? I've miss something? > > > The most obvious thing as that 'as' is backwards from the with > statement. Also this requires you to come up with a name which you > have to repeat while > > with(open(resource)).use() > > > doesn't. On the other hand, it allows you to do: > > result = [x.foo(), x.bar()] with open(resource) as x > > which opens the way to freewheeling inline assignment: > > @contextmanager > def assign(x): > yield x > > result = (-b + sqrt(b*b - 4 * a * c)) / (2 * a) with assign(3) as > a with assign(4) as b with assign(5) as c > > > I don't think that's a good thing. If someone wanted to do that, lambda's already here. (lambda a, b, c: (-b + sqrt(b*b - 4 * a * c)) / (2 * a))(3,4,5) The original syntax I was going to suggest would have been something more along the lines of (with open(resource) as x: x.use()), but figured the syntax I ended up posting would be more pythonic (by analogy to list comprehensions or if-else) The purpose of this would be to call cleanup code. Anything can be misused. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From abarnert at yahoo.com Sat Apr 27 05:35:49 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 26 Apr 2013 20:35:49 -0700 Subject: [Python-ideas] Automatic context managers In-Reply-To: References: <20130425042335.GA60739@cskk.homeip.net> <5178CA66.3060205@pearwood.info> <517A874D.7020204@stoneleaf.us> Message-ID: On Apr 26, 2013, at 8:21, anatoly techtonik wrote: > On Fri, Apr 26, 2013 at 4:55 PM, Ethan Furman wrote: >> On 04/26/2013 06:02 AM, anatoly techtonik wrote: >>> On Thu, Apr 25, 2013 at 9:17 AM, Steven D'Aprano > wrote: >>> >>> On 25/04/13 14:47, anatoly techtonik wrote: >>> >>> On Thu, Apr 25, 2013 at 7:23 AM, Cameron Simpson > wrote: >>> >>> >>> Then aren't you just talking about the __del__ method? >>> >>> >>> No. The __del__ method is only called during garbage collection phase which >>> may be delayed. In PySide the QObject is deleted immediately. >>> >>> >>> Citation please. Where is this documented? >>> >>> >>> Here:? http://qt-project.org/wiki/PySide_Pitfalls >>> >>> >>> """ >>> If a QObject falls out of scope in Python, it will get deleted. You have to take care of keeping a reference to the object: >>> >>> * Store it as an attribute of an object you keep around, e.g. self.window = QMainWindow() >>> * Pass a parent QObject to the object???s constructor, so it gets owned by the parent >>> >>> """ >>> >>> This thread on the PySide mailing list suggests that you are mistaken, PySide does not have superpowers over and >>> above Python's garbage collector, and is subject to the exact same non-deterministic destructors as any other Python >>> object. Whether you call that destructor __del__ or __exit__ makes no difference. >>> >>> http://www.mail-archive.com/__pyside at lists.openbossa.org/__msg01029.html >>> >> >> You'll notice it doesn't say "gets /immediately/ deleted" -- because it doesn't. It gets deleted when it gets garbage collected. > > Are you sure about that? The example on the PySide wiki is pretty reproducible. 
With current garbage collector lazyness it should be at least in some cases non-reliable. You're missing something fundamental here. PySide isn't advertising that you can automatically clean up objects just by leaking them; it's warning you that you must retain objects if you don't want them cleaned up. Put another way: correct Python code cannot assume that objects _do_ outlive their last reference... But that doesn't mean you can assume they _don't_, either. You have to assume that both are possible, and code accordingly. And there is really no obvious change to the language that would "fix" that in either direction. Or, rather, there is: make it easier to do the right thing explicitly. Which is exactly what with statements are for. > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From abarnert at yahoo.com Sat Apr 27 05:39:30 2013 From: abarnert at yahoo.com (Andrew Barnert) Date: Fri, 26 Apr 2013 20:39:30 -0700 Subject: [Python-ideas] Cross Platform Python Sound Module/Library In-Reply-To: <517B14D7.6080709@canterbury.ac.nz> References: <20130426135623.GA11244@iskra.aviel.ru> <517B14D7.6080709@canterbury.ac.nz> Message-ID: <885647AC-8D19-4186-A797-6D8BAB25240C@yahoo.com> On Apr 26, 2013, at 16:59, Greg Ewing wrote: > Oleg Broytman wrote: >> Are there cross-platform audio libraries that Python could wrap? > > There's OpenAL: > > http://connect.creativelabs.com/openal/default.aspx > There's actually a bunch of options. The hard question is picking one and endorsing it as "right", or at least "good enough to enshrine in stdlib ala tkinter". 
> -- > Greg > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas From jeanpierreda at gmail.com Sat Apr 27 06:44:16 2013 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Sat, 27 Apr 2013 00:44:16 -0400 Subject: [Python-ideas] Specificity in AttributeError Message-ID: Code:

    class A:
        @property
        def text(self):
            return self.foo

Behavior:

    >>> hasattr(A(), 'text')
    False

Actually, this is absolutely correct. "A().y" does not give a result -- in fact, it raises AttributeError. Behavior-wise, this is exactly what it means for an attribute to not exist. The problem is that this may disguise other issues in one's code. Suppose one tries to do "duck-type checking", where a function might get either an object of type A, or an object of type B. It checks if the object has a certain attribute that objects of type A have, and treats it as one or the other as a result. This is ugly, but works, and even works if someone writes a new type that emulates the API of A or B. Real Problem:

    class B(object):
        def __init__(self, x):
            if hasattr(x, 'text'):
                x = x.text
            self.x = x

Because of an error inside the text property, B has classified x as the wrong type, and this will cause errors later in the execution of the program that might be hard to diagnose. That is what happens when mistakes are made. Worse is that the error is silenced, in a manner similar to how hasattr used to silence things like ValueError. I would not ask that the semantics of hasattr() be changed. hasattr() has a specific meaning, and I don't know what happens if it is changed again. What I would ask is that enough information be added to AttributeError that I can figure out things for myself: specifically, an attribute that stores the object for which attribute access failed, and the attribute name (string) for which attribute access failed.
This information is known when AttributeError is instantiated, but it's used to generate a string description and then thrown out. I would like to be able to write this function:

    def hasattr_lite(obj, attr):
        try:
            getattr(obj, attr)
        except AttributeError as e:
            if e.object is obj and e.attribute is attr:
                return False
            raise
        return True

This would let me do this "duck type checking", but also would reduce the number of errors hidden in the process. So I would like it if AttributeError gained those attributes, or some equivalent functionality. Does that sound reasonable? -- Devin From ethan at stoneleaf.us Sat Apr 27 07:49:09 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Fri, 26 Apr 2013 22:49:09 -0700 Subject: [Python-ideas] Specificity in AttributeError In-Reply-To: References: Message-ID: <517B66D5.1000108@stoneleaf.us> On 04/26/2013 09:44 PM, Devin Jeanpierre wrote: > Code: > class A: > @property > def text(self): > return self.foo > > Behavior: > >>> hasattr(A(), 'text') > False > > Actually, this is absolutely correct. "A().y" does not give a result > -- in fact, it raises AttributeError. Behavior wise, this is exactly > what it means for an attribute to not exist. I think you meant "A().foo" and not "A().y" above. > The problem is that this may disguise other issues in one's code. Like bugs? ;) > I would like to be able to write this function: > > def hasattr_lite(obj, attr): > try: > getattr(obj, attr) > except AttributeError as e: > if e.object is obj and e.attribute is attr: > return False > raise > return True > > Does that sound reasonable? While it's always nice to have extra info in exceptions, why are you coding against bugs? If this is your own code you should have unit tests to catch such things. If this is someone else's code... well, it's their bug.
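For readers following along: the masking Devin describes is easy to reproduce in current Python. A minimal runnable sketch (the class and the typo'd attribute name are illustrative; the `e.object`/`e.attribute` fields he proposes do not exist, so this only demonstrates the status quo):

```python
# 'text' genuinely exists as a property, yet hasattr() reports False,
# because the property body raises AttributeError for an unrelated
# reason (a typo'd attribute name) and hasattr() swallows it.

class A:
    @property
    def text(self):
        return self.foo  # typo: no 'foo' was ever set

a = A()
print(hasattr(a, 'text'))  # False -- the internal error is hidden

# Plain attribute access surfaces the real cause:
try:
    a.text
except AttributeError as e:
    print(e)  # the message complains about 'foo', not 'text'
```

This is exactly why recording the failing object and attribute name on the exception would help: from the outside there is no reliable way to tell "text is missing" apart from "text is present but broken".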
-- ~Ethan~ From jeanpierreda at gmail.com Sat Apr 27 08:59:01 2013 From: jeanpierreda at gmail.com (Devin Jeanpierre) Date: Sat, 27 Apr 2013 02:59:01 -0400 Subject: [Python-ideas] Specificity in AttributeError In-Reply-To: <517B66D5.1000108@stoneleaf.us> References: <517B66D5.1000108@stoneleaf.us> Message-ID: On Sat, Apr 27, 2013 at 1:49 AM, Ethan Furman wrote: > I think you "A().foo" and not "A().y" above. Oops. Sorry. > > >> The problem is that this may disguise other issues in one's code. > > > Like bugs? ;) Yes. > While it's always nice to have extra info in exceptions, why are you coding > against bugs? > > If this is your own code you should have unit tests to catch such things. > > If this is someone else's code... well, it's their bug. Discovering that there is a bug is one thing, discovering why is another. The problems that result from silently doing the wrong thing can be significantly harder to diagnose than an exception traceback is, and this hasattr_lite would let me get an exception in cases where I might otherwise have silently wrong behavior. I mean, yes, this error was in fact found in my unit test suite. I spent a lot of time tracking it down (perhaps too much time, because I was expecting something else [oops, too many changes in one changeset :X]), and eventually narrowed it down to a try/except AttributeError I had. Even then, trying to filter out the legitimate AttributeErrors and the illegitimate one from the same test case was annoying. I ended up breaking at the except block and individually examining the exceptions, both legitimate and not. This solved everything. hasattr_lite (if it worked) would've reported the problem as an AttributeError, with exactly the typo I had made in an attribute name inside a property. Seconds to figure out and fix. So what I mean is, it isn't necessary, but I would find it helpful and convenient. 
-- Devin From ethan at stoneleaf.us Sat Apr 27 09:39:29 2013 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 27 Apr 2013 00:39:29 -0700 Subject: [Python-ideas] Specificity in AttributeError In-Reply-To: References: <517B66D5.1000108@stoneleaf.us> Message-ID: <517B80B1.40605@stoneleaf.us> On 04/26/2013 11:59 PM, Devin Jeanpierre wrote: > On Sat, Apr 27, 2013 at 1:49 AM, Ethan Furman wrote: > > Discovering that there is a bug is one thing, discovering why is > another. The problems that result from silently doing the wrong thing > can be significantly harder to diagnose than an exception traceback > is, and this hasattr_lite would let me get an exception in cases where > I might otherwise have silently wrong behavior. > > I mean, yes, this error was in fact found in my unit test suite. I > spent a lot of time tracking it down (perhaps too much time, because I > was expecting something else [oops, too many changes in one changeset > :X]), and eventually narrowed it down to a try/except > AttributeError I had. Even then, trying to filter out the legitimate > AttributeErrors and the illegitimate one from the same test case was > annoying. I ended up breaking at the except block and individually > examining the exceptions, both legitimate and not. This solved > everything. > > hasattr_lite (if it worked) would've reported the problem as an > AttributeError, with exactly the typo I had made in an attribute name > inside a property. Seconds to figure out and fix. > > So what I mean is, it isn't necessary, but I would find it helpful and > convenient. I absolutely agree, and would like to see the available info on other exceptions (KeyError, IndexError, etc.) as well. I suspect it would take some serious effort to upgrade all the exceptions from all the places they can be raised from, though. 
-- ~Ethan~ From solipsis at pitrou.net Sat Apr 27 12:51:26 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Sat, 27 Apr 2013 12:51:26 +0200 Subject: [Python-ideas] Cross Platform Python Sound Module/Library References: <20130426135623.GA11244@iskra.aviel.ru> <517B14D7.6080709@canterbury.ac.nz> <885647AC-8D19-4186-A797-6D8BAB25240C@yahoo.com> Message-ID: <20130427125126.45c7ce59@fsol> On Fri, 26 Apr 2013 20:39:30 -0700 Andrew Barnert wrote: > On Apr 26, 2013, at 16:59, Greg Ewing wrote: > > > Oleg Broytman wrote: > >> Are there cross-platform audio libraries that Python could wrap? > > > > There's OpenAL: > > > > http://connect.creativelabs.com/openal/default.aspx > > > There's actually a bunch of options. > > The hard question is picking one and endorsing it as "right", or at least > "good enough to enshrine in stdlib ala tkinter". When you notice how "good enough" tkinter is (and has been for 10 years at least), you realize the trap hidden in this question. Really, see my message earlier in this thread. This is better left to third-party libraries (which already exist, please do some research). Regards Antoine. From ned at nedbatchelder.com Sat Apr 27 13:53:54 2013 From: ned at nedbatchelder.com (Ned Batchelder) Date: Sat, 27 Apr 2013 07:53:54 -0400 Subject: [Python-ideas] Specificity in AttributeError In-Reply-To: <517B80B1.40605@stoneleaf.us> References: <517B66D5.1000108@stoneleaf.us> <517B80B1.40605@stoneleaf.us> Message-ID: <517BBC52.1010707@nedbatchelder.com> On 4/27/2013 3:39 AM, Ethan Furman wrote: > On 04/26/2013 11:59 PM, Devin Jeanpierre wrote: >> On Sat, Apr 27, 2013 at 1:49 AM, Ethan Furman >> wrote: >> >> Discovering that there is a bug is one thing, discovering why is >> another. The problems that result from silently doing the wrong thing >> can be significantly harder to diagnose than an exception traceback >> is, and this hasattr_lite would let me get an exception in cases where >> I might otherwise have silently wrong behavior. 
>> >> I mean, yes, this error was in fact found in my unit test suite. I >> spent a lot of time tracking it down (perhaps too much time, because I >> was expecting something else [oops, too many changes in one changeset >> :X]), and eventually narrowed it down to a try/except >> AttributeError I had. Even then, trying to filter out the legitimate >> AttributeErrors and the illegitimate one from the same test case was >> annoying. I ended up breaking at the except block and individually >> examining the exceptions, both legitimate and not. This solved >> everything. >> >> hasattr_lite (if it worked) would've reported the problem as an >> AttributeError, with exactly the typo I had made in an attribute name >> inside a property. Seconds to figure out and fix. >> >> So what I mean is, it isn't necessary, but I would find it helpful and >> convenient. > > I absolutely agree, and would like to see the available info on other > exceptions (KeyError, IndexError, etc.) as well. I suspect it would > take some serious effort to upgrade all the exceptions from all the > places they can be raised from, though. I also agree that more information can only be a good thing. Unless someone can show why it could be harmful (cycles caused by the exception keeping a reference to the offending object??), the only downside I can see is the work needed to change the throw points. --Ned. 
> > -- > ~Ethan~ > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From g.brandl at gmx.net Sat Apr 27 14:04:58 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Sat, 27 Apr 2013 14:04:58 +0200 Subject: [Python-ideas] Specificity in AttributeError In-Reply-To: <517BBC52.1010707@nedbatchelder.com> References: <517B66D5.1000108@stoneleaf.us> <517B80B1.40605@stoneleaf.us> <517BBC52.1010707@nedbatchelder.com> Message-ID: Am 27.04.2013 13:53, schrieb Ned Batchelder: >>> hasattr_lite (if it worked) would've reported the problem as an >>> AttributeError, with exactly the typo I had made in an attribute name >>> inside a property. Seconds to figure out and fix. >>> >>> So what I mean is, it isn't necessary, but I would find it helpful and >>> convenient. >> >> I absolutely agree, and would like to see the available info on other >> exceptions (KeyError, IndexError, etc.) as well. I suspect it would >> take some serious effort to upgrade all the exceptions from all the >> places they can be raised from, though. > > I also agree that more information can only be a good thing. Unless > someone can show why it could be harmful (cycles caused by the exception > keeping a reference to the offending object??), the only downside I can > see is the work needed to change the throw points. It is kind of harmful to duck-typing and, to a lesser degree, inheritance: so far Python has never guaranteed anything about exception arguments. If the exception attributes become part of the interface of standard types, everyone implementing a replacement will have to conform (and there are lots and lots of such replacements out there). This change should be treated akin to adding a new method to dictionaries, for example. That said, personally I would be in favour of such a change, because the advantage for unhandled exceptions alone is significant. 
Georg From masklinn at masklinn.net Sat Apr 27 15:05:50 2013 From: masklinn at masklinn.net (Masklinn) Date: Sat, 27 Apr 2013 15:05:50 +0200 Subject: [Python-ideas] Specificity in AttributeError In-Reply-To: References: Message-ID: <11BA3CEF-2115-4398-92AB-4A6E5C034BF7@masklinn.net> I don't know if it ties into this proposal or not, but I've had a pair of issues with attribute resolution in the past, more specifically with __getattr__: 1. __getattr__ is not implemented on object, thus implementing __getattr__ in an inheritance hierarchy (where another class in the MRO may also have implemented __getattr__) requires boilerplate along the lines of:

    sup = getattr(super(Cls, self), '__getattr__', None)
    if sup is not None:
        return sup(key)

2. __getattr__ *must raise* to signify a missing attribute, as None will simply be returned. The issue here is twofold: * the exception message will often be nothing like the default one * the stack will be all wrong, as it will show "within" the __getattr__ call, making it harder to discriminate between an expected attribute error and something unexpectedly blowing up within __getattr__ I was wondering if it wouldn't be possible to add __getattr__ to object, which would return NotImplemented. And NotImplemented would be interpreted by the attribute resolution process as "raise the normal AttributeError" as if there had not been a __getattr__. This way, attribute errors from __getattr__ not matching the provided name would look much more natural. I also believe it is backwards compatible: current __getattr__ implementations which just raise & don't delegate to a super() will behave exactly the same way, with the same issues.
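Masklinn's point 1, expanded into a runnable sketch of the cooperative boilerplate (class names are illustrative; this uses Python 3's zero-argument super()):

```python
# Each class in the MRO must probe whether some ancestor defines
# __getattr__, because object itself does not implement it.

class Base:
    def __getattr__(self, name):
        if name == 'base_attr':
            return 'from Base'
        # Delegate up the MRO only if an ancestor implements __getattr__.
        sup = getattr(super(), '__getattr__', None)
        if sup is not None:
            return sup(name)
        raise AttributeError(name)

class Child(Base):
    def __getattr__(self, name):
        if name == 'child_attr':
            return 'from Child'
        sup = getattr(super(), '__getattr__', None)
        if sup is not None:
            return sup(name)
        raise AttributeError(name)

c = Child()
print(c.child_attr)  # 'from Child'
print(c.base_attr)   # 'from Base' -- resolved via the super() probe
```

The probe is needed precisely because `object` defines `__getattribute__` but not `__getattr__`, so an unconditional `super().__getattr__(name)` would itself raise AttributeError once it reaches the top of the MRO.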
From brett at python.org Sat Apr 27 17:10:33 2013 From: brett at python.org (Brett Cannon) Date: Sat, 27 Apr 2013 11:10:33 -0400 Subject: [Python-ideas] Specificity in AttributeError In-Reply-To: References: <517B66D5.1000108@stoneleaf.us> <517B80B1.40605@stoneleaf.us> <517BBC52.1010707@nedbatchelder.com> Message-ID: On Sat, Apr 27, 2013 at 8:04 AM, Georg Brandl wrote: > Am 27.04.2013 13:53, schrieb Ned Batchelder: > >>>> hasattr_lite (if it worked) would've reported the problem as an >>>> AttributeError, with exactly the typo I had made in an attribute name >>>> inside a property. Seconds to figure out and fix. >>>> >>>> So what I mean is, it isn't necessary, but I would find it helpful and >>>> convenient. >>> >>> I absolutely agree, and would like to see the available info on other >>> exceptions (KeyError, IndexError, etc.) as well. I suspect it would >>> take some serious effort to upgrade all the exceptions from all the >>> places they can be raised from, though. >> >> I also agree that more information can only be a good thing. Unless >> someone can show why it could be harmful (cycles caused by the exception >> keeping a reference to the offending object??), the only downside I can >> see is the work needed to change the throw points. > > It is kind of harmful to duck-typing and, to a lesser degree, inheritance: > so far Python has never guaranteed anything about exception arguments. > If the exception attributes become part of the interface of standard types, > everyone implementing a replacement will have to conform (and there are > lots and lots of such replacements out there). This change should be > treated akin to adding a new method to dictionaries, for example. > > That said, personally I would be in favour of such a change, because the > advantage for unhandled exceptions alone is significant. I can speak from two bits of experience on this. 
First is http://python.org/dev/peps/pep-0352/ where I tried to make BaseException only accept a single argument and add a 'message' attribute. I actually gave up on the single-argument version because too much code relied on the *args acceptance of BaseException. I gave up on 'message' because too many people subclassed exceptions and used 'message' as an attribute. But I also added 'name' and 'path' to ImportError in Python 3.3 successfully. I made them keyword-only arguments to the exception's constructor to avoid API problems and not enough people construct ImportError instances directly for me to have heard complaints about the attribute names. My point is that Georg is right: tweaking the base exceptions by even adding an attribute can be touchy, but worth it. From phd at phdru.name Sat Apr 27 18:42:17 2013 From: phd at phdru.name (Oleg Broytman) Date: Sat, 27 Apr 2013 20:42:17 +0400 Subject: [Python-ideas] Cross Platform Python Sound Module/Library In-Reply-To: <517B14D7.6080709@canterbury.ac.nz> References: <20130426135623.GA11244@iskra.aviel.ru> <517B14D7.6080709@canterbury.ac.nz> Message-ID: <20130427164217.GA12418@iskra.aviel.ru> On Sat, Apr 27, 2013 at 11:59:19AM +1200, Greg Ewing wrote: > Oleg Broytman wrote: > >Are there cross-platform audio libraries that Python could wrap? > > There's OpenAL: > > http://connect.creativelabs.com/openal/default.aspx And there is PyOpenAL. There is SDL and PyGame on top of it. Hence the original poster's problem has already been solved. In many ways -- see http://wiki.python.org/moin/PythonGameLibraries for other solutions. Oleg. -- Oleg Broytman http://phdru.name/ phd at phdru.name Programmers don't die, they just GOSUB without RETURN.
From ericsnowcurrently at gmail.com Sat Apr 27 20:04:44 2013 From: ericsnowcurrently at gmail.com (Eric Snow) Date: Sat, 27 Apr 2013 12:04:44 -0600 Subject: [Python-ideas] Specificity in AttributeError In-Reply-To: References: Message-ID: On Apr 26, 2013 10:45 PM, "Devin Jeanpierre" wrote: > > Code: > class A: > @property > def text(self): > return self.foo > > Behavior: > >>> hasattr(A(), 'text') > False > > Actually, this is absolutely correct. "A().y" does not give a result > -- in fact, it raises AttributeError. Behavior wise, this is exactly > what it means for an attribute to not exist. > What about using inspect.getattr_static()?

    from inspect import getattr_static

    def hasattr_static(obj, name):
        try:
            getattr_static(obj, name)
        except AttributeError:
            return False
        else:
            return True

-eric -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Sat Apr 27 22:12:38 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Sat, 27 Apr 2013 23:12:38 +0300 Subject: [Python-ideas] Cross Platform Python Sound Module/Library In-Reply-To: <517B14D7.6080709@canterbury.ac.nz> References: <20130426135623.GA11244@iskra.aviel.ru> <517B14D7.6080709@canterbury.ac.nz> Message-ID: On Sat, Apr 27, 2013 at 2:59 AM, Greg Ewing wrote: > Oleg Broytman wrote: > >> Are there cross-platform audio libraries that Python could wrap? >> > > There's OpenAL: > > http://connect.creativelabs.com/openal/default.aspx Proprietary since v1.1 (c) http://en.wikipedia.org/wiki/OpenAL -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed...
URL: From techtonik at gmail.com Sat Apr 27 22:19:20 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Sat, 27 Apr 2013 23:19:20 +0300 Subject: [Python-ideas] Cross Platform Python Sound Module/Library In-Reply-To: <20130427125126.45c7ce59@fsol> References: <20130426135623.GA11244@iskra.aviel.ru> <517B14D7.6080709@canterbury.ac.nz> <885647AC-8D19-4186-A797-6D8BAB25240C@yahoo.com> <20130427125126.45c7ce59@fsol> Message-ID: On Sat, Apr 27, 2013 at 1:51 PM, Antoine Pitrou wrote: > On Fri, 26 Apr 2013 20:39:30 -0700 > Andrew Barnert wrote: > > On Apr 26, 2013, at 16:59, Greg Ewing > wrote: > > > > > Oleg Broytman wrote: > > >> Are there cross-platform audio libraries that Python could wrap? > > > > > > There's OpenAL: > > > > > > http://connect.creativelabs.com/openal/default.aspx > > > > > There's actually a bunch of options. > > > > The hard question is picking one and endorsing it as "right", or at least > > "good enough to enshrine in stdlib ala tkinter". > > When you notice how "good enough" tkinter is (and has been for 10 > years at least), you realize the trap hidden in this question. > > Really, see my message earlier in this thread. This is better left to > third-party libraries (which already exist, please do some research). > From the other side, if 80% of cases can be covered without Python packaging problems, that's already an advantage. For example, most people find the date / time functionality in Python enough to avoid using mxDateTime as a dependency. As for audio, most people find it insufficient. -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From mal at egenix.com Sat Apr 27 22:59:01 2013 From: mal at egenix.com (M.-A.
Lemburg) Date: Sat, 27 Apr 2013 22:59:01 +0200 Subject: [Python-ideas] Cross Platform Python Sound Module/Library In-Reply-To: References: <20130426135623.GA11244@iskra.aviel.ru> <517B14D7.6080709@canterbury.ac.nz> <885647AC-8D19-4186-A797-6D8BAB25240C@yahoo.com> <20130427125126.45c7ce59@fsol> Message-ID: <517C3C15.5060906@egenix.com> On 27.04.2013 22:19, anatoly techtonik wrote: > On Sat, Apr 27, 2013 at 1:51 PM, Antoine Pitrou wrote: > >> On Fri, 26 Apr 2013 20:39:30 -0700 >> Andrew Barnert wrote: >>> On Apr 26, 2013, at 16:59, Greg Ewing >> wrote: >>> >>>> Oleg Broytman wrote: >>>>> Are there cross-platform audio libraries that Python could wrap? >>>> >>>> There's OpenAL: >>>> >>>> http://connect.creativelabs.com/openal/default.aspx >>>> >>> There's actually a bunch of options. >>> >>> The hard question is picking one and endorsing it as "right", or at least >>> "good enough to enshrine in stdlib ala tkinter". >> >> When you notice how "good enough" tkinter is (and has been for 10 >> years at least), you realize the trap hidden in this question. >> >> Really, see my message earlier in this thread. This is better left to >> third-party libraries (which already exist, please do some research). >> > >>From the other side if 80% of cases can be covered without Python packaging > problems - that's already an advantage. For example most people find date / > time functionality in Python enough to avoid using mxDateTime as a > dependency. As for audio, most people find it insufficient. I'm not sure whether 3D audio support is really needed as core feature in a general purpose programming language ;-) I'd suggest to have a look at http://www.libsdl.org/, which can be used from Python via http://pygame.org/ -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Apr 27 2013) >>> Python Projects, Consulting and Support ... http://www.egenix.com/ >>> mxODBC.Zope/Plone.Database.Adapter ... 
http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2013-04-17: Released eGenix mx Base 3.2.6 ... http://egenix.com/go43 ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From greg.ewing at canterbury.ac.nz Sun Apr 28 04:12:25 2013 From: greg.ewing at canterbury.ac.nz (Greg Ewing) Date: Sun, 28 Apr 2013 14:12:25 +1200 Subject: [Python-ideas] Specificity in AttributeError In-Reply-To: <11BA3CEF-2115-4398-92AB-4A6E5C034BF7@masklinn.net> References: <11BA3CEF-2115-4398-92AB-4A6E5C034BF7@masklinn.net> Message-ID: <517C8589.4090007@canterbury.ac.nz> Masklinn wrote: > 1. __getattr__ is not implemented on object, thus implementing __getattr__ > in an inheritance hierarchy (where an other object in an MRO may also > have implemented __getattr__) requires boilerplate The same applies to any other special method that object doesn't implement. What is it about __getattr__ that makes it deserving of this treatment? > I was wondering if it wouldn't be possible to add __getattr__ to object, > which would return NotImplemented. And NotImplemented would be interpreted > by the attribute resolution process as "raise the normal AttributeError" That would make it impossible for NotImplemented to be the value of any attribute of anything. 
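Greg's objection can be made concrete in a few lines (class names are illustrative). NotImplemented is an ordinary value in today's Python -- binary special methods return it to mean "try the other operand" -- so overloading it as an "attribute missing" sentinel would make any attribute whose value genuinely is NotImplemented unreachable:

```python
# NotImplemented is a perfectly legal attribute value today:
class Stub:
    compare = NotImplemented  # e.g. recording "comparison unsupported"

s = Stub()
print(s.compare is NotImplemented)  # True

# A __getattr__ that forwards stored values would silently change
# meaning for this one value under the proposal:
class Proxy:
    def __init__(self, wrapped):
        self._wrapped = wrapped
    def __getattr__(self, name):
        return getattr(self._wrapped, name)

p = Proxy(s)
print(p.compare is NotImplemented)  # True today; would raise AttributeError
                                    # if NotImplemented meant "missing"
```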
-- Greg From random832 at fastmail.us Sun Apr 28 05:03:20 2013 From: random832 at fastmail.us (Random832) Date: Sat, 27 Apr 2013 23:03:20 -0400 Subject: [Python-ideas] Specificity in AttributeError In-Reply-To: References: Message-ID: <517C9178.8050009@fastmail.us> On 04/27/2013 12:44 AM, Devin Jeanpierre wrote: > Code: > class A: > @property > def text(self): > return self.foo > > Behavior: > >>> hasattr(A(), 'text') > False > > Real Problem: > class B(object): > def __init__(self, x): > if hasattr(x, 'text'): > x = x.text > self.x = x > What this is telling me: class A: def __hasattr__(self, name): if name == 'text': return True return super(A, self).__hasattr__(name) This may or may not be something that @property, or some other decorator, ought to take care of. From haoyi.sg at gmail.com Sun Apr 28 05:05:25 2013 From: haoyi.sg at gmail.com (Haoyi Li) Date: Sat, 27 Apr 2013 20:05:25 -0700 Subject: [Python-ideas] Macros for Python In-Reply-To: References: <8A011CC8-5B7E-4BFB-A3A1-6C939712AC47@yahoo.com> Message-ID: I pushed a simple implementation of case classes using Macros, as well as a really nice to use parser combinator library. The case classes are interesting because they overlap a lot with enumerations: auto-generated __str__, __repr__, inheritence via nesting, they can have members and methods, etc. They also show off pretty well how far Python's syntax (and semantic!) can be stretched using macros, so if anyone still has some crazy ideas for enumerations and wants to prototype them without hacking the CPython interpreter, this is your chance! Thanks! -Haoyi On Wed, Apr 24, 2013 at 3:15 PM, Haoyi Li wrote: > @Jonathan: That would be possible, although I can't say I know how to do > it. 
A naive macro that wraps everything and has a "substitute awaits for > yields, wrap them in inlineCallbacks(), and substitute returns for > returnValue()s" may work, but I'm guessing it would run into a forest of > edge cases where the code isn't so simple (what if you *want* a return? > etc.). > > pdb *should* show the code after macro expansion. Without source maps, I'm > not sure there's any way around that, so debugging may be hard. > > Of course, if the alternative is macros of forking the interpreter, maybe > macros is the easier way to do it =) Debugging a buggy custom-forked > interpreter probably isn't easy either! > > > On Wed, Apr 24, 2013 at 5:48 PM, Jonathan Slenders wrote: > >> One use case I have is for Twisted's inlineCallbacks. I forked the >> pypy project to implement the await-keyword. Basically it transforms: >> >> def async_function(deferred_param): >> a = await deferred_param >> b = await some_call(a) >> return b >> >> into: >> >> @defer.inlineCallbacks >> def async_function(deferred_param): >> a = yield deferred_param >> b = yield some_call(a) >> yield defer.returnValue(b) >> >> >> Are such things possible? And if so, what lines of code would pdb show >> during introspection of the code? >> >> It's interesting, but when macros become more complicated, the >> debugging of these things can turn out to be really hard, I think. >> >> >> 2013/4/24 Haoyi Li : >> > I haven't tested in on various platforms, so hard to say for sure. >> MacroPy >> > basically relies on a few things: >> > >> > - exec/eval >> > - PEP 302 >> > - the ast module >> > >> > All of these are pretty old pieces of python (almost 10 years old!) so >> it's >> > not some new-and-fancy functionality. Jython seems to have all of them, >> I >> > couldn't find any information about PyPy. >> > >> > When the project is more mature and I have some time, I'll see if I can >> get >> > it to work cross platform. If anyone wants to fork the repo and try it >> out, >> > that'd be great too! 
>> > >> > -Haoyi >> > >> > >> > >> > >> > >> > On Wed, Apr 24, 2013 at 11:55 AM, Andrew Barnert >> wrote: >> >> >> >> On Apr 24, 2013, at 8:05, Haoyi Li wrote: >> >> >> >> You actually can get a syntax like that without macros, using >> >> stack-introspection, locals-trickery and lots of `eval`. The question >> is >> >> whether you consider macros more "extreme" than stack-introspection, >> >> locals-trickery and `eval`! A JIT compiler will probably be much >> happier >> >> with macros. >> >> >> >> >> >> That last point makes this approach seem particularly interesting to >> me, >> >> which makes me wonder: Is your code CPython specific, or does it also >> work >> >> with PyPy (or Jython or Iron)? While PyPy is obviously a whole lot >> easier to >> >> mess with in the first place than CPython, having macros at the same >> >> language level as your code is just as interesting in both >> implementations. >> >> >> >> >> >> On Wed, Apr 24, 2013 at 10:35 AM, Terry Jan Reedy >> >> wrote: >> >>> >> >>> On 4/23/2013 11:49 PM, Haoyi Li wrote: >> >>>> >> >>>> I thought this may be of interest to some people on this list, even >> if >> >>>> not strictly an "idea". >> >>>> >> >>>> I'm working on MacroPy , a >> little >> >>>> >> >>>> pure-python library that allows user-defined AST rewrites as part of >> the >> >>>> import process (using PEP 302). >> >>> >> >>> >> >>> From the readme >> >>> ''' >> >>> String Interpolation >> >>> >> >>> a, b = 1, 2 >> >>> c = s%"%{a} apple and %{b} bananas" >> >>> print c >> >>> #1 apple and 2 bananas >> >>> ''' >> >>> I am a little surprised that you would base a cutting edge extension >> on >> >>> Py 2. Do you have it working with 3.3 also? >> >>> >> >>> '''Unlike the normal string interpolation in Python, MacroPy's string >> >>> interpolation allows the programmer to specify the variables to be >> >>> interpolated inline inside the string.''' >> >>> >> >>> Not true as I read that. 
>> >>> >> >>> a, b = 1, 2 >> >>> print("{a} apple and {b} bananas".format(**locals())) >> >>> print("%(a)s apple and %(b)s bananas" % locals()) >> >>> #1 apple and 2 bananas >> >>> #1 apple and 2 bananas >> >>> >> >>> I rather like the anon funcs with anon params. That only works when >> each >> >>> param is only used once in the expression, but that restriction is the >> >>> normal case. >> >>> >> >>> I am interested to see what you do with pattern matching. >> >>> >> >>> tjr >> >>> >> >>> _______________________________________________ >> >>> Python-ideas mailing list >> >>> Python-ideas at python.org >> >>> http://mail.python.org/mailman/listinfo/python-ideas >> >> >> >> >> >> _______________________________________________ >> >> Python-ideas mailing list >> >> Python-ideas at python.org >> >> http://mail.python.org/mailman/listinfo/python-ideas >> > >> > >> > >> > _______________________________________________ >> > Python-ideas mailing list >> > Python-ideas at python.org >> > http://mail.python.org/mailman/listinfo/python-ideas >> > >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Sun Apr 28 12:22:56 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Sun, 28 Apr 2013 13:22:56 +0300 Subject: [Python-ideas] itertools.chunks() In-Reply-To: References: Message-ID: On Sat, Apr 6, 2013 at 3:50 PM, Giampaolo Rodol? wrote: > def chunks(total, step): > assert total >= step > while total > step: > yield step; > total -= step; > if total: > yield total > > >>> chunks(12, 4) > [4, 4, 4] > >>> chunks(13, 4) > [4, 4, 4, 1] > > > I'm not sure how appropriate "chunks" is as a name for such a function. > This name is better to be reserved for chunking actual data rather than indexes: http://stackoverflow.com/questions/312443/how-do-you-split-a-list-into-evenly-sized-chunks-in-python > Now I wonder, would it make sense to have something like this into > itertools module? > -- anatoly t. 
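For the record, the distinction being drawn here is that Giampaolo's generator yields chunk *sizes*, while the Stack Overflow question linked is about chunking the *data itself*. A minimal sketch of both, with the generator results wrapped in `list()` for display (the helper names are illustrative, not a proposed API):

```python
from itertools import islice

def chunk_sizes(total, step):
    """Yield `step` repeatedly until `total` is exhausted (Giampaolo's idea)."""
    assert total >= step
    while total > step:
        yield step
        total -= step
    if total:
        yield total

def chunked(iterable, size):
    """Chunk the data itself: yield successive lists of up to `size` items
    (the behaviour the linked Stack Overflow question asks about)."""
    it = iter(iterable)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block

print(list(chunk_sizes(13, 4)))     # [4, 4, 4, 1]
print(list(chunked(range(13), 4)))  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12]]
```

The data-chunking case eventually did land in the stdlib, as `itertools.batched()` in Python 3.12 (yielding tuples rather than lists).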
-------------- next part -------------- An HTML attachment was scrubbed... URL: From techtonik at gmail.com Sun Apr 28 17:37:24 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Sun, 28 Apr 2013 18:37:24 +0300 Subject: [Python-ideas] Personal views/filters (summaries) for discussions Message-ID: I find it really hard to track proposals, ideas and various deviations in mailing lists, which is especially actual for lists such as python-ideas. I bet other people experience this problem too. The typical scenario: 1. You make a proposal 2. The discussion continues 3. Part of the discussion is hijacked 4. Another part brings the problem you haven't seen 5. You don't have time to investigate the problem 6. Discussion continues 7. Thread quickly gets out of scope of daily emails 8. Contact lost Several week later you remember about the proposal: 9. You open the original proposal to notice a small novel 10. You start to reread 11. Got confused 13. Recall the details 14, Find a way out from irrelevant deviation 15. Encounter the problem 16. Spend what is left to investigate the problem 17. Run out of time The major problem I have is steps 9-15. Sometimes these take the most of the time. What would help to make all the collaboration here more productive are colored view/filters (summaries) for discussions. It would work like so: 00. The discussion is laid out as a single page 01. You define some aspect of discussion (name the filter) 02. You mark text related to the aspect 03. You save the markings. 04. You insert summaries and TODOs 05. Now you select the aspect 06. Irrelevant parts are grayed out 07. Additionally you can collapse grayed sections An ability to edit and enhance these filters will allow to devote a small bits of free time to analyze and summarize the discussion state instead of requiring a single big piece to reread the whole discussion. This way you can split the task of dealing with complexity over time, which I think is more than actual nowadays. 
IMO this process can be very beneficial for Python development. -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ned at nedbatchelder.com Sun Apr 28 18:35:54 2013 From: ned at nedbatchelder.com (Ned Batchelder) Date: Sun, 28 Apr 2013 12:35:54 -0400 Subject: [Python-ideas] Personal views/filters (summaries) for discussions In-Reply-To: References: Message-ID: <517D4FEA.9090605@nedbatchelder.com> On 4/28/2013 11:37 AM, anatoly techtonik wrote: > > An ability to edit and enhance these filters will allow to devote a > small bits of free time to analyze and summarize the discussion state > instead of requiring a single big piece to reread the whole discussion. > > This way you can split the task of dealing with complexity over time, > which I think is more than actual nowadays. IMO this process can be > very beneficial for Python development. > -- > anatoly t. Anatoly, there are dozens if not hundreds of collaboration tools. Each implements a particular model of how people will work together, and each has its proponents and detractors. Large projects like CPython use mailing lists because email is an established platform-agnostic technology that everyone has access to. It's not fancy, and there are issues like top vs bottom posting; people quoting too much, or not enough; threads being derailed; etc. But human communication is inherently messy and difficult. I very much doubt that any structured tool would find wide acceptance, or would significantly change the dynamics of our discussions together. People are difficult: they all think differently, in different languages, on different time scales, in different time zones. An idea I think is obviously good, you may think is obviously bad. Getting to the heart of how two reasonable people can disagree so starkly is difficult work. No amount of workflow is going to make it easier. Don't invest in collaboration tools. Invest in understanding people. 
Proposing changes to Python is difficult. There are many people who need convincing, and convincing them takes time. They will have objections that have to be addressed, and it may be difficult to understand their objections. The original proposal may have been unclear, and you have to work to figure out what was obvious to you that has to be spelled out, and then spell it out. This is all a lot of work, and it takes a lot of time. I don't see a way around that. If someone understood the entire discussion well enough to apply filters, etc, then we'd already have reached an agreement. I don't know how much appetite the Python-ideas list will have for further discussions of these ideas. Feel free to write to me off-list if you like. --Ned. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas -------------- next part -------------- An HTML attachment was scrubbed... URL: From stephen at xemacs.org Sun Apr 28 20:42:17 2013 From: stephen at xemacs.org (Stephen J. Turnbull) Date: Mon, 29 Apr 2013 03:42:17 +0900 Subject: [Python-ideas] Personal views/filters (summaries) for discussions In-Reply-To: <517D4FEA.9090605@nedbatchelder.com> References: <517D4FEA.9090605@nedbatchelder.com> Message-ID: <87fvyawl86.fsf@uwakimon.sk.tsukuba.ac.jp> Ned Batchelder writes: > I don't see a way around that. If someone understood the entire > discussion well enough to apply filters, etc, then we'd already > have reached an agreement. +1 Almost a tautology, but even so a crucial insight. > I don't know how much appetite the Python-ideas list will have for > further discussions of these ideas. It's really off-topic. Python-ideas is for proposals to change Python (the language) or cpython (or other implementations) that aren't concrete enough or are too bike-sheddable to belong on python-dev. 
If he were to write such a tool in Python and ask for advice, Oleg would show up and tell him to ask on python-list. :-) If he had a specific proposal to adopt an existing workflow tool that could be just plugged in, that would be on-topic (for lack of an open- subscription python-cabal list).[1] Footnotes: [1] open-cabal is an oxymoron, of course. And TINC. Of course. ;-) From tjreedy at udel.edu Sun Apr 28 20:43:56 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Sun, 28 Apr 2013 14:43:56 -0400 Subject: [Python-ideas] Personal views/filters (summaries) for discussions In-Reply-To: References: Message-ID: On 4/28/2013 11:37 AM, anatoly techtonik wrote: > I find it really hard to track proposals, ideas and various deviations > in mailing lists, which is especially actual for lists such as > python-ideas. I bet other people experience this problem too. The > typical scenario: > > 1. You make a proposal > 2. The discussion continues > 3. Part of the discussion is hijacked > 4. Another part brings the problem you haven't seen > 5. You don't have time to investigate the problem > 6. Discussion continues > 7. Thread quickly gets out of scope of daily emails > 8. Contact lost > > Several week later you remember about the proposal: > > 9. You open the original proposal to notice a small novel > 10. You start to reread > 11. Got confused > 13. Recall the details > 14, Find a way out from irrelevant deviation > 15. Encounter the problem > 16. Spend what is left to investigate the problem > 17. Run out of time > > The major problem I have is steps 9-15. Sometimes these take the most of > the time. What would help to make all the collaboration here more > productive are colored view/filters (summaries) for discussions. It > would work like so: > > 00. The discussion is laid out as a single page This is what the PEP process is about. Anyone can summarize a idea as a proto-pep either initially or after preliminary discussion. 
Objections and unresolved issues are part of a pep. Revisions and reposting are part of the process. > 01. You define some aspect of discussion (name the filter) > 02. You mark text related to the aspect > 03. You save the markings. > 04. You insert summaries and TODOs > > 05. Now you select the aspect > 06. Irrelevant parts are grayed out > 07. Additionally you can collapse grayed sections > > An ability to edit and enhance these filters will allow to devote a > small bits of free time to analyze and summarize the discussion state > instead of requiring a single big piece to reread the whole discussion. > > This way you can split the task of dealing with complexity over time, > which I think is more than actual nowadays. IMO this process can be very > beneficial for Python development. > -- > anatoly t. > > > _______________________________________________ > Python-ideas mailing list > Python-ideas at python.org > http://mail.python.org/mailman/listinfo/python-ideas > From techtonik at gmail.com Sun Apr 28 23:48:43 2013 From: techtonik at gmail.com (anatoly techtonik) Date: Mon, 29 Apr 2013 00:48:43 +0300 Subject: [Python-ideas] hexdump In-Reply-To: <20120512091543.GA5284@iskra.aviel.ru> References: <20120512091543.GA5284@iskra.aviel.ru> Message-ID: On Sat, May 12, 2012 at 12:15 PM, Oleg Broytman wrote: > On Sat, May 12, 2012 at 11:59:03AM +0300, anatoly techtonik < > techtonik at gmail.com> wrote: > > Just an idea of usability fix for Python 3. > > hexdump module (function or bytes method is better) as simple, easy > > and intuitive way for dumping binary data when writing programs in > > Python. > > Well, you know, the way to add such modules to Python is via > Cheeseshop. Done. https://pypi.python.org/pypi/hexdump -- anatoly t. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From solipsis at pitrou.net Mon Apr 29 10:59:18 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 29 Apr 2013 10:59:18 +0200 Subject: [Python-ideas] hexdump References: <20120512091543.GA5284@iskra.aviel.ru> Message-ID: <20130429105918.525e04d0@pitrou.net> Le Mon, 29 Apr 2013 00:48:43 +0300, anatoly techtonik a ?crit : > On Sat, May 12, 2012 at 12:15 PM, Oleg Broytman > wrote: > > > On Sat, May 12, 2012 at 11:59:03AM +0300, anatoly techtonik < > > techtonik at gmail.com> wrote: > > > Just an idea of usability fix for Python 3. > > > hexdump module (function or bytes method is better) as simple, > > > easy and intuitive way for dumping binary data when writing > > > programs in Python. > > > > Well, you know, the way to add such modules to Python is via > > Cheeseshop. > > Done. https://pypi.python.org/pypi/hexdump Actually, I think a hexdump() function in pprint would be a nice addition. I find myself wanting it when inspecting some binary protocols (e.g. pickle :-)). Regards Antoine. From ubershmekel at gmail.com Mon Apr 29 13:32:46 2013 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Mon, 29 Apr 2013 14:32:46 +0300 Subject: [Python-ideas] hexdump In-Reply-To: <20130429105918.525e04d0@pitrou.net> References: <20120512091543.GA5284@iskra.aviel.ru> <20130429105918.525e04d0@pitrou.net> Message-ID: On Mon, Apr 29, 2013 at 11:59 AM, Antoine Pitrou wrote: > > Actually, I think a hexdump() function in pprint would be a nice > addition. I find myself wanting it when inspecting some binary protocols > (e.g. pickle :-)). > Python 2.7 had >>> 'alkdjfa'.encode('hex') '616c6b646a6661' So why not: >>> b'asdf'.decode('hexdump') '61 73 64 66' Yuval -------------- next part -------------- An HTML attachment was scrubbed... 
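No such function exists in pprint; as a sketch of what Antoine's suggestion might look like -- name, signature and output format all illustrative, not an actual API -- here is a minimal version with offsets, fixed-width hex columns and an ASCII transcript:

```python
def hexdump(data, width=16):
    """Minimal sketch of the proposed hexdump() helper: offset column,
    hex bytes, and an ASCII transcript. Hypothetical -- the name and
    signature are illustrative only."""
    lines = []
    for offset in range(0, len(data), width):
        row = data[offset:offset + width]
        hex_part = " ".join("%02x" % b for b in row)
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in row)
        # Pad the hex column so the ASCII transcript lines up across rows.
        lines.append("%08x  %-*s  %s" % (offset, width * 3 - 1, hex_part, ascii_part))
    return "\n".join(lines)

print(hexdump(b"asdf"))
```

A module-level function like this also leaves room for the extra knobs (row width, suppressing the transcript, etc.) that a codec-style `decode('hexdump')` could not easily express.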
URL: From solipsis at pitrou.net Mon Apr 29 13:43:15 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Mon, 29 Apr 2013 13:43:15 +0200 Subject: [Python-ideas] hexdump References: <20120512091543.GA5284@iskra.aviel.ru> <20130429105918.525e04d0@pitrou.net> Message-ID: <20130429134315.175a7c0e@pitrou.net> Le Mon, 29 Apr 2013 14:32:46 +0300, Yuval Greenfield a ?crit : > On Mon, Apr 29, 2013 at 11:59 AM, Antoine Pitrou > wrote: > > > > > Actually, I think a hexdump() function in pprint would be a nice > > addition. I find myself wanting it when inspecting some binary > > protocols (e.g. pickle :-)). > > > > Python 2.7 had > >>> 'alkdjfa'.encode('hex') > '616c6b646a6661' > > So why not: > > >>> b'asdf'.decode('hexdump') > '61 73 64 66' Command-line hexdump has a bit more options and abilities, such as wrapping to N character width, printing an ASCII transcript beside the representation, etc. To support this flexibility, a module function is better than a codec :-) Regards Antoine. From ubershmekel at gmail.com Mon Apr 29 14:53:37 2013 From: ubershmekel at gmail.com (Yuval Greenfield) Date: Mon, 29 Apr 2013 15:53:37 +0300 Subject: [Python-ideas] hexdump In-Reply-To: <20130429134315.175a7c0e@pitrou.net> References: <20120512091543.GA5284@iskra.aviel.ru> <20130429105918.525e04d0@pitrou.net> <20130429134315.175a7c0e@pitrou.net> Message-ID: On Mon, Apr 29, 2013 at 2:43 PM, Antoine Pitrou wrote: > Command-line hexdump has a bit more options and abilities, such as > wrapping to N character width, printing an ASCII transcript beside the > representation, etc. > > To support this flexibility, a module function is better than a > codec :-) > I agree. I also agree that pprint is a good place for this. Though you could: b'asdf'.decode('hexdump80chars') b'asdf'.decode('hexdump40chars') b'asdf'.decode('hexdump80chars-trans') b'asdf'.decode('hexdump40chars-trans') Jokes aside, this makes me wonder why decode/encode work like they do. 
It'd be more sensible to: b'asdf'.decode.utf16(little_endian=True) 'asdf'.encode.utf8(bom=True) Yuval -------------- next part -------------- An HTML attachment was scrubbed... URL: From eric at trueblade.com Mon Apr 29 17:21:54 2013 From: eric at trueblade.com (Eric V. Smith) Date: Mon, 29 Apr 2013 11:21:54 -0400 Subject: [Python-ideas] hexdump In-Reply-To: References: <20120512091543.GA5284@iskra.aviel.ru> <20130429105918.525e04d0@pitrou.net> <20130429134315.175a7c0e@pitrou.net> Message-ID: <517E9012.1000905@trueblade.com> On 04/29/2013 08:53 AM, Yuval Greenfield wrote: > I agree. I also agree that pprint is a good place for this. Another place to do it would be in bytes.__format__. Then you could invent a language for widths, breaks, etc. Eric. From python at mrabarnett.plus.com Mon Apr 29 18:49:03 2013 From: python at mrabarnett.plus.com (MRAB) Date: Mon, 29 Apr 2013 17:49:03 +0100 Subject: [Python-ideas] CPython's cyclic garbage collector (was Automatic context managers) In-Reply-To: References: <517ABC73.4000605@davea.name> <517AD5FE.60706@davea.name> Message-ID: <517EA47F.9030504@mrabarnett.plus.com> On 27/04/2013 02:56, Chris Angelico wrote: > On Sat, Apr 27, 2013 at 9:45 AM, Dave Angel wrote: >> I didn't know there was a callback that a user could hook into. That's very >> interesting. >> > > On Sat, Apr 27, 2013 at 10:22 AM, Skip Montanaro wrote: >>> Whenever the GC finds a cycle that is unreferenced but uncollectable, >>> it stores those objects in the list gc.garbage. At that point, if the >>> user wishes to clean up those cycles, it is up to them to delve into >>> gc.garbage, untangle the objects contained within, break the cycles, >>> and remove them from the list so that they can be freed by the ref >>> counter. >> >> I wonder if it would be useful to provide a gc.garbagehook analogous >> to sys.excepthook? >> Users could assign a function of their choice to much the cyclic >> garbage periodically. 
>> >> Just a thought, flying out of my fingers before my brain could stop it... > > As far as I know, Dave, there isn't currently one; Skip, that's close > to what I'm talking about - it saves on the periodic check. But > burying it in gc.garbagehook implies having a separate piece of code > that knows how to break the reference cycles, whereas the __del__ > method puts the code right there in the code that has the problem. > Actually, *ANY* solution to this problem implies having __del__ able > to cope with the cycle being broken. Here's an example, perhaps a > silly one, but not far different in nature from some things I've done > in C++. (Granted, all the Python implementations of those same > algorithms have involved built-in types rather than linked lists, but > still.) > > class DLCircList: > def __init__(self,payload): > self.payload=payload > self.next=self.prev=self > print("Creating node: %s"%self.payload) > def __del__(self): > print("Deleting node %s from cycle %s"%(self.payload,self.enum())) > self.prev.next=self.next > self.next.prev=self.prev > def attach(self,other): > assert(self.next==self) # Don't attach twice > self.prev=other > self.next=other.next > other.next=self > self.next.prev=self > print("Adding node %s to cycle %s"%(self.payload,self.enum())) > def enum(self): > """Return a list of all node payloads in this cycle.""" > ptr=self.next > nodes=[self.payload] > while ptr!=self: > nodes.append(ptr.payload) > ptr=ptr.next > return nodes > > lst=DLCircList("foo") > DLCircList("bar").attach(lst) > DLCircList("quux").attach(lst) > DLCircList("asdf").attach(lst) > DLCircList("qwer").attach(lst) > DLCircList("zxcv").attach(lst) > print("Enumerating list: %s"%lst.enum()) > > del lst > import gc > gbg=gc.collect() > print("And we have garbage: %s"%gbg) > print(gc.garbage) > > > > Supposing you did this many many times, and you wanted decent garbage > collection. 
How would you write a __del__ method, how would you write > something to clean up gc.garbage? One way or another, something will > have to deal with the possibility that the invariants have been > broken, so my theory is that that possibility should be entirely > within __del__. (Since __del__ calls enum(), it's possible for enum() > to throw DestructedObject or whatever, but standard exception handling > will deal with that.) > How about this: If an object has a __collect__ method, then that method will be called whenever the object is collected, either because its reference count has reached 0 (or maybe this should be done explicitly), or because it has been detected by the GC as being part of a cycle. If the method is called (and doesn't raise an exception?), then the object is not added to the garbage list. The principal purpose of method is to give the object the chance to break any cycles. It should be noted that the method could be called more than once. Here is a modified version of the code: class DLCircList: def __init__(self,payload): self._collected = False self.payload = payload self.next = self.prev = self print("Creating node: %s" % self.payload) def __del__(self): self.__collect__() # Implicit or explicit? print("Deleting node %s" % self.payload) def attach(self,other): assert self.next == self # Don't attach twice self.prev = other self.next = other.next other.next = self self.next.prev = self print("Adding node %s to cycle %s" % (self.payload, self.enum())) def enum(self): """Return a list of all node payloads in this cycle.""" ptr = self.next nodes = [self.payload] while ptr != self: nodes.append(ptr.payload) ptr = ptr.next return nodes def __collect__(self): print("Collecting node % s" % self.payload) if self.prev is None: # Already broken the cycle. print("Already collected %s" % self.payload) return self.prev.next = self.next self.next.prev = self.prev # Break the cycle. 
self.prev = self.next = None def callback(phase, info): if phase == "stop": new_garbage = [] for obj in gc.garbage: if hasattr(obj, "__collect__"): obj.__collect__() else: new_garbage.append(obj) gc.garbage[:] = new_garbage import gc gc.callbacks.append(callback) lst = DLCircList("foo") DLCircList("bar").attach(lst) DLCircList("quux").attach(lst) DLCircList("asdf").attach(lst) DLCircList("qwer").attach(lst) DLCircList("zxcv").attach(lst) print("Enumerating list: % s" % lst.enum()) del lst print("And we have garbage #1: % s" % gc.collect()) print("And we have garbage #2: % s" % gc.collect()) From g.rodola at gmail.com Mon Apr 29 19:37:03 2013 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Mon, 29 Apr 2013 19:37:03 +0200 Subject: [Python-ideas] Make traceback messages aware of line continuation Message-ID: Consider the following: assert \ 1 == 0, \ "error" It will produce: Traceback (most recent call last): File "foo.py", line 3, in "error" AssertionError: error The information about the statement which produced the exception is lost. Instead I would expect: Traceback (most recent call last): File "foo.py", line 1, in assert \ 1 == 0, \ "error" AssertionError: error Not sure how easy this is to implement but I think it would be a good enhancement. Thoughts? --- Giampaolo https://code.google.com/p/pyftpdlib/ https://code.google.com/p/psutil/ https://code.google.com/p/pysendfile/ From g.rodola at gmail.com Mon Apr 29 19:42:56 2013 From: g.rodola at gmail.com (Giampaolo Rodola') Date: Mon, 29 Apr 2013 19:42:56 +0200 Subject: [Python-ideas] Make traceback messages aware of line continuation In-Reply-To: References: Message-ID: 2013/4/29 Giampaolo Rodola' : > Consider the following: > > assert \ > 1 == 0, \ > "error" > > It will produce: > > Traceback (most recent call last): > File "foo.py", line 3, in > "error" > AssertionError: error > > The information about the statement which produced the exception is lost. 
> Instead I would expect: > > Traceback (most recent call last): > File "foo.py", line 1, in > assert \ > 1 == 0, \ > "error" > AssertionError: error > > > Not sure how easy this is to implement but I think it would be a good > enhancement. > Thoughts? > > > --- Giampaolo > https://code.google.com/p/pyftpdlib/ > https://code.google.com/p/psutil/ > https://code.google.com/p/pysendfile/ Shame on me. It seems this is already tracked in http://bugs.python.org/issue12458 Let's say this is a revamping attempt then. =) --- Giampaolo https://code.google.com/p/pyftpdlib/ https://code.google.com/p/psutil/ https://code.google.com/p/pysendfile/ From rosuav at gmail.com Tue Apr 30 00:27:11 2013 From: rosuav at gmail.com (Chris Angelico) Date: Tue, 30 Apr 2013 08:27:11 +1000 Subject: [Python-ideas] CPython's cyclic garbage collector (was Automatic context managers) In-Reply-To: <517EA47F.9030504@mrabarnett.plus.com> References: <517ABC73.4000605@davea.name> <517AD5FE.60706@davea.name> <517EA47F.9030504@mrabarnett.plus.com> Message-ID: On Tue, Apr 30, 2013 at 2:49 AM, MRAB wrote: > How about this: > > If an object has a __collect__ method, then that method will be called > whenever the object is collected... > > It should be noted that the method could be called more than once. Interesting. Adds complication (splitting __del__ into two functions __del__ and __collect__), where one of them might be called more than once. But that complication is now contained entirely within the class that needs it. > def __del__(self): > self.__collect__() # Implicit or explicit? > print("Deleting node %s" % self.payload) > def __collect__(self): > print("Collecting node % s" % self.payload) > if self.prev is None: > # Already broken the cycle. > print("Already collected %s" % self.payload) > return > > self.prev.next = self.next > self.next.prev = self.prev > > # Break the cycle. > self.prev = self.next = None Hmm. 
There's no guarantee that, once __collect__ is called, __del__ and deallocation will shortly follow. There might be a refloop created externally, somewhere. So there is the possibility that this object will have methods called after __collect__ is called, meaning those methods - ergo *all* methods - will have to deal with the possibility that the invariants are broken. And if I change that last line to "self.prev = self.next = self" (which would disconnect it from its chain and leave it as a stand-alone list of its own - as per __init__), the garbage hangs around. I'd rather, if possible, be able to guarantee that the object's invariants are always true (aside from deliberate fiddling and breaking from external code), rather than having the exception that "once the gc tries to dispose of you once, you're broken". > def callback(phase, info): > if phase == "stop": > new_garbage = [] > > for obj in gc.garbage: > if hasattr(obj, "__collect__"): > obj.__collect__() > else: > new_garbage.append(obj) > > gc.garbage[:] = new_garbage > > import gc > gc.callbacks.append(callback) Aside: I do like the way this can be implemented in pure Python. Ever so much easier to POC than throwing patch files around! ChrisA From python at mrabarnett.plus.com Tue Apr 30 01:28:18 2013 From: python at mrabarnett.plus.com (MRAB) Date: Tue, 30 Apr 2013 00:28:18 +0100 Subject: [Python-ideas] CPython's cyclic garbage collector (was Automatic context managers) In-Reply-To: References: <517ABC73.4000605@davea.name> <517AD5FE.60706@davea.name> <517EA47F.9030504@mrabarnett.plus.com> Message-ID: <517F0212.9010804@mrabarnett.plus.com> On 29/04/2013 23:27, Chris Angelico wrote:> On Tue, Apr 30, 2013 at 2:49 AM, MRAB wrote: >> How about this: >> >> If an object has a __collect__ method, then that method will be >> called whenever the object is collected... >> >> It should be noted that the method could be called more than once. > > Interesting. 
Adds complication (splitting __del__ into two functions > __del__ and __collect__), where one of them might be called more > than once. But that complication is now contained entirely within the > class that needs it. > >> def __del__(self): >> self.__collect__() # Implicit or explicit? >> print("Deleting node %s" % self.payload) >> def __collect__(self): >> print("Collecting node % s" % self.payload) >> if self.prev is None: >> # Already broken the cycle. >> print("Already collected %s" % self.payload) >> return >> >> self.prev.next = self.next >> self.next.prev = self.prev >> >> # Break the cycle. >> self.prev = self.next = None > > Hmm. There's no guarantee that, once __collect__ is called, __del__ > and deallocation will shortly follow. There might be a refloop > created externally, somewhere. So there is the possibility that this > object will have methods called after __collect__ is called, meaning > those methods - ergo *all* methods - will have to deal with the > possibility that the invariants are broken. Well, consenting adults and all that... With no __collect__ method, it'll behave as it does currently. > And if I change that last line to "self.prev = self.next = self" > (which would disconnect it from its chain and leave it as a > stand-alone list of its own - as per __init__), the garbage hangs > around. I'd rather, if possible, be able to guarantee that the > object's invariants are always true (aside from deliberate fiddling > and breaking from external code), rather than having the exception > that "once the gc tries to dispose of you once, you're broken". > >> def callback(phase, info): >> if phase == "stop": >> new_garbage = [] >> >> for obj in gc.garbage: >> if hasattr(obj, "__collect__"): >> obj.__collect__() >> else: >> new_garbage.append(obj) >> >> gc.garbage[:] = new_garbage >> >> import gc >> gc.callbacks.append(callback) > > Aside: I do like the way this can be implemented in pure Python. 
> Ever so much easier to POC than throwing patch files around! > From tjreedy at udel.edu Tue Apr 30 02:40:37 2013 From: tjreedy at udel.edu (Terry Jan Reedy) Date: Mon, 29 Apr 2013 20:40:37 -0400 Subject: [Python-ideas] Make traceback messages aware of line continuation In-Reply-To: References: Message-ID: On 4/29/2013 1:42 PM, Giampaolo Rodola' wrote: > 2013/4/29 Giampaolo Rodola' : >> Consider the following: >> >> assert \ >> 1 == 0, \ >> "error" >> >> It will produce: >> >> Traceback (most recent call last): >> File "foo.py", line 3, in >> "error" >> AssertionError: error >> >> The information about the statement which produced the exception is lost. >> Instead I would expect: >> >> Traceback (most recent call last): >> File "foo.py", line 1, in >> assert \ >> 1 == 0, \ >> "error" >> AssertionError: error >> >> >> Not sure how easy this is to implement but I think it would be a good >> enhancement. >> Thoughts? Very dubious idea, for multiple reasons given on the issue. > It seems this is already tracked in http://bugs.python.org/issue12458 For your example, the OP of that issue would replace the line '"error"' with 'assert', which would not be helpful at all. If your statement was assert some_fairly_long_expression_with_calls ==\ something_else, "error" then is would not be clear that backing up would be helpful. Terry From solipsis at pitrou.net Tue Apr 30 13:12:16 2013 From: solipsis at pitrou.net (Antoine Pitrou) Date: Tue, 30 Apr 2013 13:12:16 +0200 Subject: [Python-ideas] Make traceback messages aware of line continuation References: Message-ID: <20130430131216.339fbc98@fsol> On Mon, 29 Apr 2013 20:40:37 -0400 Terry Jan Reedy wrote: > >> > >> The information about the statement which produced the exception is lost. 
> >> Instead I would expect: > >> > >> Traceback (most recent call last): > >> File "foo.py", line 1, in > >> assert \ > >> 1 == 0, \ > >> "error" > >> AssertionError: error > >> > >> > >> Not sure how easy this is to implement but I think it would be a good > >> enhancement. > >> Thoughts? > > Very dubious idea, for multiple reasons given on the issue. > > > It seems this is already tracked in http://bugs.python.org/issue12458 > > For your example, the OP of that issue would replace the line '"error"' > with 'assert', which would not be helpful at all. If your statement was > > assert some_fairly_long_expression_with_calls ==\ > something_else, "error" > > then is would not be clear that backing up would be helpful. Perhaps you've missed that Giampaolo's suggestion was to print the *entire* statement, not just one line chosen at random? There's one thing this proposal would make more difficult, which is machine-processing of tracebacks. Otherwise it does look better to me. Regards Antoine. From guido at python.org Tue Apr 30 15:54:51 2013 From: guido at python.org (Guido van Rossum) Date: Tue, 30 Apr 2013 06:54:51 -0700 Subject: [Python-ideas] Make traceback messages aware of line continuation In-Reply-To: <20130430131216.339fbc98@fsol> References: <20130430131216.339fbc98@fsol> Message-ID: Would it also understand line continuations using parentheses ( the more common style)? On Tuesday, April 30, 2013, Antoine Pitrou wrote: > On Mon, 29 Apr 2013 20:40:37 -0400 > Terry Jan Reedy > wrote: > > >> > > >> The information about the statement which produced the exception is > lost. > > >> Instead I would expect: > > >> > > >> Traceback (most recent call last): > > >> File "foo.py", line 1, in > > >> assert \ > > >> 1 == 0, \ > > >> "error" > > >> AssertionError: error > > >> > > >> > > >> Not sure how easy this is to implement but I think it would be a good > > >> enhancement. > > >> Thoughts? > > > > Very dubious idea, for multiple reasons given on the issue. 
> >
> > > It seems this is already tracked in http://bugs.python.org/issue12458
> >
> > For your example, the OP of that issue would replace the line '"error"'
> > with 'assert', which would not be helpful at all. If your statement was
> >
> > assert some_fairly_long_expression_with_calls == \
> >     something_else, "error"
> >
> > then it would not be clear that backing up would be helpful.
>
> Perhaps you've missed that Giampaolo's suggestion was to print the
> *entire* statement, not just one line chosen at random?
>
> There's one thing this proposal would make more difficult, which is
> machine-processing of tracebacks. Otherwise it does look better to me.
>
> Regards
>
> Antoine.
>
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> http://mail.python.org/mailman/listinfo/python-ideas

-- 
--Guido van Rossum (python.org/~guido)

From g.rodola at gmail.com  Tue Apr 30 16:27:50 2013
From: g.rodola at gmail.com (Giampaolo Rodola')
Date: Tue, 30 Apr 2013 16:27:50 +0200
Subject: [Python-ideas] Make traceback messages aware of line continuation
In-Reply-To: <20130430131216.339fbc98@fsol>
References: <20130430131216.339fbc98@fsol>
Message-ID: 

2013/4/30 Antoine Pitrou :
> On Mon, 29 Apr 2013 20:40:37 -0400
> Terry Jan Reedy  wrote:
>> >>
>> >> The information about the statement which produced the exception is lost.
>> >> Instead I would expect:
>> >>
>> >> Traceback (most recent call last):
>> >>   File "foo.py", line 1, in 
>> >>     assert \
>> >>         1 == 0, \
>> >>         "error"
>> >> AssertionError: error
>> >>
>> >> Not sure how easy this is to implement but I think it would be a good
>> >> enhancement.
>> >> Thoughts?
>>
>> Very dubious idea, for multiple reasons given on the issue.
>>
>> > It seems this is already tracked in http://bugs.python.org/issue12458
>>
>> For your example, the OP of that issue would replace the line '"error"'
>> with 'assert', which would not be helpful at all. If your statement was
>>
>> assert some_fairly_long_expression_with_calls == \
>>     something_else, "error"
>>
>> then it would not be clear that backing up would be helpful.
>
> Perhaps you've missed that Giampaolo's suggestion was to print the
> *entire* statement, not just one line chosen at random?

Exactly.

2013/4/30 Guido van Rossum :
> Would it also understand line continuations using parentheses (the more
> common style)?

Yes, definitely (see http://bugs.python.org/msg188159).
I came up with this idea because this is especially annoying during
tests, where it's not rare to have long self.assert* statements split
over multiple lines.
Every time you get a failure you'll likely have to open the test file
with an editor and go to line N in order to figure out what the entire
assert statement looked like.

--- Giampaolo
https://code.google.com/p/pyftpdlib/
https://code.google.com/p/psutil/
https://code.google.com/p/pysendfile/

From felipecruz at loogica.net  Tue Apr 30 16:41:38 2013
From: felipecruz at loogica.net (Felipe Cruz)
Date: Tue, 30 Apr 2013 11:41:38 -0300
Subject: [Python-ideas] Make traceback messages aware of line continuation
In-Reply-To: 
References: <20130430131216.339fbc98@fsol>
Message-ID: 

+1

It would be great to have this feature.

2013/4/30 Giampaolo Rodola' 

> 2013/4/30 Antoine Pitrou :
> > On Mon, 29 Apr 2013 20:40:37 -0400
> > Terry Jan Reedy  wrote:
> >> >>
> >> >> The information about the statement which produced the exception is
> lost.
> >> >> Instead I would expect:
> >> >>
> >> >> Traceback (most recent call last):
> >> >>   File "foo.py", line 1, in 
> >> >>     assert \
> >> >>         1 == 0, \
> >> >>         "error"
> >> >> AssertionError: error
> >> >>
> >> >> Not sure how easy this is to implement but I think it would be a good
> >> >> enhancement.
> >> >> Thoughts?
> >>
> >> Very dubious idea, for multiple reasons given on the issue.
> >>
> >> > It seems this is already tracked in http://bugs.python.org/issue12458
> >>
> >> For your example, the OP of that issue would replace the line '"error"'
> >> with 'assert', which would not be helpful at all. If your statement was
> >>
> >> assert some_fairly_long_expression_with_calls == \
> >>     something_else, "error"
> >>
> >> then it would not be clear that backing up would be helpful.
> >
> > Perhaps you've missed that Giampaolo's suggestion was to print the
> > *entire* statement, not just one line chosen at random?
>
> Exactly.
>
> 2013/4/30 Guido van Rossum :
> > Would it also understand line continuations using parentheses (the more
> > common style)?
>
> Yes, definitely (see http://bugs.python.org/msg188159).
> I came up with this idea because this is especially annoying during
> tests, where it's not rare to have long self.assert* statements split
> over multiple lines.
> Every time you get a failure you'll likely have to open the test file
> with an editor and go to line N in order to figure out what the entire
> assert statement looked like.
>
> --- Giampaolo
> https://code.google.com/p/pyftpdlib/
> https://code.google.com/p/psutil/
> https://code.google.com/p/pysendfile/

-- 
Felipe Cruz
http://about.me/felipecruz
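The lookup the thread asks the traceback machinery to perform can be prototyped in pure Python with the `ast` module, since AST nodes carry both a start and an end line number. The sketch below is not how CPython implements anything; it is a hypothetical helper (the name `statement_lines` is made up) showing that, given a file and the line number a traceback points at, the enclosing logical statement can be recovered whether it is continued with backslashes or with parentheses. It assumes `ast` nodes expose `end_lineno`, which holds on Python 3.8 and later.

```python
import ast


def statement_lines(filename, lineno):
    """Return every source line of the statement containing *lineno*.

    Hypothetical sketch of the proposal: instead of showing only the
    physical line a traceback points at, locate the enclosing logical
    statement and return all of its lines. Backslash and parenthesis
    continuations both work because ast records end_lineno (3.8+).
    """
    with open(filename) as f:
        source = f.read()
    lines = source.splitlines()
    best = None
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.stmt) and node.lineno <= lineno <= node.end_lineno:
            # Prefer the innermost enclosing statement: the one that
            # starts latest still contains lineno (e.g. an assert inside
            # a function, not the whole FunctionDef).
            if best is None or node.lineno > best.lineno:
                best = node
    if best is None:
        # Fall back to the single physical line, as tracebacks do today.
        return [lines[lineno - 1]]
    return lines[best.lineno - 1:best.end_lineno]
```

A `sys.excepthook` replacement could call such a helper on each frame before formatting, which is roughly the kind of change http://bugs.python.org/issue12458 discusses for the stdlib traceback code.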